---
library_name: transformers
tags:
- generated_from_trainer
model-index:
- name: calculator_model_test
  results: []
---

# calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.0652
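Since neither the base architecture nor the task is documented, loading can only be sketched generically. The repo id below is hypothetical (the author's Hub namespace plus the model name in this card) and `AutoModel`/`AutoTokenizer` are used because the concrete model class is unspecified:

```python
from transformers import AutoModel, AutoTokenizer

# Hypothetical Hub id for this checkpoint; adjust to the actual repo path.
model_id = "olatyszka/calculator_model_test"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)  # concrete class is resolved from the config
```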

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.001
- train_batch_size: 512
- eval_batch_size: 512
- seed: 42
- optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 40
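The hyperparameters above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch, not the author's actual training script: the `output_dir` is assumed, and everything else is taken from the list above.

```python
from transformers import TrainingArguments

# Sketch of the configuration listed above; output_dir is an assumption.
args = TrainingArguments(
    output_dir="calculator_model_test",
    learning_rate=0.001,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=40,
)
```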

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.9455        | 1.0   | 5    | 2.0822          |
| 1.8943        | 2.0   | 10   | 1.6374          |
| 1.5309        | 3.0   | 15   | 1.3630          |
| 1.2901        | 4.0   | 20   | 1.1525          |
| 1.0707        | 5.0   | 25   | 0.9432          |
| 0.9185        | 6.0   | 30   | 0.8454          |
| 0.8187        | 7.0   | 35   | 0.7501          |
| 0.7215        | 8.0   | 40   | 0.6560          |
| 0.6404        | 9.0   | 45   | 0.5970          |
| 0.5953        | 10.0  | 50   | 0.5479          |
| 0.5476        | 11.0  | 55   | 0.5130          |
| 0.5134        | 12.0  | 60   | 0.4796          |
| 0.4773        | 13.0  | 65   | 0.4578          |
| 0.4509        | 14.0  | 70   | 0.4150          |
| 0.4219        | 15.0  | 75   | 0.3884          |
| 0.3962        | 16.0  | 80   | 0.3658          |
| 0.3700        | 17.0  | 85   | 0.3397          |
| 0.3504        | 18.0  | 90   | 0.3145          |
| 0.3254        | 19.0  | 95   | 0.2937          |
| 0.3058        | 20.0  | 100  | 0.2701          |
| 0.2856        | 21.0  | 105  | 0.2564          |
| 0.2637        | 22.0  | 110  | 0.2273          |
| 0.2426        | 23.0  | 115  | 0.2088          |
| 0.2264        | 24.0  | 120  | 0.1859          |
| 0.2062        | 25.0  | 125  | 0.1618          |
| 0.1867        | 26.0  | 130  | 0.1333          |
| 0.1655        | 27.0  | 135  | 0.1178          |
| 0.1550        | 28.0  | 140  | 0.1166          |
| 0.1431        | 29.0  | 145  | 0.1050          |
| 0.1325        | 30.0  | 150  | 0.0944          |
| 0.1235        | 31.0  | 155  | 0.0880          |
| 0.1156        | 32.0  | 160  | 0.0855          |
| 0.1110        | 33.0  | 165  | 0.0805          |
| 0.1057        | 34.0  | 170  | 0.0745          |
| 0.1042        | 35.0  | 175  | 0.0722          |
| 0.0986        | 36.0  | 180  | 0.0710          |
| 0.0970        | 37.0  | 185  | 0.0672          |
| 0.0952        | 38.0  | 190  | 0.0666          |
| 0.0926        | 39.0  | 195  | 0.0655          |
| 0.0935        | 40.0  | 200  | 0.0652          |
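Assuming the reported loss is token-level cross-entropy in nats (the transformers default for language-modeling heads), the final validation loss translates into a per-token perplexity of roughly 1.07, i.e. the model is nearly deterministic on the evaluation set:

```python
import math

# Final validation loss from the table above.
val_loss = 0.0652

# For cross-entropy measured in nats, perplexity = exp(loss).
perplexity = math.exp(val_loss)
print(round(perplexity, 3))  # ≈ 1.067
```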

### Framework versions

- Transformers 5.0.0
- PyTorch 2.10.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.2