---
library_name: transformers
tags:
  - generated_from_trainer
model-index:
  - name: calculator_model_test
    results: []
---

calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6688

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
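
The linear schedule above decays the learning rate from its initial value to zero over the full run. A minimal sketch of that decay, assuming 240 total optimizer steps (40 epochs × 6 steps per epoch, as the results table below implies) and no warmup, since none is listed; the function name is illustrative, not from the training script:

```python
# Sketch of the linear LR schedule implied by the hyperparameters:
# learning_rate=0.001, lr_scheduler_type=linear, 240 total steps, no warmup.

def linear_lr(step: int, base_lr: float = 0.001, total_steps: int = 240) -> float:
    """Linearly decay the learning rate from base_lr at step 0 to 0 at total_steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

print(linear_lr(0))    # start of training -> 0.001
print(linear_lr(120))  # halfway -> 0.0005
print(linear_lr(240))  # final step -> 0.0
```

Transformers' own linear scheduler (via `get_linear_schedule_with_warmup`) follows the same shape when warmup is zero.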

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.4045        | 1.0   | 6    | 2.7587          |
| 2.3917        | 2.0   | 12   | 1.9900          |
| 1.8734        | 3.0   | 18   | 1.6958          |
| 1.6324        | 4.0   | 24   | 1.6081          |
| 1.5676        | 5.0   | 30   | 1.5619          |
| 1.5436        | 6.0   | 36   | 1.6197          |
| 1.5139        | 7.0   | 42   | 1.4991          |
| 1.4614        | 8.0   | 48   | 1.4779          |
| 1.4407        | 9.0   | 54   | 1.4234          |
| 1.3644        | 10.0  | 60   | 1.3460          |
| 1.3096        | 11.0  | 66   | 1.3823          |
| 1.2634        | 12.0  | 72   | 1.2711          |
| 1.1912        | 13.0  | 78   | 1.2382          |
| 1.1856        | 14.0  | 84   | 1.1337          |
| 1.1019        | 15.0  | 90   | 1.2100          |
| 1.1441        | 16.0  | 96   | 1.1382          |
| 1.0611        | 17.0  | 102  | 1.0282          |
| 0.9967        | 18.0  | 108  | 0.9920          |
| 0.9765        | 19.0  | 114  | 0.9946          |
| 0.9517        | 20.0  | 120  | 0.9478          |
| 0.9374        | 21.0  | 126  | 0.9441          |
| 0.8931        | 22.0  | 132  | 0.9748          |
| 0.8756        | 23.0  | 138  | 0.8511          |
| 0.8523        | 24.0  | 144  | 0.8759          |
| 0.8757        | 25.0  | 150  | 0.8253          |
| 0.8209        | 26.0  | 156  | 0.8182          |
| 0.8190        | 27.0  | 162  | 0.7820          |
| 0.7795        | 28.0  | 168  | 0.7740          |
| 0.8097        | 29.0  | 174  | 0.7571          |
| 0.7626        | 30.0  | 180  | 0.7584          |
| 0.7491        | 31.0  | 186  | 0.7444          |
| 0.7320        | 32.0  | 192  | 0.7177          |
| 0.7235        | 33.0  | 198  | 0.7124          |
| 0.7145        | 34.0  | 204  | 0.7032          |
| 0.7085        | 35.0  | 210  | 0.6888          |
| 0.7138        | 36.0  | 216  | 0.6866          |
| 0.6910        | 37.0  | 222  | 0.6789          |
| 0.6801        | 38.0  | 228  | 0.6731          |
| 0.6819        | 39.0  | 234  | 0.6715          |
| 0.6750        | 40.0  | 240  | 0.6688          |

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2