library_name: transformers
tags:
  - generated_from_trainer
model-index:
  - name: calculator_model_test
    results: []

calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the card records the dataset as None). It achieves the following results on the evaluation set:

  • Loss: 0.1301

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
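
As a rough illustration of what the linear scheduler does with these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card does not list a warmup setting) and takes the total of 160 steps from the training results table; the function name `linear_lr` is hypothetical and is not part of Transformers.

```python
def linear_lr(step, base_lr=1e-3, total_steps=160, warmup_steps=0):
    """Learning rate under a linear-decay schedule.

    Assumptions (not stated in the model card): warmup_steps=0, and the
    rate decays linearly from base_lr to 0 over total_steps.
    """
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr (unused when warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, at each quarter, and at the end of training.
    for step in (0, 40, 80, 120, 160):
        print(step, linear_lr(step))
```

Under these assumptions the rate is halved by step 80 (epoch 20) and reaches zero at step 160, which is consistent with the shrinking per-epoch improvements late in training.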

Training results

Training Loss   Epoch   Step   Validation Loss
3.1429          1.0     4      2.2337
2.0240          2.0     8      1.7621
1.6595          3.0     12     1.4806
1.3784          4.0     16     1.1427
1.0363          5.0     20     0.8139
0.7381          6.0     24     0.6046
0.5807          7.0     28     0.5287
0.5117          8.0     32     0.4774
0.4641          9.0     36     0.4449
0.4274          10.0    40     0.4155
0.3970          11.0    44     0.3787
0.3664          12.0    48     0.3443
0.3391          13.0    52     0.3224
0.3196          14.0    56     0.3062
0.3033          15.0    60     0.2950
0.2938          16.0    64     0.2804
0.2757          17.0    68     0.2682
0.2641          18.0    72     0.2580
0.2551          19.0    76     0.2474
0.2441          20.0    80     0.2430
0.2358          21.0    84     0.2323
0.2284          22.0    88     0.2258
0.2149          23.0    92     0.2129
0.2098          24.0    96     0.2078
0.2005          25.0    100    0.1975
0.1898          26.0    104    0.1901
0.1845          27.0    108    0.1790
0.1771          28.0    112    0.1746
0.1708          29.0    116    0.1668
0.1657          30.0    120    0.1610
0.1600          31.0    124    0.1581
0.1559          32.0    128    0.1510
0.1498          33.0    132    0.1475
0.1451          34.0    136    0.1432
0.1426          35.0    140    0.1417
0.1373          36.0    144    0.1365
0.1334          37.0    148    0.1345
0.1335          38.0    152    0.1318
0.1309          39.0    156    0.1303
0.1308          40.0    160    0.1301
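
The step counts above (4 optimizer steps per epoch at a batch size of 512) put rough bounds on the training set size, which the card otherwise leaves unspecified. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that the final partial batch is kept; none of these are confirmed by the card, and the helper name `train_size_bounds` is hypothetical.

```python
import math


def train_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes N with ceil(N / batch_size) == steps_per_epoch.

    Assumes one device, gradient accumulation of 1, and that the last
    partial batch is not dropped.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


if __name__ == "__main__":
    steps_per_epoch = 160 // 40  # 160 total steps over 40 epochs
    low, high = train_size_bounds(steps_per_epoch, 512)
    # Sanity check: both bounds yield the observed number of steps per epoch.
    assert math.ceil(low / 512) == steps_per_epoch
    assert math.ceil(high / 512) == steps_per_epoch
    print(f"training set size between {low} and {high} examples")
```

If training used several devices or gradient accumulation, the effective batch per step would be larger and these bounds would scale accordingly.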

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2