calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7290

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
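
The linear scheduler above can be sketched in plain Python. This is an illustrative sketch only: it assumes zero warmup steps (the Trainer default) and 240 total optimizer steps (40 epochs × 6 steps/epoch, per the training-results table), neither of which is stated explicitly in the card.

```python
def linear_lr(step, base_lr=0.001, total_steps=240):
    """Learning rate after `step` optimizer steps under linear decay
    from base_lr to 0, assuming no warmup phase."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)

print(linear_lr(0))    # base rate at the start: 0.001
print(linear_lr(120))  # halfway through training: 0.0005
print(linear_lr(240))  # fully decayed at the final step: 0.0
```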

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.0133        | 1.0   | 6    | 2.2590          |
| 2.008         | 2.0   | 12   | 1.7059          |
| 1.548         | 3.0   | 18   | 1.4082          |
| 1.2527        | 4.0   | 24   | 1.0746          |
| 1.0272        | 5.0   | 30   | 0.9517          |
| 0.8724        | 6.0   | 36   | 0.8435          |
| 0.7908        | 7.0   | 42   | 0.7824          |
| 0.7088        | 8.0   | 48   | 0.7619          |
| 0.6489        | 9.0   | 54   | 0.7522          |
| 0.596         | 10.0  | 60   | 0.7539          |
| 0.5572        | 11.0  | 66   | 0.7119          |
| 0.5232        | 12.0  | 72   | 0.7161          |
| 0.4798        | 13.0  | 78   | 0.7047          |
| 0.4341        | 14.0  | 84   | 0.7528          |
| 0.4263        | 15.0  | 90   | 0.6681          |
| 0.3977        | 16.0  | 96   | 0.7356          |
| 0.3748        | 17.0  | 102  | 0.7377          |
| 0.3609        | 18.0  | 108  | 0.7455          |
| 0.3287        | 19.0  | 114  | 0.7199          |
| 0.3114        | 20.0  | 120  | 0.7666          |
| 0.2855        | 21.0  | 126  | 0.6954          |
| 0.2664        | 22.0  | 132  | 0.7641          |
| 0.258         | 23.0  | 138  | 0.7126          |
| 0.2414        | 24.0  | 144  | 0.7327          |
| 0.2281        | 25.0  | 150  | 0.6886          |
| 0.2108        | 26.0  | 156  | 0.6910          |
| 0.1933        | 27.0  | 162  | 0.7081          |
| 0.1907        | 28.0  | 168  | 0.7257          |
| 0.1827        | 29.0  | 174  | 0.7252          |
| 0.167         | 30.0  | 180  | 0.7151          |
| 0.1562        | 31.0  | 186  | 0.7102          |
| 0.1473        | 32.0  | 192  | 0.7296          |
| 0.149         | 33.0  | 198  | 0.6922          |
| 0.1413        | 34.0  | 204  | 0.7064          |
| 0.1265        | 35.0  | 210  | 0.7110          |
| 0.1235        | 36.0  | 216  | 0.7212          |
| 0.1156        | 37.0  | 222  | 0.7290          |
| 0.1141        | 38.0  | 228  | 0.7200          |
| 0.1099        | 39.0  | 234  | 0.7263          |
| 0.1071        | 40.0  | 240  | 0.7290          |
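
The headline loss of 0.7290 is simply the final epoch's value. A quick scan of the validation column (values copied from the table above) shows the loss actually bottoms out well before training ends, while training loss keeps falling, which is the usual overfitting signature:

```python
# Validation losses per epoch, copied from the table above (epochs 1-40).
val_losses = [
    2.2590, 1.7059, 1.4082, 1.0746, 0.9517, 0.8435, 0.7824, 0.7619,
    0.7522, 0.7539, 0.7119, 0.7161, 0.7047, 0.7528, 0.6681, 0.7356,
    0.7377, 0.7455, 0.7199, 0.7666, 0.6954, 0.7641, 0.7126, 0.7327,
    0.6886, 0.6910, 0.7081, 0.7257, 0.7252, 0.7151, 0.7102, 0.7296,
    0.6922, 0.7064, 0.7110, 0.7212, 0.7290, 0.7200, 0.7263, 0.7290,
]

# Find the epoch (1-indexed) with the lowest validation loss.
best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__) + 1
print(best_epoch, val_losses[best_epoch - 1])  # 15 0.6681
```

So by validation loss the best checkpoint is epoch 15 (0.6681), not epoch 40 (0.7290); if checkpoints were saved per epoch, loading the epoch-15 one may be preferable.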

Framework versions

  • Transformers 4.57.2
  • Pytorch 2.9.1
  • Datasets 4.6.1
  • Tokenizers 0.22.2
Model details

  • Model size: 7.8M params
  • Tensor type: F32
  • Format: Safetensors