calculator_model_test_with_steps

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0741

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40

Training results

Training Loss   Epoch   Step   Validation Loss
3.0444          1.0     6      2.2982
2.0319          2.0     12     1.7372
1.5871          3.0     18     1.3782
1.2735          4.0     24     1.1017
1.0332          5.0     30     0.9506
0.9051          6.0     36     0.8307
0.7868          7.0     42     0.7057
0.7026          8.0     48     0.6872
0.6729          9.0     54     0.6115
0.6270          10.0    60     0.6062
0.5786          11.0    66     0.5212
0.5293          12.0    72     0.4763
0.4852          13.0    78     0.4490
0.4583          14.0    84     0.4149
0.4152          15.0    90     0.3562
0.3767          16.0    96     0.3472
0.3715          17.0    102    0.3505
0.3611          18.0    108    0.3096
0.3194          19.0    114    0.2867
0.3072          20.0    120    0.2500
0.2794          21.0    126    0.2270
0.2550          22.0    132    0.2257
0.2486          23.0    138    0.2034
0.2315          24.0    144    0.1956
0.2167          25.0    150    0.1968
0.2169          26.0    156    0.1761
0.1939          27.0    162    0.1711
0.1856          28.0    168    0.1356
0.1744          29.0    174    0.1312
0.1595          30.0    180    0.1211
0.1495          31.0    186    0.1110
0.1418          32.0    192    0.0995
0.1329          33.0    198    0.0956
0.1260          34.0    204    0.0909
0.1162          35.0    210    0.0844
0.1135          36.0    216    0.0794
0.1060          37.0    222    0.0777
0.1105          38.0    228    0.0762
0.1032          39.0    234    0.0746
0.1033          40.0    240    0.0741

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2
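A minimal loading sketch for this checkpoint. The repo id is taken from the card title and lacks a Hub namespace, so it is a placeholder; the concrete model class is not stated in the card, so the `Auto*` classes are assumptions.

```python
from transformers import AutoModel, AutoTokenizer

# Placeholder repo id from the card title; prefix with the actual
# Hub namespace (e.g. "username/...") before use.
REPO_ID = "calculator_model_test_with_steps"


def load(repo_id: str = REPO_ID):
    """Download and return (tokenizer, model); requires network access."""
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModel.from_pretrained(repo_id)
    return tokenizer, model
```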
Model size: 7.8M parameters (F32, Safetensors)