calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0854

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
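With lr_scheduler_type set to linear and no warmup (the Trainer default when no warmup steps are given; an assumption here, since the card does not list any), the learning rate decays linearly from the peak to zero over the run's 240 optimizer steps (40 epochs × 6 steps per epoch, per the table below). A minimal sketch of that schedule:

```python
# Linear LR schedule sketch for this run, assuming zero warmup steps.
PEAK_LR = 0.001     # learning_rate from the hyperparameters above
TOTAL_STEPS = 240   # 40 epochs x 6 optimizer steps per epoch

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps under linear decay to zero."""
    return PEAK_LR * max(0.0, (TOTAL_STEPS - step) / TOTAL_STEPS)

print(lr_at(0))    # peak LR at the start: 0.001
print(lr_at(120))  # half the peak at the halfway point: 0.0005
print(lr_at(240))  # fully decayed at the end: 0.0
```

This matches the shape of `transformers`' linear schedule with zero warmup; if warmup was actually used, the first steps would ramp up from zero instead.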

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 2.9372        | 1.0   | 6    | 2.2670          |
| 2.0312        | 2.0   | 12   | 1.7752          |
| 1.5891        | 3.0   | 18   | 1.3654          |
| 1.2485        | 4.0   | 24   | 1.1107          |
| 1.0668        | 5.0   | 30   | 1.0103          |
| 0.9308        | 6.0   | 36   | 0.8701          |
| 0.8106        | 7.0   | 42   | 0.7530          |
| 0.7096        | 8.0   | 48   | 0.7189          |
| 0.6817        | 9.0   | 54   | 0.6854          |
| 0.6489        | 10.0  | 60   | 0.6492          |
| 0.6130        | 11.0  | 66   | 0.5677          |
| 0.5748        | 12.0  | 72   | 0.5398          |
| 0.5358        | 13.0  | 78   | 0.5457          |
| 0.5137        | 14.0  | 84   | 0.5096          |
| 0.4751        | 15.0  | 90   | 0.4752          |
| 0.4416        | 16.0  | 96   | 0.4274          |
| 0.4043        | 17.0  | 102  | 0.3834          |
| 0.3843        | 18.0  | 108  | 0.3572          |
| 0.3639        | 19.0  | 114  | 0.3411          |
| 0.3399        | 20.0  | 120  | 0.3274          |
| 0.3107        | 21.0  | 126  | 0.2839          |
| 0.2773        | 22.0  | 132  | 0.2631          |
| 0.2583        | 23.0  | 138  | 0.2441          |
| 0.2466        | 24.0  | 144  | 0.2110          |
| 0.2229        | 25.0  | 150  | 0.1905          |
| 0.2093        | 26.0  | 156  | 0.1869          |
| 0.2033        | 27.0  | 162  | 0.1731          |
| 0.1819        | 28.0  | 168  | 0.1634          |
| 0.1751        | 29.0  | 174  | 0.1435          |
| 0.1590        | 30.0  | 180  | 0.1256          |
| 0.1545        | 31.0  | 186  | 0.1279          |
| 0.1409        | 32.0  | 192  | 0.1184          |
| 0.1319        | 33.0  | 198  | 0.1073          |
| 0.1222        | 34.0  | 204  | 0.1036          |
| 0.1270        | 35.0  | 210  | 0.0979          |
| 0.1177        | 36.0  | 216  | 0.0943          |
| 0.1176        | 37.0  | 222  | 0.0900          |
| 0.1087        | 38.0  | 228  | 0.0880          |
| 0.1039        | 39.0  | 234  | 0.0854          |
| 0.1047        | 40.0  | 240  | 0.0854          |
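The step column implies 6 optimizer steps per epoch (step 240 at epoch 40). With train_batch_size=512 and assuming no gradient accumulation (not stated on the card), that bounds the training set at roughly 6 × 512 = 3072 examples:

```python
# Back-of-the-envelope check on the training set size from the table above.
# Assumes one optimizer step per batch (no gradient accumulation).
total_steps = 240
num_epochs = 40
train_batch_size = 512

steps_per_epoch = total_steps // num_epochs   # 6 steps per epoch
examples_upper_bound = steps_per_epoch * train_batch_size

print(steps_per_epoch)       # 6
print(examples_upper_bound)  # 3072
```

The last batch of each epoch may be smaller, so the actual dataset size is at most this bound.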

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size

  • 7.8M parameters (Safetensors, F32 tensors)