calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5764

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch, fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
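For reference, the hyperparameters above can be collected into a plain dictionary (field names below mirror Hugging Face `TrainingArguments` options, but this is an illustrative sketch, not the actual training script). The results table further down shows 6 optimizer steps per epoch, so with a train batch size of 512 and assuming no gradient accumulation, the training set holds at most 6 × 512 = 3072 examples:

```python
# Hyperparameters from this card, as a plain dict (illustrative only).
hparams = {
    "learning_rate": 1e-3,
    "train_batch_size": 512,
    "eval_batch_size": 512,
    "seed": 42,
    "optimizer": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_epochs": 40,
}

steps_per_epoch = 6  # from the results table: step 6 at epoch 1.0
total_steps = steps_per_epoch * hparams["num_epochs"]
# Upper bound on training-set size (the last batch per epoch may be partial):
max_train_examples = steps_per_epoch * hparams["train_batch_size"]
print(total_steps, max_train_examples)  # 240 3072
```

The 240 total steps match the final row of the results table below.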

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4238        | 1.0   | 6    | 2.7523          |
| 2.3982        | 2.0   | 12   | 2.1138          |
| 1.8535        | 3.0   | 18   | 1.8031          |
| 1.6989        | 4.0   | 24   | 1.9360          |
| 1.5938        | 5.0   | 30   | 1.6749          |
| 1.5256        | 6.0   | 36   | 1.6441          |
| 1.4844        | 7.0   | 42   | 1.7053          |
| 1.4458        | 8.0   | 48   | 1.6526          |
| 1.4297        | 9.0   | 54   | 1.8445          |
| 1.3608        | 10.0  | 60   | 1.8894          |
| 1.2888        | 11.0  | 66   | 2.1697          |
| 1.2656        | 12.0  | 72   | 2.0470          |
| 1.2222        | 13.0  | 78   | 2.0777          |
| 1.1678        | 14.0  | 84   | 2.1063          |
| 1.1038        | 15.0  | 90   | 2.0946          |
| 1.0625        | 16.0  | 96   | 2.0211          |
| 1.0258        | 17.0  | 102  | 2.1410          |
| 1.0500        | 18.0  | 108  | 2.3152          |
| 1.0602        | 19.0  | 114  | 2.3428          |
| 1.0098        | 20.0  | 120  | 2.2889          |
| 1.0509        | 21.0  | 126  | 2.2918          |
| 0.9735        | 22.0  | 132  | 2.3569          |
| 0.8998        | 23.0  | 138  | 2.4122          |
| 0.8866        | 24.0  | 144  | 2.5343          |
| 0.8842        | 25.0  | 150  | 2.5096          |
| 0.8412        | 26.0  | 156  | 2.5483          |
| 0.8250        | 27.0  | 162  | 2.4582          |
| 0.8417        | 28.0  | 168  | 2.5274          |
| 0.7937        | 29.0  | 174  | 2.5061          |
| 0.7735        | 30.0  | 180  | 2.6067          |
| 0.7456        | 31.0  | 186  | 2.5956          |
| 0.7543        | 32.0  | 192  | 2.5598          |
| 0.7232        | 33.0  | 198  | 2.5147          |
| 0.7093        | 34.0  | 204  | 2.5534          |
| 0.7012        | 35.0  | 210  | 2.6147          |
| 0.6985        | 36.0  | 216  | 2.5596          |
| 0.6898        | 37.0  | 222  | 2.6099          |
| 0.6722        | 38.0  | 228  | 2.5622          |
| 0.6785        | 39.0  | 234  | 2.5618          |
| 0.6685        | 40.0  | 240  | 2.5764          |
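Note that validation loss bottoms out early while training loss keeps falling, which suggests overfitting after roughly epoch 6. A small sketch that scans (epoch, validation loss) pairs from the table above to find the best checkpoint (abbreviated to the first ten epochs plus the final one; the omitted epochs all have higher validation loss):

```python
# (epoch, validation_loss) pairs taken from the results table above.
val_loss = [
    (1, 2.7523), (2, 2.1138), (3, 1.8031), (4, 1.9360), (5, 1.6749),
    (6, 1.6441), (7, 1.7053), (8, 1.6526), (9, 1.8445), (10, 1.8894),
    (40, 2.5764),
]

# Pick the epoch with the lowest validation loss.
best_epoch, best_loss = min(val_loss, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # 6 1.6441
```

By epoch 40 the validation loss (2.5764, the figure reported at the top of this card) is well above that minimum, so an earlier checkpoint may generalize better.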

Framework versions

  • Transformers 5.2.0
  • Pytorch 2.10.0
  • Datasets 4.5.0
  • Tokenizers 0.22.2
Model size: 7.8M params (F32, stored as Safetensors)