calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the auto-generated card left both fields empty). It achieves the following results on the evaluation set:

  • Loss: 0.1078

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
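With lr_scheduler_type set to linear and no warmup steps listed, the learning rate presumably decays linearly from 0.001 to 0 over the full run of 240 optimizer steps (6 steps per epoch × 40 epochs, per the results table below). A minimal sketch of that schedule, assuming zero warmup (the helper name linear_lr is illustrative, not part of any library):

```python
def linear_lr(step, total_steps, base_lr=0.001):
    """Linearly decay the learning rate from base_lr to 0 over total_steps.

    Mirrors a linear scheduler with no warmup: the rate at a given
    optimizer step is proportional to the steps remaining.
    """
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

total = 240  # 6 steps/epoch * 40 epochs
print(linear_lr(0, total))    # -> 0.001  (start of training)
print(linear_lr(120, total))  # -> 0.0005 (halfway)
print(linear_lr(240, total))  # -> 0.0    (end of training)
```
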

Training results

Training Loss | Epoch | Step | Validation Loss
3.0109 | 1.0 | 6 | 2.2280
2.0109 | 2.0 | 12 | 1.7234
1.5557 | 3.0 | 18 | 1.3273
1.2094 | 4.0 | 24 | 1.0575
1.0379 | 5.0 | 30 | 0.9553
0.9086 | 6.0 | 36 | 0.7990
0.7845 | 7.0 | 42 | 0.7418
0.7194 | 8.0 | 48 | 0.6974
0.6941 | 9.0 | 54 | 0.6425
0.6287 | 10.0 | 60 | 0.5873
0.5727 | 11.0 | 66 | 0.5557
0.5554 | 12.0 | 72 | 0.5182
0.5188 | 13.0 | 78 | 0.4901
0.4958 | 14.0 | 84 | 0.4795
0.4803 | 15.0 | 90 | 0.4511
0.4574 | 16.0 | 96 | 0.4144
0.4339 | 17.0 | 102 | 0.4013
0.4121 | 18.0 | 108 | 0.3858
0.3928 | 19.0 | 114 | 0.3654
0.3788 | 20.0 | 120 | 0.3881
0.3848 | 21.0 | 126 | 0.3479
0.3481 | 22.0 | 132 | 0.3113
0.3370 | 23.0 | 138 | 0.2940
0.3078 | 24.0 | 144 | 0.2849
0.3043 | 25.0 | 150 | 0.2721
0.2817 | 26.0 | 156 | 0.2507
0.2699 | 27.0 | 162 | 0.2288
0.2426 | 28.0 | 168 | 0.2090
0.2266 | 29.0 | 174 | 0.1984
0.2146 | 30.0 | 180 | 0.1885
0.2025 | 31.0 | 186 | 0.1787
0.1990 | 32.0 | 192 | 0.1630
0.1891 | 33.0 | 198 | 0.1490
0.1778 | 34.0 | 204 | 0.1396
0.1613 | 35.0 | 210 | 0.1301
0.1577 | 36.0 | 216 | 0.1250
0.1538 | 37.0 | 222 | 0.1182
0.1483 | 38.0 | 228 | 0.1128
0.1423 | 39.0 | 234 | 0.1095
0.1467 | 40.0 | 240 | 0.1078

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size

  • 7.8M params (safetensors, F32)