calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0045
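
As a quick illustration, a checkpoint like this could be loaded and queried through the standard 🤗 Transformers Auto classes. This is a minimal sketch only: the repo id, the seq2seq model class, and the example input are assumptions, since the card does not document the task or architecture.

```python
# Minimal sketch, NOT documented usage: the repo id
# "your-username/calculator_model_test" is a placeholder, and
# AutoModelForSeq2SeqLM is an assumed model class -- the card does not
# state the architecture or task.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "your-username/calculator_model_test"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

# Hypothetical arithmetic prompt, assuming a calculator-style seq2seq task.
inputs = tokenizer("23+45", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```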

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused; OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 200
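
For concreteness, these settings map onto the 🤗 Transformers `TrainingArguments` roughly as follows. This is a sketch, not the original training script: the output directory is a placeholder, and the per-epoch evaluation schedule is inferred from the results table below.

```python
# Sketch of TrainingArguments mirroring the hyperparameters listed above.
# Assumptions: "calculator_model_test" as output_dir is a placeholder, and
# eval_strategy="epoch" is inferred from the per-epoch validation losses below.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="calculator_model_test",
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",  # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    eval_strategy="epoch",
)
```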

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.3208 | 1.0 | 6 | 1.7801 |
| 1.6167 | 2.0 | 12 | 1.4190 |
| 1.3790 | 3.0 | 18 | 1.3787 |
| 1.2999 | 4.0 | 24 | 1.2406 |
| 1.1367 | 5.0 | 30 | 1.1432 |
| 1.0477 | 6.0 | 36 | 0.9924 |
| 0.9550 | 7.0 | 42 | 0.9145 |
| 0.8568 | 8.0 | 48 | 0.8587 |
| 0.8161 | 9.0 | 54 | 0.8325 |
| 0.8154 | 10.0 | 60 | 0.8123 |
| 0.7740 | 11.0 | 66 | 0.8280 |
| 0.7357 | 12.0 | 72 | 0.7068 |
| 0.6664 | 13.0 | 78 | 0.6546 |
| 0.6176 | 14.0 | 84 | 0.6074 |
| 0.5990 | 15.0 | 90 | 0.6042 |
| 0.5777 | 16.0 | 96 | 0.6093 |
| 0.5371 | 17.0 | 102 | 0.5074 |
| 0.4953 | 18.0 | 108 | 0.5070 |
| 0.4999 | 19.0 | 114 | 0.5536 |
| 0.5230 | 20.0 | 120 | 0.5276 |
| 0.4750 | 21.0 | 126 | 0.5726 |
| 0.5112 | 22.0 | 132 | 0.4879 |
| 0.4367 | 23.0 | 138 | 0.4797 |
| 0.4526 | 24.0 | 144 | 0.4519 |
| 0.4212 | 25.0 | 150 | 0.3937 |
| 0.4345 | 26.0 | 156 | 0.4556 |
| 0.4463 | 27.0 | 162 | 0.4319 |
| 0.4206 | 28.0 | 168 | 0.4294 |
| 0.4098 | 29.0 | 174 | 0.4353 |
| 0.4184 | 30.0 | 180 | 0.3689 |
| 0.3558 | 31.0 | 186 | 0.3611 |
| 0.3784 | 32.0 | 192 | 0.3562 |
| 0.3416 | 33.0 | 198 | 0.3633 |
| 0.3587 | 34.0 | 204 | 0.2998 |
| 0.3020 | 35.0 | 210 | 0.2746 |
| 0.2803 | 36.0 | 216 | 0.2586 |
| 0.2968 | 37.0 | 222 | 0.2734 |
| 0.2725 | 38.0 | 228 | 0.3669 |
| 0.3261 | 39.0 | 234 | 0.2672 |
| 0.2693 | 40.0 | 240 | 0.2603 |
| 0.3001 | 41.0 | 246 | 0.2625 |
| 0.2979 | 42.0 | 252 | 0.2724 |
| 0.2688 | 43.0 | 258 | 0.2563 |
| 0.2705 | 44.0 | 264 | 0.2068 |
| 0.2271 | 45.0 | 270 | 0.1919 |
| 0.2181 | 46.0 | 276 | 0.2369 |
| 0.2450 | 47.0 | 282 | 0.2518 |
| 0.2451 | 48.0 | 288 | 0.2630 |
| 0.3311 | 49.0 | 294 | 0.1948 |
| 0.2112 | 50.0 | 300 | 0.2220 |
| 0.2408 | 51.0 | 306 | 0.2290 |
| 0.2484 | 52.0 | 312 | 0.2001 |
| 0.2117 | 53.0 | 318 | 0.2169 |
| 0.2254 | 54.0 | 324 | 0.1979 |
| 0.2088 | 55.0 | 330 | 0.1925 |
| 0.2027 | 56.0 | 336 | 0.1754 |
| 0.1910 | 57.0 | 342 | 0.1389 |
| 0.1745 | 58.0 | 348 | 0.1300 |
| 0.1657 | 59.0 | 354 | 0.1269 |
| 0.1711 | 60.0 | 360 | 0.1430 |
| 0.1848 | 61.0 | 366 | 0.1232 |
| 0.1501 | 62.0 | 372 | 0.1050 |
| 0.1295 | 63.0 | 378 | 0.0914 |
| 0.1303 | 64.0 | 384 | 0.0986 |
| 0.1140 | 65.0 | 390 | 0.0799 |
| 0.1123 | 66.0 | 396 | 0.0995 |
| 0.1164 | 67.0 | 402 | 0.0903 |
| 0.1110 | 68.0 | 408 | 0.0899 |
| 0.1117 | 69.0 | 414 | 0.0973 |
| 0.1011 | 70.0 | 420 | 0.1054 |
| 0.1133 | 71.0 | 426 | 0.0858 |
| 0.0953 | 72.0 | 432 | 0.1040 |
| 0.1117 | 73.0 | 438 | 0.1025 |
| 0.1328 | 74.0 | 444 | 0.0957 |
| 0.1041 | 75.0 | 450 | 0.0897 |
| 0.1003 | 76.0 | 456 | 0.0712 |
| 0.0894 | 77.0 | 462 | 0.0774 |
| 0.0899 | 78.0 | 468 | 0.0680 |
| 0.0873 | 79.0 | 474 | 0.0749 |
| 0.1007 | 80.0 | 480 | 0.0679 |
| 0.0873 | 81.0 | 486 | 0.0692 |
| 0.0876 | 82.0 | 492 | 0.0897 |
| 0.1011 | 83.0 | 498 | 0.0743 |
| 0.0887 | 84.0 | 504 | 0.0714 |
| 0.0938 | 85.0 | 510 | 0.0623 |
| 0.0806 | 86.0 | 516 | 0.0704 |
| 0.0914 | 87.0 | 522 | 0.0540 |
| 0.0665 | 88.0 | 528 | 0.0421 |
| 0.0664 | 89.0 | 534 | 0.0429 |
| 0.0671 | 90.0 | 540 | 0.0366 |
| 0.0520 | 91.0 | 546 | 0.0301 |
| 0.0499 | 92.0 | 552 | 0.0278 |
| 0.0473 | 93.0 | 558 | 0.0305 |
| 0.0395 | 94.0 | 564 | 0.0244 |
| 0.0394 | 95.0 | 570 | 0.0589 |
| 0.0686 | 96.0 | 576 | 0.0294 |
| 0.0399 | 97.0 | 582 | 0.0388 |
| 0.0385 | 98.0 | 588 | 0.0144 |
| 0.0362 | 99.0 | 594 | 0.0128 |
| 0.0339 | 100.0 | 600 | 0.0140 |
| 0.0280 | 101.0 | 606 | 0.0172 |
| 0.0227 | 102.0 | 612 | 0.0222 |
| 0.0478 | 103.0 | 618 | 0.0100 |
| 0.0211 | 104.0 | 624 | 0.0095 |
| 0.0184 | 105.0 | 630 | 0.0078 |
| 0.0156 | 106.0 | 636 | 0.0088 |
| 0.0149 | 107.0 | 642 | 0.0067 |
| 0.0113 | 108.0 | 648 | 0.0059 |
| 0.0098 | 109.0 | 654 | 0.0052 |
| 0.0098 | 110.0 | 660 | 0.0076 |
| 0.0098 | 111.0 | 666 | 0.0065 |
| 0.0075 | 112.0 | 672 | 0.0072 |
| 0.0075 | 113.0 | 678 | 0.0054 |
| 0.0065 | 114.0 | 684 | 0.0069 |
| 0.0112 | 115.0 | 690 | 0.0062 |
| 0.0144 | 116.0 | 696 | 0.0057 |
| 0.0298 | 117.0 | 702 | 0.0093 |
| 0.0179 | 118.0 | 708 | 0.0088 |
| 0.0140 | 119.0 | 714 | 0.0071 |
| 0.0086 | 120.0 | 720 | 0.0060 |
| 0.0076 | 121.0 | 726 | 0.0035 |
| 0.0066 | 122.0 | 732 | 0.0040 |
| 0.0057 | 123.0 | 738 | 0.0036 |
| 0.0052 | 124.0 | 744 | 0.0028 |
| 0.0050 | 125.0 | 750 | 0.0034 |
| 0.0055 | 126.0 | 756 | 0.0076 |
| 0.0060 | 127.0 | 762 | 0.0029 |
| 0.0074 | 128.0 | 768 | 0.0040 |
| 0.0117 | 129.0 | 774 | 0.0065 |
| 0.0073 | 130.0 | 780 | 0.0056 |
| 0.0061 | 131.0 | 786 | 0.0048 |
| 0.0048 | 132.0 | 792 | 0.0050 |
| 0.0039 | 133.0 | 798 | 0.0043 |
| 0.0039 | 134.0 | 804 | 0.0042 |
| 0.0040 | 135.0 | 810 | 0.0049 |
| 0.0042 | 136.0 | 816 | 0.0047 |
| 0.0034 | 137.0 | 822 | 0.0040 |
| 0.0033 | 138.0 | 828 | 0.0038 |
| 0.0042 | 139.0 | 834 | 0.0037 |
| 0.0105 | 140.0 | 840 | 0.0043 |
| 0.0118 | 141.0 | 846 | 0.0042 |
| 0.0101 | 142.0 | 852 | 0.0026 |
| 0.0102 | 143.0 | 858 | 0.0049 |
| 0.0087 | 144.0 | 864 | 0.0048 |
| 0.0112 | 145.0 | 870 | 0.0039 |
| 0.0081 | 146.0 | 876 | 0.0039 |
| 0.0109 | 147.0 | 882 | 0.0033 |
| 0.0071 | 148.0 | 888 | 0.0029 |
| 0.0035 | 149.0 | 894 | 0.0039 |
| 0.0039 | 150.0 | 900 | 0.0035 |
| 0.0034 | 151.0 | 906 | 0.0033 |
| 0.0027 | 152.0 | 912 | 0.0035 |
| 0.0030 | 153.0 | 918 | 0.0050 |
| 0.0032 | 154.0 | 924 | 0.0073 |
| 0.0033 | 155.0 | 930 | 0.0067 |
| 0.0023 | 156.0 | 936 | 0.0051 |
| 0.0023 | 157.0 | 942 | 0.0038 |
| 0.0025 | 158.0 | 948 | 0.0027 |
| 0.0022 | 159.0 | 954 | 0.0031 |
| 0.0025 | 160.0 | 960 | 0.0037 |
| 0.0045 | 161.0 | 966 | 0.0035 |
| 0.0049 | 162.0 | 972 | 0.0053 |
| 0.0045 | 163.0 | 978 | 0.0046 |
| 0.0041 | 164.0 | 984 | 0.0054 |
| 0.0032 | 165.0 | 990 | 0.0055 |
| 0.0026 | 166.0 | 996 | 0.0049 |
| 0.0031 | 167.0 | 1002 | 0.0044 |
| 0.0043 | 168.0 | 1008 | 0.0039 |
| 0.0048 | 169.0 | 1014 | 0.0042 |
| 0.0051 | 170.0 | 1020 | 0.0030 |
| 0.0045 | 171.0 | 1026 | 0.0072 |
| 0.0080 | 172.0 | 1032 | 0.0047 |
| 0.0033 | 173.0 | 1038 | 0.0039 |
| 0.0034 | 174.0 | 1044 | 0.0043 |
| 0.0026 | 175.0 | 1050 | 0.0047 |
| 0.0026 | 176.0 | 1056 | 0.0049 |
| 0.0027 | 177.0 | 1062 | 0.0047 |
| 0.0021 | 178.0 | 1068 | 0.0044 |
| 0.0018 | 179.0 | 1074 | 0.0044 |
| 0.0019 | 180.0 | 1080 | 0.0042 |
| 0.0021 | 181.0 | 1086 | 0.0047 |
| 0.0020 | 182.0 | 1092 | 0.0054 |
| 0.0017 | 183.0 | 1098 | 0.0056 |
| 0.0018 | 184.0 | 1104 | 0.0053 |
| 0.0019 | 185.0 | 1110 | 0.0049 |
| 0.0016 | 186.0 | 1116 | 0.0048 |
| 0.0019 | 187.0 | 1122 | 0.0048 |
| 0.0020 | 188.0 | 1128 | 0.0047 |
| 0.0015 | 189.0 | 1134 | 0.0045 |
| 0.0024 | 190.0 | 1140 | 0.0045 |
| 0.0013 | 191.0 | 1146 | 0.0045 |
| 0.0017 | 192.0 | 1152 | 0.0046 |
| 0.0018 | 193.0 | 1158 | 0.0047 |
| 0.0013 | 194.0 | 1164 | 0.0047 |
| 0.0014 | 195.0 | 1170 | 0.0047 |
| 0.0014 | 196.0 | 1176 | 0.0047 |
| 0.0016 | 197.0 | 1182 | 0.0046 |
| 0.0013 | 198.0 | 1188 | 0.0046 |
| 0.0016 | 199.0 | 1194 | 0.0045 |
| 0.0016 | 200.0 | 1200 | 0.0045 |

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
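
A quick way to check that a local environment matches these versions is to print them directly; the expected values in the comments are taken from the list above.

```python
# Compare installed versions against those listed in this card.
import datasets
import tokenizers
import torch
import transformers

print(transformers.__version__)  # expected: 5.0.0
print(torch.__version__)         # expected: 2.10.0+cu128
print(datasets.__version__)      # expected: 4.0.0
print(tokenizers.__version__)    # expected: 0.22.2
```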

Model size

  • 7.78M parameters (safetensors, F32)