GSM8K-Binary_Llama-3.2-1B-qjearsjc

This model is a fine-tuned version of meta-llama/Llama-3.2-1B. The training dataset is not documented here, although the model name points to a binary-answer variant of GSM8K. It achieves the following results on the evaluation set; these values correspond to the epoch-10 checkpoint in the training results table below, which has the best overall accuracy (a quick consistency check of the aggregate metrics follows the list):

  • Loss: 1.3357
  • Model Preparation Time: 0.0056
  • Mdl: 4769.2624
  • Accumulated Loss: 3305.8008
  • Correct Preds: 1535.0
  • Total Preds: 2475.0
  • Accuracy: 0.6202
  • Correct Gen Preds: 462.0
  • Gen Accuracy: 0.1867
  • Correct Gen Preds 34192: 259.0
  • Correct Preds 34192: 766.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.6405
  • Gen Accuracy 34192: 0.2166
  • Correct Gen Preds 41568: 195.0
  • Correct Preds 41568: 769.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.6069
  • Gen Accuracy 41568: 0.1539
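The aggregate numbers above are internally consistent under a straightforward reading of the metric names (my interpretation, not something stated on the card): the accumulated loss is roughly the mean loss times the number of predictions, Mdl looks like the accumulated loss converted from nats to bits, and each accuracy is the ratio of correct to total predictions. A minimal sketch of these checks:

```python
import math

loss = 1.3357            # mean evaluation loss (nats per prediction)
total_preds = 2475
accumulated_loss = 3305.8008
mdl = 4769.2624
correct_preds = 1535

# Accumulated loss ~ mean loss * number of predictions.
print(loss * total_preds)              # ~3305.9, close to 3305.8008
# Mdl appears to be the accumulated loss in bits (nats / ln 2).
print(accumulated_loss / math.log(2))  # ~4769.26
# Accuracy = correct predictions / total predictions.
print(correct_preds / total_preds)     # ~0.6202
```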

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch of an equivalent TrainingArguments configuration follows the list:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
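
The following is a minimal sketch of how these settings map onto Hugging Face TrainingArguments, assuming train_batch_size and eval_batch_size are per-device values; the output directory is a placeholder, not taken from the card:

```python
from transformers import TrainingArguments

# Hypothetical output_dir; the remaining values mirror the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="gsm8k-binary-llama-3.2-1b",
    learning_rate=2e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```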

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0056 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 2.159 | 1.0 | 2 | 2.4593 | 0.0056 | 8781.1889 | 6086.6563 | 1266.0 | 2475.0 | 0.5115 | 1271.0 | 0.5135 | 0.0 | 0.0 | 1196.0 | 0.0 | 0.0 | 1263.0 | 1266.0 | 1267.0 | 0.9992 | 0.9968 |
| 0.9326 | 2.0 | 4 | 2.9664 | 0.0056 | 10591.9006 | 7341.7460 | 1196.0 | 2475.0 | 0.4832 | 8.0 | 0.0032 | 0.0 | 1196.0 | 1196.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.934 | 3.0 | 6 | 0.8287 | 0.0056 | 2959.1705 | 2051.1407 | 1223.0 | 2475.0 | 0.4941 | 7.0 | 0.0028 | 0.0 | 1144.0 | 1196.0 | 0.9565 | 0.0 | 0.0 | 79.0 | 1267.0 | 0.0624 | 0.0 |
| 1.266 | 4.0 | 8 | 0.8432 | 0.0056 | 3010.6868 | 2086.8491 | 1266.0 | 2475.0 | 0.5115 | 8.0 | 0.0032 | 0.0 | 3.0 | 1196.0 | 0.0025 | 0.0 | 0.0 | 1263.0 | 1267.0 | 0.9968 | 0.0 |
| 1.1255 | 5.0 | 10 | 0.7617 | 0.0056 | 2719.6432 | 1885.1130 | 1292.0 | 2475.0 | 0.5220 | 8.0 | 0.0032 | 0.0 | 1120.0 | 1196.0 | 0.9365 | 0.0 | 0.0 | 172.0 | 1267.0 | 0.1358 | 0.0 |
| 0.5485 | 6.0 | 12 | 0.7261 | 0.0056 | 2592.7188 | 1797.1357 | 1434.0 | 2475.0 | 0.5794 | 8.0 | 0.0032 | 0.0 | 576.0 | 1196.0 | 0.4816 | 0.0 | 0.0 | 858.0 | 1267.0 | 0.6772 | 0.0 |
| 0.774 | 7.0 | 14 | 0.7111 | 0.0056 | 2538.9261 | 1759.8495 | 1522.0 | 2475.0 | 0.6149 | 8.0 | 0.0032 | 0.0 | 731.0 | 1196.0 | 0.6112 | 0.0 | 0.0 | 791.0 | 1267.0 | 0.6243 | 0.0 |
| 0.7172 | 8.0 | 16 | 0.8789 | 0.0056 | 3138.2628 | 2175.2780 | 1486.0 | 2475.0 | 0.6004 | 8.0 | 0.0032 | 0.0 | 587.0 | 1196.0 | 0.4908 | 0.0 | 0.0 | 899.0 | 1267.0 | 0.7096 | 0.0 |
| 0.0124 | 9.0 | 18 | 1.0022 | 0.0056 | 3578.4299 | 2480.3786 | 1519.0 | 2475.0 | 0.6137 | 58.0 | 0.0234 | 31.0 | 607.0 | 1196.0 | 0.5075 | 0.0259 | 19.0 | 912.0 | 1267.0 | 0.7198 | 0.0150 |
| 0.5132 | 10.0 | 20 | 1.3357 | 0.0056 | 4769.2624 | 3305.8008 | 1535.0 | 2475.0 | 0.6202 | 462.0 | 0.1867 | 259.0 | 766.0 | 1196.0 | 0.6405 | 0.2166 | 195.0 | 769.0 | 1267.0 | 0.6069 | 0.1539 |
| 0.5116 | 11.0 | 22 | 1.7177 | 0.0056 | 6133.3370 | 4251.3052 | 1510.0 | 2475.0 | 0.6101 | 958.0 | 0.3871 | 432.0 | 726.0 | 1196.0 | 0.6070 | 0.3612 | 517.0 | 784.0 | 1267.0 | 0.6188 | 0.4081 |
| 0.5115 | 12.0 | 24 | 2.0096 | 0.0056 | 7175.7009 | 4973.8169 | 1505.0 | 2475.0 | 0.6081 | 1186.0 | 0.4792 | 510.0 | 694.0 | 1196.0 | 0.5803 | 0.4264 | 667.0 | 811.0 | 1267.0 | 0.6401 | 0.5264 |
| 0.0 | 13.0 | 26 | 2.2179 | 0.0056 | 7919.3806 | 5489.2963 | 1481.0 | 2475.0 | 0.5984 | 1283.0 | 0.5184 | 546.0 | 662.0 | 1196.0 | 0.5535 | 0.4565 | 728.0 | 819.0 | 1267.0 | 0.6464 | 0.5746 |
| 0.0 | 14.0 | 28 | 2.3460 | 0.0056 | 8376.7131 | 5806.2950 | 1486.0 | 2475.0 | 0.6004 | 1339.0 | 0.5410 | 565.0 | 649.0 | 1196.0 | 0.5426 | 0.4724 | 765.0 | 837.0 | 1267.0 | 0.6606 | 0.6038 |
| 0.0 | 15.0 | 30 | 2.4274 | 0.0056 | 8667.6229 | 6007.9384 | 1483.0 | 2475.0 | 0.5992 | 1361.0 | 0.5499 | 574.0 | 643.0 | 1196.0 | 0.5376 | 0.4799 | 778.0 | 840.0 | 1267.0 | 0.6630 | 0.6140 |
| 0.5114 | 16.0 | 32 | 2.4736 | 0.0056 | 8832.2736 | 6122.0655 | 1480.0 | 2475.0 | 0.5980 | 1385.0 | 0.5596 | 586.0 | 643.0 | 1196.0 | 0.5376 | 0.4900 | 790.0 | 837.0 | 1267.0 | 0.6606 | 0.6235 |
| 0.0 | 17.0 | 34 | 2.4918 | 0.0056 | 8897.3192 | 6167.1517 | 1479.0 | 2475.0 | 0.5976 | 1396.0 | 0.5640 | 596.0 | 646.0 | 1196.0 | 0.5401 | 0.4983 | 791.0 | 833.0 | 1267.0 | 0.6575 | 0.6243 |
| 0.5114 | 18.0 | 36 | 2.5053 | 0.0056 | 8945.5765 | 6200.6011 | 1476.0 | 2475.0 | 0.5964 | 1399.0 | 0.5653 | 600.0 | 647.0 | 1196.0 | 0.5410 | 0.5017 | 790.0 | 829.0 | 1267.0 | 0.6543 | 0.6235 |
| 0.0 | 19.0 | 38 | 2.5135 | 0.0056 | 8975.0189 | 6221.0090 | 1479.0 | 2475.0 | 0.5976 | 1404.0 | 0.5673 | 606.0 | 651.0 | 1196.0 | 0.5443 | 0.5067 | 789.0 | 828.0 | 1267.0 | 0.6535 | 0.6227 |
| 0.5114 | 20.0 | 40 | 2.5146 | 0.0056 | 8978.6771 | 6223.5447 | 1481.0 | 2475.0 | 0.5984 | 1410.0 | 0.5697 | 613.0 | 658.0 | 1196.0 | 0.5502 | 0.5125 | 788.0 | 823.0 | 1267.0 | 0.6496 | 0.6219 |
| 0.5114 | 21.0 | 42 | 2.5164 | 0.0056 | 8985.2387 | 6228.0929 | 1481.0 | 2475.0 | 0.5984 | 1414.0 | 0.5713 | 619.0 | 662.0 | 1196.0 | 0.5535 | 0.5176 | 786.0 | 819.0 | 1267.0 | 0.6464 | 0.6204 |
| 0.0 | 22.0 | 44 | 2.5151 | 0.0056 | 8980.5739 | 6224.8595 | 1485.0 | 2475.0 | 0.6 | 1419.0 | 0.5733 | 625.0 | 665.0 | 1196.0 | 0.5560 | 0.5226 | 785.0 | 820.0 | 1267.0 | 0.6472 | 0.6196 |
| 0.0 | 23.0 | 46 | 2.5157 | 0.0056 | 8982.8389 | 6226.4294 | 1478.0 | 2475.0 | 0.5972 | 1415.0 | 0.5717 | 623.0 | 663.0 | 1196.0 | 0.5543 | 0.5209 | 783.0 | 815.0 | 1267.0 | 0.6433 | 0.6180 |
| 0.0 | 24.0 | 48 | 2.5154 | 0.0056 | 8981.6130 | 6225.5797 | 1489.0 | 2475.0 | 0.6016 | 1421.0 | 0.5741 | 629.0 | 670.0 | 1196.0 | 0.5602 | 0.5259 | 783.0 | 819.0 | 1267.0 | 0.6464 | 0.6180 |
| 0.0 | 25.0 | 50 | 2.5149 | 0.0056 | 8979.8291 | 6224.3432 | 1483.0 | 2475.0 | 0.5992 | 1418.0 | 0.5729 | 625.0 | 668.0 | 1196.0 | 0.5585 | 0.5226 | 784.0 | 815.0 | 1267.0 | 0.6433 | 0.6188 |
| 0.5114 | 26.0 | 52 | 2.5138 | 0.0056 | 8975.9674 | 6221.6665 | 1485.0 | 2475.0 | 0.6 | 1422.0 | 0.5745 | 631.0 | 672.0 | 1196.0 | 0.5619 | 0.5276 | 782.0 | 813.0 | 1267.0 | 0.6417 | 0.6172 |
| 0.5114 | 27.0 | 54 | 2.5126 | 0.0056 | 8971.6565 | 6218.6784 | 1480.0 | 2475.0 | 0.5980 | 1419.0 | 0.5733 | 630.0 | 667.0 | 1196.0 | 0.5577 | 0.5268 | 780.0 | 813.0 | 1267.0 | 0.6417 | 0.6156 |
| 0.0 | 28.0 | 56 | 2.5118 | 0.0056 | 8968.9510 | 6216.8031 | 1487.0 | 2475.0 | 0.6008 | 1424.0 | 0.5754 | 630.0 | 672.0 | 1196.0 | 0.5619 | 0.5268 | 785.0 | 815.0 | 1267.0 | 0.6433 | 0.6196 |
| 0.5114 | 29.0 | 58 | 2.5114 | 0.0056 | 8967.4311 | 6215.7496 | 1486.0 | 2475.0 | 0.6004 | 1424.0 | 0.5754 | 636.0 | 675.0 | 1196.0 | 0.5644 | 0.5318 | 779.0 | 811.0 | 1267.0 | 0.6401 | 0.6148 |
| 0.5114 | 30.0 | 60 | 2.5107 | 0.0056 | 8965.0040 | 6214.0673 | 1490.0 | 2475.0 | 0.6020 | 1425.0 | 0.5758 | 635.0 | 675.0 | 1196.0 | 0.5644 | 0.5309 | 781.0 | 815.0 | 1267.0 | 0.6433 | 0.6164 |
| 0.5114 | 31.0 | 62 | 2.5092 | 0.0056 | 8959.3649 | 6210.1585 | 1491.0 | 2475.0 | 0.6024 | 1427.0 | 0.5766 | 636.0 | 677.0 | 1196.0 | 0.5661 | 0.5318 | 782.0 | 814.0 | 1267.0 | 0.6425 | 0.6172 |
| 0.5114 | 32.0 | 64 | 2.5104 | 0.0056 | 8963.9824 | 6213.3591 | 1485.0 | 2475.0 | 0.6 | 1425.0 | 0.5758 | 635.0 | 674.0 | 1196.0 | 0.5635 | 0.5309 | 781.0 | 811.0 | 1267.0 | 0.6401 | 0.6164 |
| 0.5114 | 33.0 | 66 | 2.5094 | 0.0056 | 8960.2537 | 6210.7746 | 1487.0 | 2475.0 | 0.6008 | 1428.0 | 0.5770 | 639.0 | 676.0 | 1196.0 | 0.5652 | 0.5343 | 780.0 | 811.0 | 1267.0 | 0.6401 | 0.6156 |
| 0.0 | 34.0 | 68 | 2.5074 | 0.0056 | 8953.1812 | 6205.8723 | 1488.0 | 2475.0 | 0.6012 | 1425.0 | 0.5758 | 633.0 | 676.0 | 1196.0 | 0.5652 | 0.5293 | 783.0 | 812.0 | 1267.0 | 0.6409 | 0.6180 |
| 0.5114 | 35.0 | 70 | 2.5108 | 0.0056 | 8965.1034 | 6214.1362 | 1489.0 | 2475.0 | 0.6016 | 1427.0 | 0.5766 | 636.0 | 675.0 | 1196.0 | 0.5644 | 0.5318 | 782.0 | 814.0 | 1267.0 | 0.6425 | 0.6172 |
| 0.0 | 36.0 | 72 | 2.5116 | 0.0056 | 8968.2043 | 6216.2855 | 1490.0 | 2475.0 | 0.6020 | 1426.0 | 0.5762 | 637.0 | 677.0 | 1196.0 | 0.5661 | 0.5326 | 780.0 | 813.0 | 1267.0 | 0.6417 | 0.6156 |
| 0.0 | 37.0 | 74 | 2.5119 | 0.0056 | 8969.1267 | 6216.9249 | 1490.0 | 2475.0 | 0.6020 | 1426.0 | 0.5762 | 635.0 | 677.0 | 1196.0 | 0.5661 | 0.5309 | 782.0 | 813.0 | 1267.0 | 0.6417 | 0.6172 |
| 0.0 | 38.0 | 76 | 2.5082 | 0.0056 | 8955.9512 | 6207.7924 | 1490.0 | 2475.0 | 0.6020 | 1429.0 | 0.5774 | 639.0 | 677.0 | 1196.0 | 0.5661 | 0.5343 | 781.0 | 813.0 | 1267.0 | 0.6417 | 0.6164 |
| 0.0 | 39.0 | 78 | 2.5092 | 0.0056 | 8959.5687 | 6210.2998 | 1485.0 | 2475.0 | 0.6 | 1423.0 | 0.5749 | 633.0 | 674.0 | 1196.0 | 0.5635 | 0.5293 | 781.0 | 811.0 | 1267.0 | 0.6401 | 0.6164 |
| 0.0 | 40.0 | 80 | 2.5072 | 0.0056 | 8952.5199 | 6205.4140 | 1491.0 | 2475.0 | 0.6024 | 1426.0 | 0.5762 | 638.0 | 677.0 | 1196.0 | 0.5661 | 0.5334 | 779.0 | 814.0 | 1267.0 | 0.6425 | 0.6148 |

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
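
Assuming the checkpoint is published on the Hugging Face Hub under the repo id donoway/GSM8K-Binary_Llama-3.2-1B-qjearsjc (the name used on this card), a minimal loading sketch with the Transformers version listed above; the prompt is illustrative only, since the exact input format is not documented here:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id taken from the card title; adjust if the model lives elsewhere.
model_id = "donoway/GSM8K-Binary_Llama-3.2-1B-qjearsjc"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Example prompt only; the card suggests a binary-answer GSM8K task but does not
# specify the prompt template.
prompt = "Question: Is 48 an even number? Answer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```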