train_cola_1744902674

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1450
  • Num Input Tokens Seen: 30508240
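
If this repository hosts a PEFT adapter for the base model (the framework versions listed at the bottom of this card suggest it does), it can presumably be loaded along these lines. This is a minimal sketch, not confirmed by the card: the repo id `rbelanec/train_cola_1744902674` is taken from the model page, and the use of `AutoPeftModelForCausalLM` is an assumption.

```python
# Minimal loading sketch, assuming this repo is a PEFT adapter
# (e.g. LoRA) on top of meta-llama/Meta-Llama-3-8B-Instruct.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "rbelanec/train_cola_1744902674",   # assumed repo id
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```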

Model description

This is a PEFT adapter (see the framework versions below) fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset. No further details have been provided.

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned and evaluated on the cola dataset, i.e. the Corpus of Linguistic Acceptability (CoLA), a binary grammatical-acceptability task from the GLUE benchmark. No further details on preprocessing or splits have been provided.
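
For reference, CoLA as distributed through 🤗 Datasets can be loaded as below; whether training used exactly this GLUE config is an assumption.

```python
# Minimal sketch, assuming "cola" refers to the CoLA config of GLUE
# as distributed through the datasets library.
from datasets import load_dataset

cola = load_dataset("glue", "cola")
print(cola)               # train / validation / test splits
print(cola["train"][0])   # {'sentence': ..., 'label': 0 or 1, 'idx': ...}
```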

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
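
For orientation, here is a minimal sketch of how these values might map onto `transformers.TrainingArguments`, assuming the standard 🤗 Trainer API was used; `output_dir` and any setting not listed above are assumptions.

```python
# Sketch of the hyperparameters above as TrainingArguments; only the
# values listed in this card are taken from the actual run.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_cola_1744902674",  # hypothetical
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    gradient_accumulation_steps=4,       # 4 x 4 = 16 total train batch size
    lr_scheduler_type="cosine",
    max_steps=40000,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```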

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.1957 | 0.4158 | 200 | 0.1817 | 153120 |
| 0.1135 | 0.8316 | 400 | 0.1695 | 305504 |
| 0.1838 | 1.2474 | 600 | 0.1719 | 458648 |
| 0.1714 | 1.6632 | 800 | 0.1582 | 610680 |
| 0.1598 | 2.0790 | 1000 | 0.1652 | 763880 |
| 0.1457 | 2.4948 | 1200 | 0.1684 | 916648 |
| 0.1318 | 2.9106 | 1400 | 0.1617 | 1068552 |
| 0.1254 | 3.3264 | 1600 | 0.1538 | 1220928 |
| 0.1177 | 3.7422 | 1800 | 0.1547 | 1373952 |
| 0.1587 | 4.1580 | 2000 | 0.1559 | 1526312 |
| 0.1062 | 4.5738 | 2200 | 0.1582 | 1678248 |
| 0.139 | 4.9896 | 2400 | 0.1594 | 1831112 |
| 0.0888 | 5.4054 | 2600 | 0.1484 | 1983296 |
| 0.1192 | 5.8212 | 2800 | 0.1509 | 2135968 |
| 0.1676 | 6.2370 | 3000 | 0.1498 | 2289200 |
| 0.0981 | 6.6528 | 3200 | 0.1545 | 2441648 |
| 0.1514 | 7.0686 | 3400 | 0.1543 | 2593344 |
| 0.0984 | 7.4844 | 3600 | 0.1450 | 2745792 |
| 0.0861 | 7.9002 | 3800 | 0.1477 | 2898816 |
| 0.1052 | 8.3160 | 4000 | 0.1481 | 3050480 |
| 0.1181 | 8.7318 | 4200 | 0.1555 | 3202864 |
| 0.0822 | 9.1476 | 4400 | 0.1496 | 3355680 |
| 0.1244 | 9.5634 | 4600 | 0.1713 | 3508192 |
| 0.1002 | 9.9792 | 4800 | 0.1487 | 3661568 |
| 0.0598 | 10.3950 | 5000 | 0.1683 | 3813552 |
| 0.0772 | 10.8108 | 5200 | 0.1500 | 3967024 |
| 0.0858 | 11.2266 | 5400 | 0.1604 | 4120032 |
| 0.193 | 11.6424 | 5600 | 0.1741 | 4272608 |
| 0.108 | 12.0582 | 5800 | 0.1718 | 4424280 |
| 0.0874 | 12.4740 | 6000 | 0.1606 | 4575480 |
| 0.0524 | 12.8898 | 6200 | 0.1681 | 4728792 |
| 0.064 | 13.3056 | 6400 | 0.1905 | 4880880 |
| 0.0944 | 13.7214 | 6600 | 0.1584 | 5034608 |
| 0.0552 | 14.1372 | 6800 | 0.1709 | 5186400 |
| 0.0498 | 14.5530 | 7000 | 0.1980 | 5339008 |
| 0.0986 | 14.9688 | 7200 | 0.1659 | 5491424 |
| 0.0387 | 15.3846 | 7400 | 0.1943 | 5644520 |
| 0.0439 | 15.8004 | 7600 | 0.1749 | 5796744 |
| 0.0442 | 16.2162 | 7800 | 0.2338 | 5949536 |
| 0.1374 | 16.6320 | 8000 | 0.2452 | 6102304 |
| 0.0437 | 17.0478 | 8200 | 0.2341 | 6254288 |
| 0.0768 | 17.4636 | 8400 | 0.2073 | 6407504 |
| 0.0372 | 17.8794 | 8600 | 0.2091 | 6559760 |
| 0.1061 | 18.2952 | 8800 | 0.2211 | 6711968 |
| 0.0459 | 18.7110 | 9000 | 0.2630 | 6864736 |
| 0.0671 | 19.1268 | 9200 | 0.2293 | 7016944 |
| 0.0401 | 19.5426 | 9400 | 0.2356 | 7169456 |
| 0.0318 | 19.9584 | 9600 | 0.2766 | 7322736 |
| 0.0566 | 20.3742 | 9800 | 0.2679 | 7474848 |
| 0.0274 | 20.7900 | 10000 | 0.2942 | 7627360 |
| 0.087 | 21.2058 | 10200 | 0.2988 | 7779952 |
| 0.055 | 21.6216 | 10400 | 0.2842 | 7932848 |
| 0.0222 | 22.0374 | 10600 | 0.2714 | 8085448 |
| 0.0417 | 22.4532 | 10800 | 0.3261 | 8237768 |
| 0.0358 | 22.8690 | 11000 | 0.2791 | 8390664 |
| 0.0325 | 23.2848 | 11200 | 0.3150 | 8543280 |
| 0.0052 | 23.7006 | 11400 | 0.3346 | 8696432 |
| 0.0932 | 24.1164 | 11600 | 0.3394 | 8849408 |
| 0.0061 | 24.5322 | 11800 | 0.3440 | 9001408 |
| 0.0246 | 24.9480 | 12000 | 0.3293 | 9153696 |
| 0.0253 | 25.3638 | 12200 | 0.3331 | 9307088 |
| 0.026 | 25.7796 | 12400 | 0.3708 | 9459824 |
| 0.0036 | 26.1954 | 12600 | 0.3640 | 9611704 |
| 0.0035 | 26.6112 | 12800 | 0.3401 | 9764344 |
| 0.0368 | 27.0270 | 13000 | 0.3367 | 9917064 |
| 0.0031 | 27.4428 | 13200 | 0.4020 | 10068520 |
| 0.0017 | 27.8586 | 13400 | 0.3679 | 10221224 |
| 0.0018 | 28.2744 | 13600 | 0.3864 | 10373912 |
| 0.0012 | 28.6902 | 13800 | 0.4108 | 10526808 |
| 0.003 | 29.1060 | 14000 | 0.3892 | 10678976 |
| 0.0045 | 29.5218 | 14200 | 0.3954 | 10831520 |
| 0.0188 | 29.9376 | 14400 | 0.4060 | 10984224 |
| 0.0022 | 30.3534 | 14600 | 0.4303 | 11135896 |
| 0.0027 | 30.7692 | 14800 | 0.4427 | 11288728 |
| 0.002 | 31.1850 | 15000 | 0.4246 | 11441040 |
| 0.0021 | 31.6008 | 15200 | 0.4266 | 11593456 |
| 0.0433 | 32.0166 | 15400 | 0.4899 | 11745744 |
| 0.001 | 32.4324 | 15600 | 0.4568 | 11898672 |
| 0.0025 | 32.8482 | 15800 | 0.5007 | 12050992 |
| 0.0293 | 33.2640 | 16000 | 0.5254 | 12204352 |
| 0.0269 | 33.6798 | 16200 | 0.5383 | 12356224 |
| 0.002 | 34.0956 | 16400 | 0.5557 | 12507960 |
| 0.0355 | 34.5114 | 16600 | 0.5490 | 12660760 |
| 0.0007 | 34.9272 | 16800 | 0.5680 | 12813272 |
| 0.0035 | 35.3430 | 17000 | 0.5824 | 12965896 |
| 0.0293 | 35.7588 | 17200 | 0.6039 | 13118824 |
| 0.0007 | 36.1746 | 17400 | 0.6206 | 13271872 |
| 0.0128 | 36.5904 | 17600 | 0.6462 | 13424128 |
| 0.0086 | 37.0062 | 17800 | 0.6276 | 13576056 |
| 0.0139 | 37.4220 | 18000 | 0.6350 | 13728696 |
| 0.0007 | 37.8378 | 18200 | 0.6730 | 13881368 |
| 0.0002 | 38.2536 | 18400 | 0.6929 | 14033616 |
| 0.0468 | 38.6694 | 18600 | 0.6921 | 14185616 |
| 0.0251 | 39.0852 | 18800 | 0.7073 | 14338720 |
| 0.0 | 39.5010 | 19000 | 0.7611 | 14490240 |
| 0.0028 | 39.9168 | 19200 | 0.7695 | 14643072 |
| 0.0001 | 40.3326 | 19400 | 0.7628 | 14795184 |
| 0.0005 | 40.7484 | 19600 | 0.7207 | 14947312 |
| 0.0 | 41.1642 | 19800 | 0.7724 | 15100336 |
| 0.0495 | 41.5800 | 20000 | 0.7625 | 15252464 |
| 0.0002 | 41.9958 | 20200 | 0.8545 | 15404912 |
| 0.0001 | 42.4116 | 20400 | 0.8109 | 15557176 |
| 0.0252 | 42.8274 | 20600 | 0.7835 | 15709912 |
| 0.0001 | 43.2432 | 20800 | 0.8044 | 15862336 |
| 0.0 | 43.6590 | 21000 | 0.8274 | 16014304 |
| 0.0004 | 44.0748 | 21200 | 0.8358 | 16166680 |
| 0.0231 | 44.4906 | 21400 | 0.8041 | 16320408 |
| 0.0004 | 44.9064 | 21600 | 0.8178 | 16472888 |
| 0.0003 | 45.3222 | 21800 | 0.8642 | 16625808 |
| 0.0001 | 45.7380 | 22000 | 0.8394 | 16778288 |
| 0.0001 | 46.1538 | 22200 | 0.8546 | 16931560 |
| 0.0 | 46.5696 | 22400 | 0.8646 | 17083880 |
| 0.0003 | 46.9854 | 22600 | 0.8434 | 17235976 |
| 0.0 | 47.4012 | 22800 | 0.8887 | 17388152 |
| 0.0 | 47.8170 | 23000 | 0.8348 | 17540824 |
| 0.0009 | 48.2328 | 23200 | 0.8680 | 17693912 |
| 0.0039 | 48.6486 | 23400 | 0.8540 | 17846296 |
| 0.0 | 49.0644 | 23600 | 0.8674 | 17998760 |
| 0.0002 | 49.4802 | 23800 | 0.8551 | 18152072 |
| 0.0456 | 49.8960 | 24000 | 0.8905 | 18304072 |
| 0.0 | 50.3119 | 24200 | 0.8950 | 18455696 |
| 0.0 | 50.7277 | 24400 | 0.9257 | 18608976 |
| 0.0 | 51.1435 | 24600 | 0.8666 | 18760928 |
| 0.0 | 51.5593 | 24800 | 0.8926 | 18913856 |
| 0.0001 | 51.9751 | 25000 | 0.8867 | 19066528 |
| 0.0271 | 52.3909 | 25200 | 0.8797 | 19218616 |
| 0.0 | 52.8067 | 25400 | 0.8724 | 19370872 |
| 0.0 | 53.2225 | 25600 | 0.8797 | 19524232 |
| 0.0 | 53.6383 | 25800 | 0.8288 | 19676456 |
| 0.0282 | 54.0541 | 26000 | 0.8787 | 19828504 |
| 0.0054 | 54.4699 | 26200 | 0.8743 | 19980856 |
| 0.0343 | 54.8857 | 26400 | 0.8487 | 20133784 |
| 0.0101 | 55.3015 | 26600 | 0.8790 | 20286120 |
| 0.0 | 55.7173 | 26800 | 0.8435 | 20439016 |
| 0.0001 | 56.1331 | 27000 | 0.8624 | 20591320 |
| 0.0 | 56.5489 | 27200 | 0.8957 | 20743736 |
| 0.0002 | 56.9647 | 27400 | 0.8590 | 20896184 |
| 0.0 | 57.3805 | 27600 | 0.8863 | 21049160 |
| 0.0 | 57.7963 | 27800 | 0.8608 | 21201640 |
| 0.0 | 58.2121 | 28000 | 0.8635 | 21354208 |
| 0.0001 | 58.6279 | 28200 | 0.8397 | 21506752 |
| 0.0 | 59.0437 | 28400 | 0.8804 | 21659696 |
| 0.0 | 59.4595 | 28600 | 0.8637 | 21811600 |
| 0.0 | 59.8753 | 28800 | 0.8831 | 21964272 |
| 0.0 | 60.2911 | 29000 | 0.8396 | 22116648 |
| 0.0 | 60.7069 | 29200 | 0.8828 | 22269032 |
| 0.0002 | 61.1227 | 29400 | 0.9062 | 22421944 |
| 0.0 | 61.5385 | 29600 | 0.8913 | 22574936 |
| 0.0001 | 61.9543 | 29800 | 0.8643 | 22727064 |
| 0.0 | 62.3701 | 30000 | 0.8615 | 22880256 |
| 0.0022 | 62.7859 | 30200 | 0.8683 | 23032800 |
| 0.0 | 63.2017 | 30400 | 0.8566 | 23184744 |
| 0.0001 | 63.6175 | 30600 | 0.8671 | 23336904 |
| 0.0 | 64.0333 | 30800 | 0.8533 | 23489432 |
| 0.0001 | 64.4491 | 31000 | 0.8689 | 23641496 |
| 0.0 | 64.8649 | 31200 | 0.8734 | 23794744 |
| 0.0 | 65.2807 | 31400 | 0.8683 | 23947688 |
| 0.0047 | 65.6965 | 31600 | 0.8709 | 24099432 |
| 0.0004 | 66.1123 | 31800 | 0.8824 | 24251200 |
| 0.0 | 66.5281 | 32000 | 0.8991 | 24404736 |
| 0.0064 | 66.9439 | 32200 | 0.8599 | 24557120 |
| 0.0 | 67.3597 | 32400 | 0.8702 | 24709616 |
| 0.0 | 67.7755 | 32600 | 0.8736 | 24862224 |
| 0.0 | 68.1913 | 32800 | 0.8590 | 25015296 |
| 0.0001 | 68.6071 | 33000 | 0.8721 | 25167744 |
| 0.0 | 69.0229 | 33200 | 0.8601 | 25321016 |
| 0.0022 | 69.4387 | 33400 | 0.8809 | 25473368 |
| 0.0 | 69.8545 | 33600 | 0.8834 | 25626520 |
| 0.0016 | 70.2703 | 33800 | 0.8706 | 25778248 |
| 0.0001 | 70.6861 | 34000 | 0.8782 | 25930920 |
| 0.0 | 71.1019 | 34200 | 0.8792 | 26083456 |
| 0.0 | 71.5177 | 34400 | 0.9009 | 26235552 |
| 0.0 | 71.9335 | 34600 | 0.8789 | 26388832 |
| 0.0 | 72.3493 | 34800 | 0.8802 | 26541680 |
| 0.0 | 72.7651 | 35000 | 0.8647 | 26694832 |
| 0.0014 | 73.1809 | 35200 | 0.8723 | 26847168 |
| 0.0 | 73.5967 | 35400 | 0.8574 | 27000096 |
| 0.0 | 74.0125 | 35600 | 0.8642 | 27151800 |
| 0.0143 | 74.4283 | 35800 | 0.8676 | 27304152 |
| 0.0 | 74.8441 | 36000 | 0.8728 | 27456856 |
| 0.0 | 75.2599 | 36200 | 0.8842 | 27610376 |
| 0.0 | 75.6757 | 36400 | 0.8783 | 27762984 |
| 0.0 | 76.0915 | 36600 | 0.8702 | 27915504 |
| 0.0059 | 76.5073 | 36800 | 0.8630 | 28068432 |
| 0.0022 | 76.9231 | 37000 | 0.8839 | 28220720 |
| 0.0 | 77.3389 | 37200 | 0.8861 | 28373600 |
| 0.0 | 77.7547 | 37400 | 0.8690 | 28526304 |
| 0.0042 | 78.1705 | 37600 | 0.8750 | 28678672 |
| 0.0 | 78.5863 | 37800 | 0.8820 | 28831632 |
| 0.0 | 79.0021 | 38000 | 0.8786 | 28983144 |
| 0.0 | 79.4179 | 38200 | 0.8864 | 29136008 |
| 0.0 | 79.8337 | 38400 | 0.8769 | 29288104 |
| 0.0 | 80.2495 | 38600 | 0.8865 | 29440312 |
| 0.0049 | 80.6653 | 38800 | 0.8902 | 29592888 |
| 0.0 | 81.0811 | 39000 | 0.8877 | 29745320 |
| 0.0 | 81.4969 | 39200 | 0.8789 | 29898600 |
| 0.0 | 81.9127 | 39400 | 0.8734 | 30050504 |
| 0.0059 | 82.3285 | 39600 | 0.8737 | 30203576 |
| 0.0 | 82.7443 | 39800 | 0.8784 | 30356408 |
| 0.0 | 83.1601 | 40000 | 0.8806 | 30508240 |
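
Note that the validation loss reaches its minimum of 0.1450 at step 3600, matching the evaluation loss reported at the top of this card, and trends upward for the remaining ~36,000 steps, a typical overfitting pattern. Runs like this are often capped with early stopping; below is a minimal sketch assuming the 🤗 Trainer API, with all names illustrative rather than taken from the actual training script.

```python
# Early-stopping sketch, assuming the 🤗 Trainer API; names are
# illustrative, not taken from the actual training script.
from transformers import EarlyStoppingCallback, TrainingArguments

args = TrainingArguments(
    output_dir="train_cola_1744902674",   # hypothetical
    eval_strategy="steps",
    eval_steps=200,                       # matches the table's eval interval
    save_strategy="steps",
    save_steps=200,                       # must align with eval steps
    load_best_model_at_end=True,          # restore the lowest-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Passed to Trainer(..., callbacks=[...]): stops training after 5
# consecutive evaluations without improvement in eval_loss.
early_stop = EarlyStoppingCallback(early_stopping_patience=5)
```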

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1