train_cola_1744902673

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the CoLA (Corpus of Linguistic Acceptability) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1251
  • Num Input Tokens Seen: 30508240

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
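The hyperparameters above can be sketched as a configuration dict. This is an illustration only, not the actual training script (which the card does not include); the key names follow the `transformers.TrainingArguments` convention, and the snippet also checks that the effective batch size works out to the `total_train_batch_size` reported above.

```python
# Hedged reconstruction of the training configuration listed above.
# Key names follow transformers.TrainingArguments conventions; the
# original training script is not part of this card.
config = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 123,
    "gradient_accumulation_steps": 4,
    "lr_scheduler_type": "cosine",
    "max_steps": 40_000,
}

# total_train_batch_size = per-device batch size * accumulation steps
effective_batch = (
    config["per_device_train_batch_size"]
    * config["gradient_accumulation_steps"]
)
# effective_batch == 16, matching total_train_batch_size above
```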

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.1566 0.4158 200 0.1606 153120
0.1063 0.8316 400 0.1401 305504
0.1544 1.2474 600 0.1582 458648
0.1243 1.6632 800 0.1251 610680
0.0686 2.0790 1000 0.1440 763880
0.0592 2.4948 1200 0.1684 916648
0.0938 2.9106 1400 0.1308 1068552
0.0571 3.3264 1600 0.1844 1220928
0.0262 3.7422 1800 0.1791 1373952
0.0496 4.1580 2000 0.1723 1526312
0.0438 4.5738 2200 0.1594 1678248
0.0843 4.9896 2400 0.2160 1831112
0.0215 5.4054 2600 0.2138 1983296
0.023 5.8212 2800 0.2471 2135968
0.0004 6.2370 3000 0.2878 2289200
0.0528 6.6528 3200 0.2229 2441648
0.0005 7.0686 3400 0.2865 2593344
0.0339 7.4844 3600 0.2519 2745792
0.0271 7.9002 3800 0.3225 2898816
0.0228 8.3160 4000 0.1990 3050480
0.023 8.7318 4200 0.2724 3202864
0.0322 9.1476 4400 0.2826 3355680
0.0199 9.5634 4600 0.3914 3508192
0.0205 9.9792 4800 0.2709 3661568
0.0001 10.3950 5000 0.3594 3813552
0.0004 10.8108 5200 0.3660 3967024
0.0001 11.2266 5400 0.3095 4120032
0.0551 11.6424 5600 0.3169 4272608
0.0001 12.0582 5800 0.3245 4424280
0.0001 12.4740 6000 0.3536 4575480
0.0035 12.8898 6200 0.2995 4728792
0.0532 13.3056 6400 0.3793 4880880
0.006 13.7214 6600 0.3337 5034608
0.0002 14.1372 6800 0.3331 5186400
0.0433 14.5530 7000 0.3456 5339008
0.0002 14.9688 7200 0.3547 5491424
0.0081 15.3846 7400 0.2288 5644520
0.0002 15.8004 7600 0.3016 5796744
0.0275 16.2162 7800 0.4400 5949536
0.0122 16.6320 8000 0.3850 6102304
0.005 17.0478 8200 0.2611 6254288
0.0004 17.4636 8400 0.2895 6407504
0.0004 17.8794 8600 0.3064 6559760
0.0 18.2952 8800 0.3379 6711968
0.0001 18.7110 9000 0.3845 6864736
0.0238 19.1268 9200 0.2616 7016944
0.0117 19.5426 9400 0.3396 7169456
0.0044 19.9584 9600 0.3177 7322736
0.0085 20.3742 9800 0.3556 7474848
0.0 20.7900 10000 0.5539 7627360
0.0004 21.2058 10200 0.3149 7779952
0.0026 21.6216 10400 0.3706 7932848
0.0062 22.0374 10600 0.3555 8085448
0.0005 22.4532 10800 0.3528 8237768
0.0 22.8690 11000 0.3308 8390664
0.0 23.2848 11200 0.5050 8543280
0.0 23.7006 11400 0.5456 8696432
0.0086 24.1164 11600 0.3712 8849408
0.0577 24.5322 11800 0.4017 9001408
0.0002 24.9480 12000 0.3498 9153696
0.0 25.3638 12200 0.3867 9307088
0.0001 25.7796 12400 0.3254 9459824
0.0001 26.1954 12600 0.3586 9611704
0.0 26.6112 12800 0.4128 9764344
0.0107 27.0270 13000 0.2730 9917064
0.0 27.4428 13200 0.3592 10068520
0.0 27.8586 13400 0.4470 10221224
0.0385 28.2744 13600 0.4294 10373912
0.0001 28.6902 13800 0.3679 10526808
0.0006 29.1060 14000 0.2722 10678976
0.0045 29.5218 14200 0.3160 10831520
0.0056 29.9376 14400 0.2948 10984224
0.0001 30.3534 14600 0.3797 11135896
0.0001 30.7692 14800 0.3789 11288728
0.0 31.1850 15000 0.3670 11441040
0.0001 31.6008 15200 0.3541 11593456
0.0035 32.0166 15400 0.4322 11745744
0.0001 32.4324 15600 0.3771 11898672
0.0002 32.8482 15800 0.3758 12050992
0.0 33.2640 16000 0.5030 12204352
0.0001 33.6798 16200 0.3608 12356224
0.0002 34.0956 16400 0.3631 12507960
0.0 34.5114 16600 0.4400 12660760
0.0 34.9272 16800 0.3928 12813272
0.0 35.3430 17000 0.4684 12965896
0.0 35.7588 17200 0.3981 13118824
0.0489 36.1746 17400 0.3710 13271872
0.0001 36.5904 17600 0.3502 13424128
0.0036 37.0062 17800 0.3781 13576056
0.0 37.4220 18000 0.4090 13728696
0.0 37.8378 18200 0.4388 13881368
0.0 38.2536 18400 0.4523 14033616
0.0032 38.6694 18600 0.4666 14185616
0.0 39.0852 18800 0.4749 14338720
0.0 39.5010 19000 0.4786 14490240
0.0 39.9168 19200 0.4887 14643072
0.0 40.3326 19400 0.4982 14795184
0.0 40.7484 19600 0.4999 14947312
0.0 41.1642 19800 0.4933 15100336
0.0035 41.5800 20000 0.5195 15252464
0.0 41.9958 20200 0.5241 15404912
0.0 42.4116 20400 0.5340 15557176
0.0032 42.8274 20600 0.5298 15709912
0.0 43.2432 20800 0.5290 15862336
0.0 43.6590 21000 0.5436 16014304
0.0 44.0748 21200 0.5911 16166680
0.0 44.4906 21400 0.5475 16320408
0.0286 44.9064 21600 0.3158 16472888
0.0003 45.3222 21800 0.3074 16625808
0.0001 45.7380 22000 0.3723 16778288
0.0001 46.1538 22200 0.3680 16931560
0.0 46.5696 22400 0.3607 17083880
0.0055 46.9854 22600 0.3653 17235976
0.0 47.4012 22800 0.4029 17388152
0.0 47.8170 23000 0.4166 17540824
0.0 48.2328 23200 0.4366 17693912
0.0033 48.6486 23400 0.4518 17846296
0.0 49.0644 23600 0.4661 17998760
0.0 49.4802 23800 0.4758 18152072
0.0077 49.8960 24000 0.4830 18304072
0.0 50.3119 24200 0.5110 18455696
0.0 50.7277 24400 0.4613 18608976
0.0 51.1435 24600 0.4693 18760928
0.0 51.5593 24800 0.4784 18913856
0.0 51.9751 25000 0.4891 19066528
0.0039 52.3909 25200 0.4925 19218616
0.0 52.8067 25400 0.4992 19370872
0.0 53.2225 25600 0.5116 19524232
0.0 53.6383 25800 0.5076 19676456
0.0039 54.0541 26000 0.5196 19828504
0.0035 54.4699 26200 0.5291 19980856
0.0082 54.8857 26400 0.5269 20133784
0.0034 55.3015 26600 0.5312 20286120
0.0 55.7173 26800 0.5348 20439016
0.0 56.1331 27000 0.5420 20591320
0.0 56.5489 27200 0.5479 20743736
0.0 56.9647 27400 0.5454 20896184
0.0 57.3805 27600 0.5553 21049160
0.0 57.7963 27800 0.5475 21201640
0.0 58.2121 28000 0.5556 21354208
0.0 58.6279 28200 0.5548 21506752
0.0 59.0437 28400 0.5586 21659696
0.0 59.4595 28600 0.5589 21811600
0.0 59.8753 28800 0.5692 21964272
0.0 60.2911 29000 0.5551 22116648
0.0 60.7069 29200 0.5598 22269032
0.0 61.1227 29400 0.5826 22421944
0.0 61.5385 29600 0.5761 22574936
0.0 61.9543 29800 0.5788 22727064
0.0 62.3701 30000 0.5871 22880256
0.0052 62.7859 30200 0.5822 23032800
0.0 63.2017 30400 0.5841 23184744
0.0 63.6175 30600 0.5874 23336904
0.0 64.0333 30800 0.5867 23489432
0.0 64.4491 31000 0.5869 23641496
0.0 64.8649 31200 0.5907 23794744
0.0 65.2807 31400 0.5865 23947688
0.0036 65.6965 31600 0.5954 24099432
0.0038 66.1123 31800 0.5980 24251200
0.0 66.5281 32000 0.6078 24404736
0.0042 66.9439 32200 0.6006 24557120
0.0 67.3597 32400 0.6087 24709616
0.0 67.7755 32600 0.6031 24862224
0.0 68.1913 32800 0.5994 25015296
0.0 68.6071 33000 0.6054 25167744
0.0 69.0229 33200 0.6113 25321016
0.0036 69.4387 33400 0.6101 25473368
0.0 69.8545 33600 0.6077 25626520
0.004 70.2703 33800 0.6100 25778248
0.0 70.6861 34000 0.6114 25930920
0.0 71.1019 34200 0.6114 26083456
0.0 71.5177 34400 0.6124 26235552
0.0 71.9335 34600 0.6145 26388832
0.0 72.3493 34800 0.6170 26541680
0.0 72.7651 35000 0.6138 26694832
0.0038 73.1809 35200 0.6169 26847168
0.0 73.5967 35400 0.6165 27000096
0.0 74.0125 35600 0.6209 27151800
0.0038 74.4283 35800 0.6200 27304152
0.0 74.8441 36000 0.6179 27456856
0.0 75.2599 36200 0.6214 27610376
0.0 75.6757 36400 0.6216 27762984
0.0 76.0915 36600 0.6185 27915504
0.0036 76.5073 36800 0.6197 28068432
0.0035 76.9231 37000 0.6194 28220720
0.0 77.3389 37200 0.6196 28373600
0.0 77.7547 37400 0.6249 28526304
0.0039 78.1705 37600 0.6229 28678672
0.0 78.5863 37800 0.6223 28831632
0.0 79.0021 38000 0.6232 28983144
0.0 79.4179 38200 0.6247 29136008
0.0 79.8337 38400 0.6213 29288104
0.0 80.2495 38600 0.6222 29440312
0.0035 80.6653 38800 0.6235 29592888
0.0 81.0811 39000 0.6225 29745320
0.0 81.4969 39200 0.6237 29898600
0.0 81.9127 39400 0.6233 30050504
0.0039 82.3285 39600 0.6214 30203576
0.0 82.7443 39800 0.6212 30356408
0.0 83.1601 40000 0.6216 30508240
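Note that validation loss bottoms out very early, at 0.1251 on step 800 (the value reported in the summary above), and trends upward for the remaining training, a typical overfitting pattern. A minimal sketch of selecting the best checkpoint from the first few rows of the table:

```python
# (step, validation_loss) pairs copied from the first rows of the
# table above; the full log continues to step 40000.
history = [
    (200, 0.1606),
    (400, 0.1401),
    (600, 0.1582),
    (800, 0.1251),
    (1000, 0.1440),
]

# Pick the checkpoint with the lowest validation loss.
best_step, best_loss = min(history, key=lambda pair: pair[1])
# best_step == 800, best_loss == 0.1251
```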

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
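The card includes no usage code. The sketch below shows one way to load this PEFT adapter onto the base model, assuming you have access to meta-llama/Meta-Llama-3-8B-Instruct. The prompt template is a hypothetical illustration; the card does not document how CoLA examples were formatted during training.

```python
def format_cola_prompt(sentence: str) -> str:
    # Hypothetical prompt template -- the actual training format
    # is not documented in this card.
    return (
        "Is the following sentence grammatically acceptable? "
        "Answer 'acceptable' or 'unacceptable'.\n"
        f"Sentence: {sentence}"
    )


def load_model(adapter_id: str = "rbelanec/train_cola_1744902673"):
    # Heavy imports are kept inside the function so the prompt helper
    # above is usable without transformers/peft installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    # Attach the fine-tuned adapter weights on top of the base model.
    model = PeftModel.from_pretrained(base, adapter_id)
    return model, tokenizer
```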