train_mrpc_1744902649

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the MRPC dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1272
  • Num Input Tokens Seen: 65784064
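
For convenience, here is a minimal inference sketch. It assumes the adapter is published as rbelanec/train_mrpc_1744902649, that you have access to the gated base model, and a hypothetical prompt template, since the card does not record how MRPC pairs were formatted during training:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the gated base model, then attach this PEFT adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = PeftModel.from_pretrained(base, "rbelanec/train_mrpc_1744902649")

# Hypothetical MRPC-style prompt; the actual training template is undocumented.
prompt = (
    "Do the following two sentences mean the same thing? Answer yes or no.\n"
    "Sentence 1: The company said profits rose 10% last quarter.\n"
    "Sentence 2: Profits were up 10% in the last quarter, the company said.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```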

Model description

This is a PEFT adapter, not a full set of model weights: it was trained on top of meta-llama/Meta-Llama-3-8B-Instruct for the MRPC paraphrase task, and the base model is required to use it.

Intended uses & limitations

The adapter is intended for MRPC-style paraphrase detection, i.e. deciding whether two sentences express the same meaning. Note that the validation loss reaches its minimum of 0.1272 at step 1600 and climbs steadily for the rest of the 40,000-step run, so checkpoints from later in training are heavily overfit to the training split.

Training and evaluation data

The model was fine-tuned and evaluated on MRPC (the Microsoft Research Paraphrase Corpus) from the GLUE benchmark, a sentence-pair dataset labeled for whether the two sentences are paraphrases.
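
For reference, MRPC can be loaded from the GLUE benchmark with the datasets library; this sketch assumes the standard GLUE configuration was used:

```python
from datasets import load_dataset

# MRPC: pairs of sentences labeled 1 (paraphrase) or 0 (not a paraphrase).
mrpc = load_dataset("glue", "mrpc")
print(mrpc)              # train / validation / test splits
print(mrpc["train"][0])  # {'sentence1': ..., 'sentence2': ..., 'label': ..., 'idx': ...}
```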

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
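
As an illustration only, the values above map onto transformers TrainingArguments roughly as follows; output_dir is a placeholder, and the LoRA/PEFT configuration used for this adapter is not recorded on the card:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_mrpc_1744902649",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    gradient_accumulation_steps=4,       # 4 per device x 4 accumulation = 16 total
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```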

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.1792 | 0.9685 | 200 | 0.1595 | 329312 |
| 0.1616 | 1.9395 | 400 | 0.1593 | 658560 |
| 0.1105 | 2.9104 | 600 | 0.1366 | 987040 |
| 0.093 | 3.8814 | 800 | 0.1343 | 1316448 |
| 0.1363 | 4.8523 | 1000 | 0.1369 | 1644608 |
| 0.1061 | 5.8232 | 1200 | 0.1310 | 1974016 |
| 0.1695 | 6.7942 | 1400 | 0.1355 | 2303584 |
| 0.0548 | 7.7651 | 1600 | 0.1272 | 2630688 |
| 0.1025 | 8.7361 | 1800 | 0.1282 | 2959808 |
| 0.1103 | 9.7070 | 2000 | 0.1275 | 3287584 |
| 0.073 | 10.6780 | 2200 | 0.1276 | 3617920 |
| 0.0808 | 11.6489 | 2400 | 0.1375 | 3945536 |
| 0.0832 | 12.6199 | 2600 | 0.1372 | 4274560 |
| 0.043 | 13.5908 | 2800 | 0.1441 | 4603168 |
| 0.1192 | 14.5617 | 3000 | 0.1549 | 4932448 |
| 0.0522 | 15.5327 | 3200 | 0.1576 | 5261312 |
| 0.0495 | 16.5036 | 3400 | 0.1620 | 5589632 |
| 0.0463 | 17.4746 | 3600 | 0.1867 | 5918112 |
| 0.0316 | 18.4455 | 3800 | 0.2041 | 6246368 |
| 0.0302 | 19.4165 | 4000 | 0.2137 | 6574848 |
| 0.0489 | 20.3874 | 4200 | 0.2474 | 6903520 |
| 0.0428 | 21.3584 | 4400 | 0.2885 | 7231904 |
| 0.0092 | 22.3293 | 4600 | 0.2955 | 7561504 |
| 0.0247 | 23.3002 | 4800 | 0.3410 | 7890912 |
| 0.0108 | 24.2712 | 5000 | 0.3752 | 8218592 |
| 0.003 | 25.2421 | 5200 | 0.3848 | 8548256 |
| 0.0001 | 26.2131 | 5400 | 0.4551 | 8876704 |
| 0.0038 | 27.1840 | 5600 | 0.5005 | 9206272 |
| 0.0011 | 28.1550 | 5800 | 0.5081 | 9534720 |
| 0.0186 | 29.1259 | 6000 | 0.5632 | 9864384 |
| 0.0001 | 30.0969 | 6200 | 0.6207 | 10193376 |
| 0.0001 | 31.0678 | 6400 | 0.6485 | 10521952 |
| 0.0001 | 32.0387 | 6600 | 0.6780 | 10851520 |
| 0.0 | 33.0097 | 6800 | 0.7211 | 11180544 |
| 0.0001 | 33.9782 | 7000 | 0.7425 | 11509344 |
| 0.0 | 34.9492 | 7200 | 0.7592 | 11838208 |
| 0.0022 | 35.9201 | 7400 | 0.7934 | 12167872 |
| 0.0 | 36.8910 | 7600 | 0.7808 | 12496352 |
| 0.0 | 37.8620 | 7800 | 0.8029 | 12826048 |
| 0.0 | 38.8329 | 8000 | 0.8225 | 13155040 |
| 0.0003 | 39.8039 | 8200 | 0.8472 | 13483008 |
| 0.0 | 40.7748 | 8400 | 0.8385 | 13812064 |
| 0.0 | 41.7458 | 8600 | 0.9110 | 14140576 |
| 0.0001 | 42.7167 | 8800 | 0.8721 | 14469248 |
| 0.0006 | 43.6877 | 9000 | 0.8889 | 14796672 |
| 0.0002 | 44.6586 | 9200 | 0.8948 | 15126752 |
| 0.0042 | 45.6295 | 9400 | 0.9480 | 15456160 |
| 0.0006 | 46.6005 | 9600 | 0.8594 | 15784928 |
| 0.0 | 47.5714 | 9800 | 0.9047 | 16113248 |
| 0.0 | 48.5424 | 10000 | 0.8897 | 16442496 |
| 0.0 | 49.5133 | 10200 | 0.8949 | 16772640 |
| 0.0 | 50.4843 | 10400 | 0.8780 | 17100000 |
| 0.0 | 51.4552 | 10600 | 0.8463 | 17428768 |
| 0.0 | 52.4262 | 10800 | 0.8947 | 17757344 |
| 0.0 | 53.3971 | 11000 | 0.8721 | 18085920 |
| 0.0 | 54.3680 | 11200 | 0.8789 | 18414336 |
| 0.0 | 55.3390 | 11400 | 0.8813 | 18743040 |
| 0.0 | 56.3099 | 11600 | 0.8915 | 19072928 |
| 0.0 | 57.2809 | 11800 | 0.8866 | 19401376 |
| 0.0 | 58.2518 | 12000 | 0.8915 | 19730336 |
| 0.0 | 59.2228 | 12200 | 0.8939 | 20059488 |
| 0.0 | 60.1937 | 12400 | 0.8958 | 20388064 |
| 0.0 | 61.1646 | 12600 | 0.8991 | 20718144 |
| 0.0 | 62.1356 | 12800 | 0.9055 | 21048224 |
| 0.0 | 63.1065 | 13000 | 0.9546 | 21376576 |
| 0.0029 | 64.0775 | 13200 | 0.9045 | 21706080 |
| 0.0 | 65.0484 | 13400 | 0.9358 | 22034624 |
| 0.0 | 66.0194 | 13600 | 0.8919 | 22364128 |
| 0.0 | 66.9879 | 13800 | 0.8877 | 22692352 |
| 0.0101 | 67.9588 | 14000 | 0.8636 | 23020864 |
| 0.0 | 68.9298 | 14200 | 0.9585 | 23349920 |
| 0.0 | 69.9007 | 14400 | 0.8971 | 23679072 |
| 0.0 | 70.8717 | 14600 | 0.8881 | 24007776 |
| 0.0 | 71.8426 | 14800 | 0.9130 | 24336640 |
| 0.0 | 72.8136 | 15000 | 0.9017 | 24664576 |
| 0.0 | 73.7845 | 15200 | 0.9239 | 24994848 |
| 0.0 | 74.7554 | 15400 | 0.9034 | 25322720 |
| 0.0 | 75.7264 | 15600 | 0.9104 | 25650784 |
| 0.0013 | 76.6973 | 15800 | 0.9375 | 25980512 |
| 0.0007 | 77.6683 | 16000 | 0.9748 | 26309536 |
| 0.0 | 78.6392 | 16200 | 0.9272 | 26638944 |
| 0.0 | 79.6102 | 16400 | 0.9310 | 26967360 |
| 0.0 | 80.5811 | 16600 | 0.9371 | 27297120 |
| 0.0 | 81.5521 | 16800 | 0.9354 | 27626144 |
| 0.0 | 82.5230 | 17000 | 0.9427 | 27954656 |
| 0.0 | 83.4939 | 17200 | 0.9468 | 28284160 |
| 0.0 | 84.4649 | 17400 | 0.9542 | 28612224 |
| 0.0 | 85.4358 | 17600 | 0.9413 | 28940448 |
| 0.0 | 86.4068 | 17800 | 0.9482 | 29270912 |
| 0.0 | 87.3777 | 18000 | 0.9457 | 29599424 |
| 0.0 | 88.3487 | 18200 | 0.9610 | 29929280 |
| 0.0 | 89.3196 | 18400 | 0.9695 | 30257504 |
| 0.0 | 90.2906 | 18600 | 0.9427 | 30586944 |
| 0.0 | 91.2615 | 18800 | 0.9648 | 30915744 |
| 0.0 | 92.2324 | 19000 | 0.9566 | 31245216 |
| 0.0 | 93.2034 | 19200 | 1.0410 | 31573600 |
| 0.0 | 94.1743 | 19400 | 1.0290 | 31903616 |
| 0.0 | 95.1453 | 19600 | 1.0224 | 32232032 |
| 0.0 | 96.1162 | 19800 | 1.0002 | 32560480 |
| 0.0 | 97.0872 | 20000 | 1.0333 | 32889696 |
| 0.0 | 98.0581 | 20200 | 0.9999 | 33218016 |
| 0.0 | 99.0291 | 20400 | 1.0188 | 33547296 |
| 0.0 | 99.9976 | 20600 | 1.0259 | 33876000 |
| 0.0 | 100.9685 | 20800 | 1.0148 | 34205376 |
| 0.0 | 101.9395 | 21000 | 1.0062 | 34534496 |
| 0.0 | 102.9104 | 21200 | 0.9976 | 34864000 |
| 0.0 | 103.8814 | 21400 | 1.0242 | 35192256 |
| 0.0 | 104.8523 | 21600 | 1.0044 | 35521376 |
| 0.0 | 105.8232 | 21800 | 1.0179 | 35851264 |
| 0.0 | 106.7942 | 22000 | 1.0085 | 36180000 |
| 0.0 | 107.7651 | 22200 | 1.0040 | 36508832 |
| 0.0 | 108.7361 | 22400 | 1.0053 | 36837600 |
| 0.0 | 109.7070 | 22600 | 0.9748 | 37166720 |
| 0.0 | 110.6780 | 22800 | 1.0201 | 37495520 |
| 0.0 | 111.6489 | 23000 | 1.0137 | 37824352 |
| 0.0 | 112.6199 | 23200 | 1.0274 | 38153856 |
| 0.0 | 113.5908 | 23400 | 1.0198 | 38483200 |
| 0.0 | 114.5617 | 23600 | 1.0236 | 38812672 |
| 0.0 | 115.5327 | 23800 | 1.0075 | 39142400 |
| 0.0 | 116.5036 | 24000 | 1.0092 | 39471200 |
| 0.0 | 117.4746 | 24200 | 1.0208 | 39798848 |
| 0.0 | 118.4455 | 24400 | 1.0163 | 40127360 |
| 0.0 | 119.4165 | 24600 | 1.0297 | 40456736 |
| 0.0 | 120.3874 | 24800 | 1.0208 | 40785312 |
| 0.0 | 121.3584 | 25000 | 1.0032 | 41112576 |
| 0.0 | 122.3293 | 25200 | 1.0071 | 41442112 |
| 0.0 | 123.3002 | 25400 | 1.0182 | 41771552 |
| 0.0 | 124.2712 | 25600 | 1.0241 | 42101248 |
| 0.0 | 125.2421 | 25800 | 0.9986 | 42427392 |
| 0.0 | 126.2131 | 26000 | 1.0178 | 42756704 |
| 0.0 | 127.1840 | 26200 | 1.0377 | 43085664 |
| 0.0 | 128.1550 | 26400 | 1.0162 | 43414240 |
| 0.0 | 129.1259 | 26600 | 1.0307 | 43743072 |
| 0.0 | 130.0969 | 26800 | 1.0224 | 44072768 |
| 0.0 | 131.0678 | 27000 | 1.0235 | 44400192 |
| 0.0 | 132.0387 | 27200 | 1.0353 | 44729632 |
| 0.0 | 133.0097 | 27400 | 1.0296 | 45058976 |
| 0.0 | 133.9782 | 27600 | 1.0324 | 45388352 |
| 0.0 | 134.9492 | 27800 | 1.0443 | 45717952 |
| 0.0 | 135.9201 | 28000 | 1.0478 | 46046144 |
| 0.0 | 136.8910 | 28200 | 1.0435 | 46375168 |
| 0.0 | 137.8620 | 28400 | 1.0442 | 46702816 |
| 0.0 | 138.8329 | 28600 | 1.0448 | 47033152 |
| 0.0 | 139.8039 | 28800 | 1.0729 | 47361472 |
| 0.0 | 140.7748 | 29000 | 1.0439 | 47691424 |
| 0.0 | 141.7458 | 29200 | 1.0689 | 48019712 |
| 0.0 | 142.7167 | 29400 | 1.0791 | 48348832 |
| 0.0 | 143.6877 | 29600 | 1.0849 | 48678560 |
| 0.0 | 144.6586 | 29800 | 1.0461 | 49008256 |
| 0.0 | 145.6295 | 30000 | 1.0701 | 49337088 |
| 0.0 | 146.6005 | 30200 | 1.0699 | 49665344 |
| 0.0 | 147.5714 | 30400 | 1.0625 | 49996128 |
| 0.0 | 148.5424 | 30600 | 1.0711 | 50324736 |
| 0.0 | 149.5133 | 30800 | 1.0653 | 50652864 |
| 0.0 | 150.4843 | 31000 | 1.0867 | 50981920 |
| 0.0 | 151.4552 | 31200 | 1.0732 | 51310752 |
| 0.0 | 152.4262 | 31400 | 1.0587 | 51640352 |
| 0.0 | 153.3971 | 31600 | 1.0614 | 51969184 |
| 0.0 | 154.3680 | 31800 | 1.0761 | 52297280 |
| 0.0 | 155.3390 | 32000 | 1.0690 | 52625600 |
| 0.0 | 156.3099 | 32200 | 1.0777 | 52953920 |
| 0.0 | 157.2809 | 32400 | 1.0818 | 53283648 |
| 0.0 | 158.2518 | 32600 | 1.0866 | 53613056 |
| 0.0 | 159.2228 | 32800 | 1.0812 | 53941632 |
| 0.0 | 160.1937 | 33000 | 1.0887 | 54270272 |
| 0.0 | 161.1646 | 33200 | 1.0782 | 54599104 |
| 0.0 | 162.1356 | 33400 | 1.0808 | 54929056 |
| 0.0 | 163.1065 | 33600 | 1.0965 | 55257728 |
| 0.0 | 164.0775 | 33800 | 1.0854 | 55587456 |
| 0.0 | 165.0484 | 34000 | 1.0979 | 55916576 |
| 0.0 | 166.0194 | 34200 | 1.0962 | 56245664 |
| 0.0 | 166.9879 | 34400 | 1.1092 | 56574272 |
| 0.0 | 167.9588 | 34600 | 1.1052 | 56903360 |
| 0.0 | 168.9298 | 34800 | 1.1229 | 57232032 |
| 0.0 | 169.9007 | 35000 | 1.0853 | 57561504 |
| 0.0 | 170.8717 | 35200 | 1.1070 | 57891168 |
| 0.0 | 171.8426 | 35400 | 1.1014 | 58220352 |
| 0.0 | 172.8136 | 35600 | 1.1065 | 58548960 |
| 0.0 | 173.7845 | 35800 | 1.0964 | 58878688 |
| 0.0 | 174.7554 | 36000 | 1.0980 | 59207104 |
| 0.0 | 175.7264 | 36200 | 1.1023 | 59536800 |
| 0.0 | 176.6973 | 36400 | 1.0831 | 59865312 |
| 0.0 | 177.6683 | 36600 | 1.0948 | 60194816 |
| 0.0 | 178.6392 | 36800 | 1.1205 | 60523584 |
| 0.0 | 179.6102 | 37000 | 1.0883 | 60852352 |
| 0.0 | 180.5811 | 37200 | 1.0916 | 61181024 |
| 0.0 | 181.5521 | 37400 | 1.1090 | 61510624 |
| 0.0 | 182.5230 | 37600 | 1.1083 | 61840672 |
| 0.0 | 183.4939 | 37800 | 1.1169 | 62167808 |
| 0.0 | 184.4649 | 38000 | 1.1141 | 62496960 |
| 0.0 | 185.4358 | 38200 | 1.0932 | 62826016 |
| 0.0 | 186.4068 | 38400 | 1.1050 | 63154784 |
| 0.0 | 187.3777 | 38600 | 1.0873 | 63483904 |
| 0.0 | 188.3487 | 38800 | 1.1244 | 63811808 |
| 0.0 | 189.3196 | 39000 | 1.1015 | 64139488 |
| 0.0 | 190.2906 | 39200 | 1.0930 | 64467808 |
| 0.0 | 191.2615 | 39400 | 1.0899 | 64798112 |
| 0.0 | 192.2324 | 39600 | 1.0952 | 65126304 |
| 0.0 | 193.2034 | 39800 | 1.1142 | 65455776 |
| 0.0 | 194.1743 | 40000 | 1.1018 | 65784064 |

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1