train_mrpc_1744902648

This model is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the MRPC (Microsoft Research Paraphrase Corpus) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0891
  • Num Input Tokens Seen: 65784064

Model description

More information needed

Intended uses & limitations

More information needed
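
Pending details from the author, here is a minimal loading-and-inference sketch. The prompt template is an assumption; the card does not document how MRPC sentence pairs were serialized during training.

```python
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/train_mrpc_1744902648"

# Loads the base Meta-Llama-3-8B-Instruct weights (gated; requires accepting
# the Llama 3 license on the Hub) and applies this adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id, torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Assumed prompt format -- not documented in this card.
prompt = (
    "Do the following two sentences mean the same thing? Answer yes or no.\n"
    "Sentence 1: The company said the loss was expected.\n"
    "Sentence 2: The firm stated that the loss had been anticipated.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```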

Training and evaluation data

More information needed
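
The card does not describe the data preparation, but MRPC itself ships with GLUE. A minimal sketch of loading it with the datasets library follows; whether the author used this exact loader is an assumption.

```python
from datasets import load_dataset

# MRPC as distributed with GLUE: 3,668 train / 408 validation / 1,725 test
# sentence pairs, each labeled 1 (paraphrase) or 0 (not a paraphrase).
mrpc = load_dataset("glue", "mrpc")
print(mrpc["train"][0])
# {'sentence1': '...', 'sentence2': '...', 'label': 1, 'idx': 0}
```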

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments appears after this list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
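
For reference, a hedged reconstruction of the TrainingArguments these values imply. The output_dir is assumed from the model name, and the eval cadence is read off the results table; anything else not listed above is a guess.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_mrpc_1744902648",  # assumed from the model name
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    gradient_accumulation_steps=4,       # 4 x 4 = total train batch size 16
    lr_scheduler_type="cosine",
    max_steps=40000,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",               # the results table shows an eval every 200 steps
    eval_steps=200,
)
```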

Training results

Validation loss bottoms out at 0.0891 at step 400 (epoch ~1.9) and then climbs to a plateau around 0.76 while the training loss collapses to zero, so the adapter overfits MRPC long before the 40,000-step budget is exhausted. The headline evaluation loss above matches that step-400 minimum, which suggests the best checkpoint, rather than the final one (0.7646), was retained.

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.1106 0.9685 200 0.1064 329312
0.0966 1.9395 400 0.0891 658560
0.0443 2.9104 600 0.1089 987040
0.0131 3.8814 800 0.1504 1316448
0.0289 4.8523 1000 0.1821 1644608
0.0025 5.8232 1200 0.2576 1974016
0.008 6.7942 1400 0.3106 2303584
0.0004 7.7651 1600 0.3199 2630688
0.0002 8.7361 1800 0.2740 2959808
0.0002 9.7070 2000 0.3280 3287584
0.0 10.6780 2200 0.4333 3617920
0.0002 11.6489 2400 0.3073 3945536
0.0257 12.6199 2600 0.3878 4274560
0.0 13.5908 2800 0.4039 4603168
0.0001 14.5617 3000 0.2973 4932448
0.0002 15.5327 3200 0.3028 5261312
0.0 16.5036 3400 0.3795 5589632
0.0038 17.4746 3600 0.3876 5918112
0.0 18.4455 3800 0.4295 6246368
0.0001 19.4165 4000 0.2873 6574848
0.0006 20.3874 4200 0.2986 6903520
0.0264 21.3584 4400 0.3412 7231904
0.0 22.3293 4600 0.3748 7561504
0.0 23.3002 4800 0.3126 7890912
0.0 24.2712 5000 0.4661 8218592
0.0 25.2421 5200 0.4051 8548256
0.0 26.2131 5400 0.3985 8876704
0.0 27.1840 5600 0.4093 9206272
0.0 28.1550 5800 0.5023 9534720
0.0 29.1259 6000 0.6251 9864384
0.0 30.0969 6200 0.5468 10193376
0.0 31.0678 6400 0.5096 10521952
0.0 32.0387 6600 0.4967 10851520
0.0259 33.0097 6800 0.3423 11180544
0.0002 33.9782 7000 0.4065 11509344
0.0022 34.9492 7200 0.5055 11838208
0.001 35.9201 7400 0.5989 12167872
0.0 36.8910 7600 0.4378 12496352
0.0001 37.8620 7800 0.4385 12826048
0.0 38.8329 8000 0.4288 13155040
0.0001 39.8039 8200 0.3511 13483008
0.0691 40.7748 8400 0.3293 13812064
0.0 41.7458 8600 0.4061 14140576
0.0 42.7167 8800 0.7182 14469248
0.0 43.6877 9000 0.7030 14796672
0.0 44.6586 9200 0.7233 15126752
0.0 45.6295 9400 0.7333 15456160
0.0 46.6005 9600 0.7361 15784928
0.0 47.5714 9800 0.7361 16113248
0.0 48.5424 10000 0.7378 16442496
0.0 49.5133 10200 0.7399 16772640
0.0 50.4843 10400 0.7485 17100000
0.0 51.4552 10600 0.7461 17428768
0.0 52.4262 10800 0.7470 17757344
0.0 53.3971 11000 0.7524 18085920
0.0 54.3680 11200 0.7506 18414336
0.0 55.3390 11400 0.7482 18743040
0.0 56.3099 11600 0.7565 19072928
0.0 57.2809 11800 0.7587 19401376
0.0 58.2518 12000 0.7549 19730336
0.0 59.2228 12200 0.7610 20059488
0.0 60.1937 12400 0.7590 20388064
0.0 61.1646 12600 0.7599 20718144
0.0 62.1356 12800 0.7611 21048224
0.0 63.1065 13000 0.7668 21376576
0.0 64.0775 13200 0.7647 21706080
0.0 65.0484 13400 0.7649 22034624
0.0 66.0194 13600 0.7699 22364128
0.0 66.9879 13800 0.7647 22692352
0.0 67.9588 14000 0.7653 23020864
0.0 68.9298 14200 0.7652 23349920
0.0 69.9007 14400 0.7689 23679072
0.0 70.8717 14600 0.7608 24007776
0.0 71.8426 14800 0.7702 24336640
0.0 72.8136 15000 0.7735 24664576
0.0 73.7845 15200 0.7627 24994848
0.0 74.7554 15400 0.7694 25322720
0.0 75.7264 15600 0.7660 25650784
0.0 76.6973 15800 0.7703 25980512
0.0 77.6683 16000 0.7700 26309536
0.0 78.6392 16200 0.7677 26638944
0.0 79.6102 16400 0.7702 26967360
0.0 80.5811 16600 0.7680 27297120
0.0 81.5521 16800 0.7669 27626144
0.0 82.5230 17000 0.7676 27954656
0.0 83.4939 17200 0.7735 28284160
0.0 84.4649 17400 0.7649 28612224
0.0 85.4358 17600 0.7628 28940448
0.0 86.4068 17800 0.7705 29270912
0.0 87.3777 18000 0.7683 29599424
0.0 88.3487 18200 0.7718 29929280
0.0 89.3196 18400 0.7758 30257504
0.0 90.2906 18600 0.7703 30586944
0.0 91.2615 18800 0.7651 30915744
0.0 92.2324 19000 0.7678 31245216
0.0 93.2034 19200 0.7678 31573600
0.0 94.1743 19400 0.7686 31903616
0.0 95.1453 19600 0.7678 32232032
0.0 96.1162 19800 0.7687 32560480
0.0 97.0872 20000 0.7721 32889696
0.0 98.0581 20200 0.7724 33218016
0.0 99.0291 20400 0.7710 33547296
0.0 99.9976 20600 0.7661 33876000
0.0 100.9685 20800 0.7652 34205376
0.0 101.9395 21000 0.7652 34534496
0.0 102.9104 21200 0.7625 34864000
0.0 103.8814 21400 0.7603 35192256
0.0 104.8523 21600 0.7606 35521376
0.0 105.8232 21800 0.7601 35851264
0.0 106.7942 22000 0.7536 36180000
0.0 107.7651 22200 0.7570 36508832
0.0 108.7361 22400 0.7569 36837600
0.0 109.7070 22600 0.7569 37166720
0.0 110.6780 22800 0.7584 37495520
0.0 111.6489 23000 0.7586 37824352
0.0 112.6199 23200 0.7604 38153856
0.0 113.5908 23400 0.7584 38483200
0.0 114.5617 23600 0.7666 38812672
0.0 115.5327 23800 0.7607 39142400
0.0 116.5036 24000 0.7654 39471200
0.0 117.4746 24200 0.7606 39798848
0.0 118.4455 24400 0.7631 40127360
0.0 119.4165 24600 0.7665 40456736
0.0 120.3874 24800 0.7658 40785312
0.0 121.3584 25000 0.7621 41112576
0.0 122.3293 25200 0.7642 41442112
0.0 123.3002 25400 0.7615 41771552
0.0 124.2712 25600 0.7563 42101248
0.0 125.2421 25800 0.7604 42427392
0.0 126.2131 26000 0.7666 42756704
0.0 127.1840 26200 0.7632 43085664
0.0 128.1550 26400 0.7665 43414240
0.0 129.1259 26600 0.7591 43743072
0.0 130.0969 26800 0.7606 44072768
0.0 131.0678 27000 0.7612 44400192
0.0 132.0387 27200 0.7602 44729632
0.0 133.0097 27400 0.7649 45058976
0.0 133.9782 27600 0.7607 45388352
0.0 134.9492 27800 0.7621 45717952
0.0 135.9201 28000 0.7579 46046144
0.0 136.8910 28200 0.7559 46375168
0.0 137.8620 28400 0.7613 46702816
0.0 138.8329 28600 0.7613 47033152
0.0 139.8039 28800 0.7628 47361472
0.0 140.7748 29000 0.7639 47691424
0.0 141.7458 29200 0.7615 48019712
0.0 142.7167 29400 0.7634 48348832
0.0 143.6877 29600 0.7637 48678560
0.0 144.6586 29800 0.7647 49008256
0.0 145.6295 30000 0.7650 49337088
0.0 146.6005 30200 0.7596 49665344
0.0 147.5714 30400 0.7633 49996128
0.0 148.5424 30600 0.7592 50324736
0.0 149.5133 30800 0.7660 50652864
0.0 150.4843 31000 0.7634 50981920
0.0 151.4552 31200 0.7669 51310752
0.0 152.4262 31400 0.7651 51640352
0.0 153.3971 31600 0.7663 51969184
0.0 154.3680 31800 0.7648 52297280
0.0 155.3390 32000 0.7647 52625600
0.0 156.3099 32200 0.7656 52953920
0.0 157.2809 32400 0.7650 53283648
0.0 158.2518 32600 0.7676 53613056
0.0 159.2228 32800 0.7651 53941632
0.0 160.1937 33000 0.7681 54270272
0.0 161.1646 33200 0.7694 54599104
0.0 162.1356 33400 0.7648 54929056
0.0 163.1065 33600 0.7708 55257728
0.0 164.0775 33800 0.7679 55587456
0.0 165.0484 34000 0.7728 55916576
0.0 166.0194 34200 0.7715 56245664
0.0 166.9879 34400 0.7739 56574272
0.0 167.9588 34600 0.7644 56903360
0.0 168.9298 34800 0.7681 57232032
0.0 169.9007 35000 0.7642 57561504
0.0 170.8717 35200 0.7749 57891168
0.0 171.8426 35400 0.7676 58220352
0.0 172.8136 35600 0.7688 58548960
0.0 173.7845 35800 0.7699 58878688
0.0 174.7554 36000 0.7651 59207104
0.0 175.7264 36200 0.7687 59536800
0.0 176.6973 36400 0.7725 59865312
0.0 177.6683 36600 0.7651 60194816
0.0 178.6392 36800 0.7669 60523584
0.0 179.6102 37000 0.7706 60852352
0.0 180.5811 37200 0.7701 61181024
0.0 181.5521 37400 0.7670 61510624
0.0 182.5230 37600 0.7723 61840672
0.0 183.4939 37800 0.7689 62167808
0.0 184.4649 38000 0.7657 62496960
0.0 185.4358 38200 0.7721 62826016
0.0 186.4068 38400 0.7677 63154784
0.0 187.3777 38600 0.7733 63483904
0.0 188.3487 38800 0.7654 63811808
0.0 189.3196 39000 0.7660 64139488
0.0 190.2906 39200 0.7699 64467808
0.0 191.2615 39400 0.7635 64798112
0.0 192.2324 39600 0.7690 65126304
0.0 193.2034 39800 0.7654 65455776
0.0 194.1743 40000 0.7646 65784064

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
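
To reproduce this environment, pinning to the versions above should suffice (the CUDA build suffix is dropped here and depends on your platform):

```
pip install "peft==0.15.1" "transformers==4.51.3" "torch==2.6.0" "datasets==3.5.0" "tokenizers==0.21.1"
```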