train_mrpc_1744902652

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 on the MRPC dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.1130
  • Num Input Tokens Seen: 69324064
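
This checkpoint is a PEFT adapter rather than a full model, so it is loaded on top of the base model. A minimal loading sketch, assuming the adapter repo id rbelanec/train_mrpc_1744902652 and the standard peft/transformers APIs:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-Instruct-v0.3"

# Load the frozen base model, then attach the fine-tuned adapter weights.
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, "rbelanec/train_mrpc_1744902652")
tokenizer = AutoTokenizer.from_pretrained(base_id)
```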

Model description

This checkpoint is a parameter-efficient (PEFT) adapter for mistralai/Mistral-7B-Instruct-v0.3, fine-tuned on the MRPC paraphrase task. Beyond what is recorded in this card, no further details have been provided.

Intended uses & limitations

The adapter is intended for paraphrase identification on MRPC-style sentence pairs, i.e. judging whether two sentences are semantically equivalent. The prompt format used during training is not documented here, so outputs should be validated before downstream use; an illustrative inference sketch follows.
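
As an illustration only, here is a hypothetical inference call reusing model and tokenizer from the loading sketch above; the prompt template is an assumption, since the format used in training is not documented:

```python
# Hypothetical MRPC-style prompt -- the actual training template is unknown.
s1 = "The company said quarterly sales rose 5 percent."
s2 = "Quarterly sales increased by five percent, the company said."
prompt = (
    "Do the following two sentences mean the same thing? Answer yes or no.\n"
    f"Sentence 1: {s1}\nSentence 2: {s2}\nAnswer:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```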

Training and evaluation data

The model was trained on MRPC, the Microsoft Research Paraphrase Corpus from the GLUE benchmark: sentence pairs drawn from online news sources, labeled for whether they are paraphrases. Which split produced the evaluation results above is not specified in this card.
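
For reference, this is how MRPC is usually obtained with the datasets library; the card does not state which loader or preprocessing was actually used:

```python
from datasets import load_dataset

# MRPC as distributed with GLUE: train/validation/test splits with
# "sentence1", "sentence2", and a binary "label" column (1 = paraphrase).
mrpc = load_dataset("glue", "mrpc")
print(mrpc["train"][0])
```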

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
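
A minimal sketch of how these settings map onto transformers.TrainingArguments; the field names are standard, but anything not listed above (output directory, logging, evaluation cadence) is an assumption:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_mrpc_1744902652",  # assumed; the actual path is not stated
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    gradient_accumulation_steps=4,       # 4 x 4 = total train batch size 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```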

Training results

The best validation loss, 0.1130, is reached at step 400 (roughly epoch 2) and matches the evaluation loss reported at the top of this card, which suggests the best checkpoint was the one kept. After that point the training loss collapses to zero while the validation loss climbs steadily over the remaining ~190 epochs, a clear sign of overfitting (see the early-stopping sketch after the table).

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.147 0.9685 200 0.1292 346816
0.1336 1.9395 400 0.1130 694112
0.023 2.9104 600 0.1525 1040448
0.0171 3.8814 800 0.2220 1386944
0.001 4.8523 1000 0.2616 1733568
0.0004 5.8232 1200 0.3320 2080576
0.0 6.7942 1400 0.3721 2428000
0.0 7.7651 1600 0.4142 2772832
0.0009 8.7361 1800 0.2469 3119936
0.0015 9.7070 2000 0.4015 3464864
0.0005 10.6780 2200 0.4099 3812608
0.0007 11.6489 2400 0.2668 4157312
0.003 12.6199 2600 0.2576 4504256
0.0 13.5908 2800 0.3855 4850880
0.0 14.5617 3000 0.7733 5197664
0.0 15.5327 3200 0.6409 5543392
0.0194 16.5036 3400 0.6548 5889024
0.005 17.4746 3600 0.3614 6234688
0.0022 18.4455 3800 0.5016 6580608
0.0005 19.4165 4000 0.2770 6926432
0.0 20.3874 4200 0.3919 7272896
0.0548 21.3584 4400 0.3511 7618208
0.0001 22.3293 4600 0.4028 7965376
0.0003 23.3002 4800 0.5596 8312352
0.0 24.2712 5000 0.5184 8657568
0.0 25.2421 5200 0.3675 9004576
0.0 26.2131 5400 0.4435 9351552
0.0 27.1840 5600 0.4736 9699840
0.0 28.1550 5800 0.4922 10045120
0.0 29.1259 6000 0.5077 10392096
0.0 30.0969 6200 0.5229 10738624
0.0 31.0678 6400 0.5319 11084512
0.0 32.0387 6600 0.5458 11432096
0.0 33.0097 6800 0.5537 11779520
0.0 33.9782 7000 0.5648 12126016
0.0 34.9492 7200 0.5763 12472160
0.0 35.9201 7400 0.5822 12819680
0.0 36.8910 7600 0.5898 13166368
0.0 37.8620 7800 0.5963 13513280
0.0 38.8329 8000 0.6044 13860256
0.0 39.8039 8200 0.6122 14205856
0.0 40.7748 8400 0.6186 14553152
0.0 41.7458 8600 0.6252 14898752
0.0 42.7167 8800 0.6310 15245344
0.0 43.6877 9000 0.6384 15590560
0.0 44.6586 9200 0.6431 15939776
0.0 45.6295 9400 0.6525 16286016
0.0 46.6005 9600 0.6553 16633088
0.0 47.5714 9800 0.6608 16978656
0.0 48.5424 10000 0.6695 17325024
0.0 49.5133 10200 0.6716 17673440
0.0 50.4843 10400 0.6789 18018272
0.0 51.4552 10600 0.6833 18364992
0.0 52.4262 10800 0.6849 18710720
0.0 53.3971 11000 0.6899 19057408
0.0 54.3680 11200 0.6978 19403360
0.0 55.3390 11400 0.6980 19749408
0.0 56.3099 11600 0.7006 20096416
0.0 57.2809 11800 0.7079 20442944
0.0 58.2518 12000 0.7087 20789120
0.0 59.2228 12200 0.7110 21136768
0.0 60.1937 12400 0.7155 21482944
0.0 61.1646 12600 0.7200 21830400
0.0 62.1356 12800 0.7204 22177696
0.0 63.1065 13000 0.7267 22523776
0.0 64.0775 13200 0.7275 22871744
0.0 65.0484 13400 0.7283 23218432
0.0 66.0194 13600 0.7335 23565280
0.0 66.9879 13800 0.7363 23911616
0.0 67.9588 14000 0.7362 24257984
0.0 68.9298 14200 0.7402 24604960
0.0 69.9007 14400 0.7456 24951648
0.0 70.8717 14600 0.7443 25297664
0.0 71.8426 14800 0.7493 25644032
0.0 72.8136 15000 0.7539 25989408
0.0 73.7845 15200 0.7591 26337760
0.0 74.7554 15400 0.7562 26684800
0.0 75.7264 15600 0.7611 27029856
0.0 76.6973 15800 0.7716 27376160
0.0 77.6683 16000 0.7729 27723904
0.0 78.6392 16200 0.7742 28071104
0.0 79.6102 16400 0.7794 28417344
0.0 80.5811 16600 0.7841 28766240
0.0 81.5521 16800 0.7835 29111104
0.0 82.5230 17000 0.7931 29456800
0.0 83.4939 17200 0.7960 29804640
0.0 84.4649 17400 0.7973 30151168
0.0 85.4358 17600 0.8043 30497536
0.0 86.4068 17800 0.8087 30845536
0.0 87.3777 18000 0.8069 31191456
0.0 88.3487 18200 0.8165 31539136
0.0 89.3196 18400 0.8215 31884000
0.0 90.2906 18600 0.8300 32231584
0.0 91.2615 18800 0.8308 32577088
0.0 92.2324 19000 0.8383 32924768
0.0 93.2034 19200 0.8400 33271392
0.0 94.1743 19400 0.8466 33619232
0.0 95.1453 19600 0.8502 33965280
0.0 96.1162 19800 0.8509 34311712
0.0 97.0872 20000 0.8607 34658112
0.0 98.0581 20200 0.8662 35004384
0.0 99.0291 20400 0.8643 35351392
0.0 99.9976 20600 0.8772 35698272
0.0 100.9685 20800 0.8788 36045088
0.0 101.9395 21000 0.8976 36391968
0.0 102.9104 21200 0.8931 36739040
0.0 103.8814 21400 0.9039 37084768
0.0 104.8523 21600 0.9100 37431808
0.0 105.8232 21800 0.9196 37779232
0.0 106.7942 22000 0.9311 38126112
0.0 107.7651 22200 0.9307 38472672
0.0 108.7361 22400 0.9340 38818464
0.0 109.7070 22600 0.9366 39165472
0.0 110.6780 22800 0.9403 39511328
0.0 111.6489 23000 0.9447 39858048
0.0 112.6199 23200 0.9419 40205184
0.0 113.5908 23400 0.9440 40552448
0.0 114.5617 23600 0.9345 40899872
0.0 115.5327 23800 0.9426 41246848
0.0 116.5036 24000 0.9394 41593088
0.0 117.4746 24200 0.9344 41938464
0.0 118.4455 24400 0.9359 42284064
0.0 119.4165 24600 0.9378 42631296
0.0 120.3874 24800 0.9327 42976992
0.0 121.3584 25000 0.9328 43321920
0.0 122.3293 25200 0.9345 43669344
0.0 123.3002 25400 0.9321 44016096
0.0 124.2712 25600 0.9303 44363232
0.0 125.2421 25800 0.9264 44706400
0.0 126.2131 26000 0.9306 45054080
0.0 127.1840 26200 0.9335 45400864
0.0 128.1550 26400 0.9403 45746688
0.0 129.1259 26600 0.9343 46093216
0.0 130.0969 26800 0.9383 46440960
0.0 131.0678 27000 0.9348 46785984
0.0 132.0387 27200 0.9495 47133856
0.0 133.0097 27400 0.9418 47481088
0.0 133.9782 27600 0.9431 47827904
0.0 134.9492 27800 0.9420 48175392
0.0 135.9201 28000 0.9501 48521536
0.0 136.8910 28200 0.9526 48867904
0.0 137.8620 28400 0.9502 49212704
0.0 138.8329 28600 0.9569 49561312
0.0 139.8039 28800 0.9563 49907264
0.0 140.7748 29000 0.9547 50254720
0.0 141.7458 29200 0.9533 50600480
0.0 142.7167 29400 0.9591 50947456
0.0 143.6877 29600 0.9623 51295040
0.0 144.6586 29800 0.9539 51641376
0.0 145.6295 30000 0.9622 51988288
0.0 146.6005 30200 0.9622 52334112
0.0 147.5714 30400 0.9644 52683008
0.0 148.5424 30600 0.9606 53028128
0.0 149.5133 30800 0.9602 53374400
0.0 150.4843 31000 0.9643 53720704
0.0 151.4552 31200 0.9652 54067392
0.0 152.4262 31400 0.9631 54414880
0.0 153.3971 31600 0.9696 54760672
0.0 154.3680 31800 0.9698 55106400
0.0 155.3390 32000 0.9695 55452512
0.0 156.3099 32200 0.9665 55798400
0.0 157.2809 32400 0.9734 56146592
0.0 158.2518 32600 0.9712 56493696
0.0 159.2228 32800 0.9723 56840064
0.0 160.1937 33000 0.9744 57186368
0.0 161.1646 33200 0.9686 57532416
0.0 162.1356 33400 0.9741 57880832
0.0 163.1065 33600 0.9738 58227680
0.0 164.0775 33800 0.9729 58574880
0.0 165.0484 34000 0.9718 58922528
0.0 166.0194 34200 0.9755 59269760
0.0 166.9879 34400 0.9720 59615872
0.0 167.9588 34600 0.9762 59962368
0.0 168.9298 34800 0.9707 60308640
0.0 169.9007 35000 0.9721 60655616
0.0 170.8717 35200 0.9770 61003136
0.0 171.8426 35400 0.9723 61350016
0.0 172.8136 35600 0.9791 61696224
0.0 173.7845 35800 0.9742 62044256
0.0 174.7554 36000 0.9748 62389792
0.0 175.7264 36200 0.9782 62738496
0.0 176.6973 36400 0.9806 63084544
0.0 177.6683 36600 0.9803 63431712
0.0 178.6392 36800 0.9762 63778656
0.0 179.6102 37000 0.9760 64124736
0.0 180.5811 37200 0.9767 64471808
0.0 181.5521 37400 0.9715 64820352
0.0 182.5230 37600 0.9754 65167904
0.0 183.4939 37800 0.9805 65513280
0.0 184.4649 38000 0.9813 65859136
0.0 185.4358 38200 0.9772 66205888
0.0 186.4068 38400 0.9799 66552576
0.0 187.3777 38600 0.9781 66899904
0.0 188.3487 38800 0.9842 67245856
0.0 189.3196 39000 0.9788 67591648
0.0 190.2906 39200 0.9757 67937440
0.0 191.2615 39400 0.9805 68285088
0.0 192.2324 39600 0.9697 68631104
0.0 193.2034 39800 0.9799 68978016
0.0 194.1743 40000 0.9808 69324064
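
Given how early the validation loss bottoms out, a comparable run could stop long before 40,000 steps. A sketch with the stock EarlyStoppingCallback; there is no indication it was used in this run, and the patience value is illustrative:

```python
from transformers import EarlyStoppingCallback

# Stop when validation loss has not improved for 5 consecutive evaluations.
# Requires eval_strategy="steps", load_best_model_at_end=True, and
# metric_for_best_model="eval_loss" in the TrainingArguments sketch above.
stop_early = EarlyStoppingCallback(early_stopping_patience=5)
# Pass callbacks=[stop_early] when constructing the Trainer.
```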

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
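
A quick way to check a local environment against the versions above before loading the adapter (a minimal sketch; exact matches are usually not required, but large gaps can break adapter loading):

```python
import datasets, peft, tokenizers, torch, transformers

# Versions this adapter was trained with, per the card.
expected = {
    "peft": "0.15.1",
    "transformers": "4.51.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}
for mod in (peft, transformers, torch, datasets, tokenizers):
    print(f"{mod.__name__}: found {mod.__version__}, trained with {expected[mod.__name__]}")
```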