train_cola_1744902677

This model is a PEFT adapter fine-tuned from mistralai/Mistral-7B-Instruct-v0.3 on the CoLA (Corpus of Linguistic Acceptability) dataset. It achieves the following results on the evaluation set (the best validation loss recorded during training, reached at step 1400; see the table below):

  • Loss: 0.1203
  • Num Input Tokens Seen: 28700680

Model description

More information needed

Intended uses & limitations

More information needed
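
Pending fuller documentation, the sketch below shows one way to load and query the model. It assumes this repository hosts a PEFT adapter for the base model named above; the prompt format is purely illustrative and is not taken from the actual training setup.

```python
# Minimal loading/inference sketch, assuming this repo is a PEFT adapter
# for mistralai/Mistral-7B-Instruct-v0.3. The prompt format is hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.3"
adapter_id = "rbelanec/train_cola_1744902677"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter
model.eval()

prompt = (
    "Is the following sentence grammatically acceptable? Answer yes or no.\n"
    "Sentence: The boys was here."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5)
# Decode only the newly generated tokens.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```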

Training and evaluation data

More information needed
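
In lieu of further details, a short loading sketch follows; it assumes "cola" refers to the CoLA subset of the GLUE benchmark on the Hugging Face Hub, which this card does not state explicitly.

```python
# Load CoLA (assumed source: the GLUE benchmark on the Hugging Face Hub).
from datasets import load_dataset

cola = load_dataset("nyu-mll/glue", "cola")  # splits: train / validation / test
print(cola["train"][0])  # e.g. {'sentence': ..., 'label': ..., 'idx': ...}
# Labels: 1 = grammatically acceptable, 0 = unacceptable.
```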

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a code sketch reconstructing them follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
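
The list above maps onto transformers.TrainingArguments roughly as follows. This is a hedged reconstruction: the output directory and the evaluation cadence are assumptions (the cadence is inferred from the 200-step intervals in the results table below).

```python
# Hedged reconstruction of the configuration listed above using
# transformers.TrainingArguments. Values not in the list are assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_cola_1744902677",  # assumed output path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    gradient_accumulation_steps=4,  # 4 x 4 = total train batch size 16
    max_steps=40_000,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    eval_strategy="steps",  # inferred: the results table logs every 200 steps
    eval_steps=200,
    include_num_input_tokens_seen=True,  # produces the "Input Tokens Seen" column
)
```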

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.1311 | 0.4158 | 200 | 0.1433 | 143936 |
| 0.1067 | 0.8316 | 400 | 0.1354 | 287392 |
| 0.139 | 1.2474 | 600 | 0.1735 | 430968 |
| 0.121 | 1.6632 | 800 | 0.1234 | 574456 |
| 0.0355 | 2.0790 | 1000 | 0.1671 | 718448 |
| 0.0616 | 2.4948 | 1200 | 0.2002 | 862224 |
| 0.0803 | 2.9106 | 1400 | 0.1203 | 1004880 |
| 0.0439 | 3.3264 | 1600 | 0.1210 | 1148296 |
| 0.0088 | 3.7422 | 1800 | 0.1888 | 1292616 |
| 0.0233 | 4.1580 | 2000 | 0.2052 | 1436240 |
| 0.0334 | 4.5738 | 2200 | 0.1831 | 1579408 |
| 0.0587 | 4.9896 | 2400 | 0.2397 | 1723056 |
| 0.0174 | 5.4054 | 2600 | 0.2203 | 1866504 |
| 0.0221 | 5.8212 | 2800 | 0.1858 | 2009832 |
| 0.0306 | 6.2370 | 3000 | 0.2578 | 2153504 |
| 0.031 | 6.6528 | 3200 | 0.2714 | 2296672 |
| 0.0003 | 7.0686 | 3400 | 0.2684 | 2440240 |
| 0.0014 | 7.4844 | 3600 | 0.2166 | 2583952 |
| 0.0006 | 7.9002 | 3800 | 0.2416 | 2727536 |
| 0.0003 | 8.3160 | 4000 | 0.3949 | 2870176 |
| 0.0153 | 8.7318 | 4200 | 0.3606 | 3013792 |
| 0.002 | 9.1476 | 4400 | 0.3122 | 3157976 |
| 0.0156 | 9.5634 | 4600 | 0.2466 | 3301400 |
| 0.0299 | 9.9792 | 4800 | 0.3468 | 3445528 |
| 0.0007 | 10.3950 | 5000 | 0.2433 | 3588176 |
| 0.0023 | 10.8108 | 5200 | 0.3589 | 3731888 |
| 0.0034 | 11.2266 | 5400 | 0.2928 | 3876072 |
| 0.0208 | 11.6424 | 5600 | 0.4505 | 4020200 |
| 0.0005 | 12.0582 | 5800 | 0.2404 | 4162880 |
| 0.0001 | 12.4740 | 6000 | 0.3337 | 4305664 |
| 0.0001 | 12.8898 | 6200 | 0.3213 | 4449504 |
| 0.0972 | 13.3056 | 6400 | 0.2284 | 4592824 |
| 0.003 | 13.7214 | 6600 | 0.3707 | 4737208 |
| 0.0002 | 14.1372 | 6800 | 0.3002 | 4880104 |
| 0.0005 | 14.5530 | 7000 | 0.2631 | 5024232 |
| 0.0003 | 14.9688 | 7200 | 0.3407 | 5167336 |
| 0.0 | 15.3846 | 7400 | 0.3447 | 5311512 |
| 0.0001 | 15.8004 | 7600 | 0.3414 | 5454712 |
| 0.0045 | 16.2162 | 7800 | 0.3096 | 5598576 |
| 0.0264 | 16.6320 | 8000 | 0.3564 | 5741776 |
| 0.0128 | 17.0478 | 8200 | 0.3482 | 5885896 |
| 0.0139 | 17.4636 | 8400 | 0.2676 | 6030472 |
| 0.0002 | 17.8794 | 8600 | 0.3593 | 6172872 |
| 0.0102 | 18.2952 | 8800 | 0.3862 | 6316224 |
| 0.0006 | 18.7110 | 9000 | 0.2775 | 6460064 |
| 0.0344 | 19.1268 | 9200 | 0.3549 | 6603384 |
| 0.0001 | 19.5426 | 9400 | 0.4113 | 6746616 |
| 0.0027 | 19.9584 | 9600 | 0.3334 | 6890808 |
| 0.008 | 20.3742 | 9800 | 0.3709 | 7033840 |
| 0.0092 | 20.7900 | 10000 | 0.2860 | 7177136 |
| 0.0005 | 21.2058 | 10200 | 0.3199 | 7320168 |
| 0.0 | 21.6216 | 10400 | 0.3846 | 7464136 |
| 0.0097 | 22.0374 | 10600 | 0.3253 | 7607816 |
| 0.0008 | 22.4532 | 10800 | 0.3285 | 7751560 |
| 0.0032 | 22.8690 | 11000 | 0.3343 | 7895400 |
| 0.0 | 23.2848 | 11200 | 0.3723 | 8038480 |
| 0.0 | 23.7006 | 11400 | 0.3910 | 8182416 |
| 0.0106 | 24.1164 | 11600 | 0.3851 | 8325888 |
| 0.0 | 24.5322 | 11800 | 0.4832 | 8468992 |
| 0.0 | 24.9480 | 12000 | 0.3957 | 8612096 |
| 0.0001 | 25.3638 | 12200 | 0.3543 | 8756152 |
| 0.0001 | 25.7796 | 12400 | 0.3097 | 8899640 |
| 0.0366 | 26.1954 | 12600 | 0.3154 | 9042656 |
| 0.0173 | 26.6112 | 12800 | 0.2774 | 9186656 |
| 0.0081 | 27.0270 | 13000 | 0.2980 | 9329688 |
| 0.0 | 27.4428 | 13200 | 0.3174 | 9472184 |
| 0.0011 | 27.8586 | 13400 | 0.2576 | 9616056 |
| 0.0031 | 28.2744 | 13600 | 0.2584 | 9759824 |
| 0.0002 | 28.6902 | 13800 | 0.3036 | 9903824 |
| 0.0 | 29.1060 | 14000 | 0.3940 | 10046680 |
| 0.0039 | 29.5218 | 14200 | 0.3441 | 10190040 |
| 0.0041 | 29.9376 | 14400 | 0.3373 | 10333816 |
| 0.0 | 30.3534 | 14600 | 0.3869 | 10476752 |
| 0.0 | 30.7692 | 14800 | 0.3500 | 10620240 |
| 0.0 | 31.1850 | 15000 | 0.3737 | 10763368 |
| 0.0 | 31.6008 | 15200 | 0.3848 | 10906568 |
| 0.0029 | 32.0166 | 15400 | 0.3766 | 11049768 |
| 0.0 | 32.4324 | 15600 | 0.4048 | 11193256 |
| 0.0 | 32.8482 | 15800 | 0.3986 | 11336648 |
| 0.0 | 33.2640 | 16000 | 0.4212 | 11481080 |
| 0.0 | 33.6798 | 16200 | 0.4406 | 11624376 |
| 0.0 | 34.0956 | 16400 | 0.4276 | 11766832 |
| 0.0 | 34.5114 | 16600 | 0.4458 | 11910672 |
| 0.0 | 34.9272 | 16800 | 0.4527 | 12054512 |
| 0.0 | 35.3430 | 17000 | 0.4412 | 12198464 |
| 0.0 | 35.7588 | 17200 | 0.4428 | 12341536 |
| 0.0058 | 36.1746 | 17400 | 0.4565 | 12485368 |
| 0.0 | 36.5904 | 17600 | 0.4381 | 12629496 |
| 0.0031 | 37.0062 | 17800 | 0.4719 | 12772208 |
| 0.0 | 37.4220 | 18000 | 0.4619 | 12915888 |
| 0.0 | 37.8378 | 18200 | 0.4839 | 13058896 |
| 0.0 | 38.2536 | 18400 | 0.4938 | 13201856 |
| 0.0028 | 38.6694 | 18600 | 0.4937 | 13344736 |
| 0.0 | 39.0852 | 18800 | 0.4802 | 13489016 |
| 0.0 | 39.5010 | 19000 | 0.4856 | 13632312 |
| 0.0 | 39.9168 | 19200 | 0.4766 | 13775960 |
| 0.0 | 40.3326 | 19400 | 0.4967 | 13918888 |
| 0.0 | 40.7484 | 19600 | 0.4896 | 14062184 |
| 0.0 | 41.1642 | 19800 | 0.4948 | 14206632 |
| 0.0043 | 41.5800 | 20000 | 0.4883 | 14349800 |
| 0.0001 | 41.9958 | 20200 | 0.2996 | 14493096 |
| 0.0007 | 42.4116 | 20400 | 0.4004 | 14636824 |
| 0.0028 | 42.8274 | 20600 | 0.3681 | 14780056 |
| 0.0 | 43.2432 | 20800 | 0.3692 | 14922952 |
| 0.0 | 43.6590 | 21000 | 0.4385 | 15066120 |
| 0.0001 | 44.0748 | 21200 | 0.3077 | 15209536 |
| 0.0 | 44.4906 | 21400 | 0.3709 | 15353920 |
| 0.0003 | 44.9064 | 21600 | 0.3059 | 15497376 |
| 0.0 | 45.3222 | 21800 | 0.4040 | 15641208 |
| 0.0 | 45.7380 | 22000 | 0.4052 | 15784536 |
| 0.0 | 46.1538 | 22200 | 0.4251 | 15928528 |
| 0.0 | 46.5696 | 22400 | 0.4395 | 16072048 |
| 0.0047 | 46.9854 | 22600 | 0.4411 | 16214832 |
| 0.0 | 47.4012 | 22800 | 0.4575 | 16358208 |
| 0.0 | 47.8170 | 23000 | 0.4556 | 16501568 |
| 0.0 | 48.2328 | 23200 | 0.4667 | 16645480 |
| 0.0029 | 48.6486 | 23400 | 0.4667 | 16789224 |
| 0.0 | 49.0644 | 23600 | 0.4729 | 16932768 |
| 0.0 | 49.4802 | 23800 | 0.4798 | 17076672 |
| 0.0065 | 49.8960 | 24000 | 0.4823 | 17220000 |
| 0.0 | 50.3119 | 24200 | 0.4938 | 17363816 |
| 0.0 | 50.7277 | 24400 | 0.4896 | 17508072 |
| 0.0 | 51.1435 | 24600 | 0.4884 | 17651488 |
| 0.0 | 51.5593 | 24800 | 0.4937 | 17795328 |
| 0.0 | 51.9751 | 25000 | 0.4991 | 17938368 |
| 0.0038 | 52.3909 | 25200 | 0.5048 | 18081176 |
| 0.0 | 52.8067 | 25400 | 0.5042 | 18224696 |
| 0.0 | 53.2225 | 25600 | 0.5104 | 18369136 |
| 0.0 | 53.6383 | 25800 | 0.5152 | 18511824 |
| 0.0036 | 54.0541 | 26000 | 0.5176 | 18655008 |
| 0.0035 | 54.4699 | 26200 | 0.5177 | 18798592 |
| 0.0078 | 54.8857 | 26400 | 0.5097 | 18942016 |
| 0.0026 | 55.3015 | 26600 | 0.5227 | 19085296 |
| 0.0 | 55.7173 | 26800 | 0.5114 | 19229616 |
| 0.0 | 56.1331 | 27000 | 0.5264 | 19373160 |
| 0.0 | 56.5489 | 27200 | 0.5340 | 19516200 |
| 0.0 | 56.9647 | 27400 | 0.5332 | 19659656 |
| 0.0 | 57.3805 | 27600 | 0.5440 | 19803672 |
| 0.0 | 57.7963 | 27800 | 0.5223 | 19947800 |
| 0.0 | 58.2121 | 28000 | 0.5306 | 20090864 |
| 0.0 | 58.6279 | 28200 | 0.5407 | 20234160 |
| 0.0 | 59.0437 | 28400 | 0.5396 | 20378152 |
| 0.0 | 59.4595 | 28600 | 0.5429 | 20521096 |
| 0.0 | 59.8753 | 28800 | 0.5494 | 20664744 |
| 0.0 | 60.2911 | 29000 | 0.5362 | 20808544 |
| 0.0 | 60.7069 | 29200 | 0.5364 | 20952064 |
| 0.0 | 61.1227 | 29400 | 0.5462 | 21095536 |
| 0.0 | 61.5385 | 29600 | 0.5510 | 21239216 |
| 0.0 | 61.9543 | 29800 | 0.5519 | 21382704 |
| 0.0 | 62.3701 | 30000 | 0.5546 | 21526584 |
| 0.0047 | 62.7859 | 30200 | 0.5530 | 21670744 |
| 0.0 | 63.2017 | 30400 | 0.5613 | 21813952 |
| 0.0 | 63.6175 | 30600 | 0.5594 | 21956992 |
| 0.0 | 64.0333 | 30800 | 0.5598 | 22100720 |
| 0.0 | 64.4491 | 31000 | 0.5621 | 22244240 |
| 0.0 | 64.8649 | 31200 | 0.5619 | 22388368 |
| 0.0 | 65.2807 | 31400 | 0.5645 | 22531840 |
| 0.0034 | 65.6965 | 31600 | 0.5629 | 22674688 |
| 0.0036 | 66.1123 | 31800 | 0.5613 | 22817880 |
| 0.0 | 66.5281 | 32000 | 0.5609 | 22962360 |
| 0.0036 | 66.9439 | 32200 | 0.5536 | 23105624 |
| 0.0 | 67.3597 | 32400 | 0.5602 | 23248272 |
| 0.0 | 67.7755 | 32600 | 0.5651 | 23391888 |
| 0.0 | 68.1913 | 32800 | 0.5636 | 23535616 |
| 0.0 | 68.6071 | 33000 | 0.5703 | 23678976 |
| 0.0 | 69.0229 | 33200 | 0.5698 | 23823128 |
| 0.003 | 69.4387 | 33400 | 0.5710 | 23966488 |
| 0.0 | 69.8545 | 33600 | 0.5720 | 24110648 |
| 0.0037 | 70.2703 | 33800 | 0.5737 | 24253072 |
| 0.0 | 70.6861 | 34000 | 0.5757 | 24396528 |
| 0.0 | 71.1019 | 34200 | 0.5759 | 24540040 |
| 0.0 | 71.5177 | 34400 | 0.5748 | 24683144 |
| 0.0 | 71.9335 | 34600 | 0.5787 | 24827048 |
| 0.0 | 72.3493 | 34800 | 0.5809 | 24970840 |
| 0.0 | 72.7651 | 35000 | 0.5762 | 25115672 |
| 0.0033 | 73.1809 | 35200 | 0.5791 | 25258416 |
| 0.0 | 73.5967 | 35400 | 0.5801 | 25402448 |
| 0.0 | 74.0125 | 35600 | 0.5811 | 25545128 |
| 0.0037 | 74.4283 | 35800 | 0.5834 | 25688392 |
| 0.0 | 74.8441 | 36000 | 0.5839 | 25831720 |
| 0.0 | 75.2599 | 36200 | 0.5841 | 25975928 |
| 0.0 | 75.6757 | 36400 | 0.5848 | 26119704 |
| 0.0 | 76.0915 | 36600 | 0.5840 | 26262696 |
| 0.0033 | 76.5073 | 36800 | 0.5832 | 26406024 |
| 0.0032 | 76.9231 | 37000 | 0.5844 | 26550088 |
| 0.0 | 77.3389 | 37200 | 0.5852 | 26693856 |
| 0.0 | 77.7547 | 37400 | 0.5854 | 26837120 |
| 0.0035 | 78.1705 | 37600 | 0.5861 | 26980600 |
| 0.0 | 78.5863 | 37800 | 0.5848 | 27124888 |
| 0.0 | 79.0021 | 38000 | 0.5857 | 27266800 |
| 0.0 | 79.4179 | 38200 | 0.5850 | 27410736 |
| 0.0 | 79.8337 | 38400 | 0.5854 | 27553360 |
| 0.0 | 80.2495 | 38600 | 0.5852 | 27696864 |
| 0.0026 | 80.6653 | 38800 | 0.5859 | 27839840 |
| 0.0 | 81.0811 | 39000 | 0.5866 | 27983384 |
| 0.0 | 81.4969 | 39200 | 0.5867 | 28127512 |
| 0.0 | 81.9127 | 39400 | 0.5856 | 28270104 |
| 0.0035 | 82.3285 | 39600 | 0.5849 | 28413680 |
| 0.0 | 82.7443 | 39800 | 0.5855 | 28557552 |
| 0.0 | 83.1601 | 40000 | 0.5877 | 28700680 |

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
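
To compare a local environment against these versions, a small convenience check (not part of the original card):

```python
# Print installed versions of the packages listed above next to the
# versions recorded in this card.
import importlib.metadata as md

card_versions = {
    "peft": "0.15.1",
    "transformers": "4.51.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}
for pkg, want in card_versions.items():
    print(f"{pkg}: installed {md.version(pkg)}, card used {want}")
```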