train_cb_1745950318

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1614
  • Num Input Tokens Seen: 23078128

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.3
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
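
The total_train_batch_size above is derived, not set directly: it is the per-device batch size multiplied by the gradient accumulation steps (a single device is assumed here). A minimal sketch of that relationship:

```python
# Hypothetical sketch, not the training script: the effective (total) train
# batch size follows from the per-device batch size times the number of
# gradient accumulation steps, assuming one device.
train_batch_size = 2
gradient_accumulation_steps = 2

total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 4
```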

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.4055 3.5133 200 0.2454 116248
0.2939 7.0177 400 0.2808 232144
0.2474 10.5310 600 0.2634 346496
0.3082 14.0354 800 0.2848 462696
0.2262 17.5487 1000 0.1614 578728
0.1393 21.0531 1200 0.2063 692976
0.143 24.5664 1400 0.2579 809080
0.086 28.0708 1600 0.2705 924048
0.0433 31.5841 1800 0.1723 1040096
0.0121 35.0885 2000 0.3711 1155784
0.1588 38.6018 2200 0.2246 1271880
0.0099 42.1062 2400 0.2921 1386392
0.0012 45.6195 2600 0.4970 1502448
0.0003 49.1239 2800 0.4823 1616928
0.0001 52.6372 3000 0.5085 1732240
0.0002 56.1416 3200 0.5206 1847880
0.0001 59.6549 3400 0.5320 1963376
0.0 63.1593 3600 0.5479 2078344
0.0001 66.6726 3800 0.5611 2193696
0.0 70.1770 4000 0.5574 2309024
0.0001 73.6903 4200 0.5837 2425544
0.0 77.1947 4400 0.5738 2539944
0.0 80.7080 4600 0.5903 2655720
0.0 84.2124 4800 0.5898 2771904
0.0 87.7257 5000 0.6149 2887856
0.0 91.2301 5200 0.6054 3003888
0.0 94.7434 5400 0.6059 3118800
0.0 98.2478 5600 0.6126 3234376
0.0 101.7611 5800 0.6133 3350608
0.0 105.2655 6000 0.6364 3466256
0.0 108.7788 6200 0.6208 3582008
0.0 112.2832 6400 0.6296 3696904
0.0 115.7965 6600 0.6318 3812728
0.0 119.3009 6800 0.6352 3927256
0.0 122.8142 7000 0.6349 4043128
0.0 126.3186 7200 0.6475 4158920
0.0 129.8319 7400 0.6394 4274536
0.0 133.3363 7600 0.6457 4389864
0.0 136.8496 7800 0.6506 4505192
0.0 140.3540 8000 0.6416 4620656
0.0 143.8673 8200 0.6481 4736960
0.0 147.3717 8400 0.6563 4850688
0.0 150.8850 8600 0.6460 4965800
0.0 154.3894 8800 0.6556 5082848
0.0 157.9027 9000 0.6582 5197896
0.0 161.4071 9200 0.6413 5312976
0.0 164.9204 9400 0.6502 5428816
0.0 168.4248 9600 0.6680 5542632
0.0 171.9381 9800 0.6518 5660064
0.0 175.4425 10000 0.6473 5775432
0.0 178.9558 10200 0.6585 5891480
0.0 182.4602 10400 0.6544 6006016
0.0 185.9735 10600 0.6506 6121200
0.0 189.4779 10800 0.6595 6236696
0.0 192.9912 11000 0.6428 6352152
0.0 196.4956 11200 0.6445 6467792
0.0 200.0 11400 0.6537 6581880
0.0 203.5133 11600 0.6588 6697328
0.0 207.0177 11800 0.6485 6811792
0.0 210.5310 12000 0.6580 6928248
0.0 214.0354 12200 0.6534 7043832
0.0 217.5487 12400 0.6465 7157984
0.0 221.0531 12600 0.6458 7274032
0.0 224.5664 12800 0.6403 7390136
0.0 228.0708 13000 0.6578 7505120
0.0 231.5841 13200 0.6455 7619616
0.0 235.0885 13400 0.6436 7736064
0.0 238.6018 13600 0.6464 7850792
0.0 242.1062 13800 0.6585 7965808
0.0 245.6195 14000 0.6507 8081552
0.0 249.1239 14200 0.6523 8197208
0.0 252.6372 14400 0.6460 8312272
0.0 256.1416 14600 0.6626 8426888
0.0 259.6549 14800 0.6376 8542448
0.0 263.1593 15000 0.6489 8658448
0.0 266.6726 15200 0.6494 8773608
0.0 270.1770 15400 0.6541 8887928
0.0 273.6903 15600 0.6485 9004600
0.0 277.1947 15800 0.6435 9119624
0.0 280.7080 16000 0.6527 9233904
0.0 284.2124 16200 0.6441 9351032
0.0 287.7257 16400 0.6491 9465944
0.0 291.2301 16600 0.6486 9581568
0.0 294.7434 16800 0.6558 9696576
0.0 298.2478 17000 0.6326 9811496
0.0 301.7611 17200 0.6528 9926600
0.0 305.2655 17400 0.6439 10042072
0.0 308.7788 17600 0.6413 10156616
0.0 312.2832 17800 0.6476 10272688
0.0 315.7965 18000 0.6508 10386824
0.0 319.3009 18200 0.6242 10502040
0.0 322.8142 18400 0.6602 10617608
0.0 326.3186 18600 0.6557 10731768
0.0 329.8319 18800 0.6628 10848480
0.0 333.3363 19000 0.6442 10963328
0.0 336.8496 19200 0.6539 11078712
0.0 340.3540 19400 0.6583 11193832
0.0 343.8673 19600 0.6568 11309368
0.0 347.3717 19800 0.6631 11424912
0.0 350.8850 20000 0.6575 11539864
0.0 354.3894 20200 0.6715 11654632
0.0 357.9027 20400 0.6648 11771008
0.0 361.4071 20600 0.6710 11886608
0.0 364.9204 20800 0.6896 12002608
0.0 368.4248 21000 0.6716 12117448
0.0 371.9381 21200 0.6605 12233152
0.0 375.4425 21400 0.6820 12346784
0.0 378.9558 21600 0.6826 12463336
0.0 382.4602 21800 0.6730 12578616
0.0 385.9735 22000 0.6645 12693160
0.0 389.4779 22200 0.6799 12808696
0.0 392.9912 22400 0.6723 12924056
0.0 396.4956 22600 0.6776 13039656
0.0 400.0 22800 0.6746 13154552
0.0 403.5133 23000 0.6607 13269320
0.0 407.0177 23200 0.6782 13385512
0.0 410.5310 23400 0.6866 13501208
0.0 414.0354 23600 0.6765 13617048
0.0 417.5487 23800 0.6765 13733448
0.0 421.0531 24000 0.6775 13848288
0.0 424.5664 24200 0.6669 13963536
0.0 428.0708 24400 0.6887 14080024
0.0 431.5841 24600 0.6848 14194520
0.0 435.0885 24800 0.6983 14310080
0.0 438.6018 25000 0.6968 14427448
0.0 442.1062 25200 0.7044 14542448
0.0 445.6195 25400 0.7016 14657640
0.0 449.1239 25600 0.6942 14772328
0.0 452.6372 25800 0.6956 14888712
0.0 456.1416 26000 0.6911 15002944
0.0 459.6549 26200 0.7039 15118544
0.0 463.1593 26400 0.6878 15234184
0.0 466.6726 26600 0.7102 15349544
0.0 470.1770 26800 0.6865 15465448
0.0 473.6903 27000 0.6928 15581752
0.0 477.1947 27200 0.7205 15696720
0.0 480.7080 27400 0.6875 15812864
0.0 484.2124 27600 0.7099 15928512
0.0 487.7257 27800 0.7157 16043264
0.0 491.2301 28000 0.7149 16158992
0.0 494.7434 28200 0.7344 16274040
0.0 498.2478 28400 0.7095 16389944
0.0 501.7611 28600 0.7156 16506208
0.0 505.2655 28800 0.7165 16621272
0.0 508.7788 29000 0.7254 16737072
0.0 512.2832 29200 0.7052 16852312
0.0 515.7965 29400 0.7172 16967744
0.0 519.3009 29600 0.7149 17083368
0.0 522.8142 29800 0.7228 17197984
0.0 526.3186 30000 0.7267 17314032
0.0 529.8319 30200 0.7435 17428904
0.0 533.3363 30400 0.7311 17543048
0.0 536.8496 30600 0.7221 17659880
0.0 540.3540 30800 0.7458 17773728
0.0 543.8673 31000 0.7288 17889344
0.0 547.3717 31200 0.7131 18005392
0.0 550.8850 31400 0.7189 18120296
0.0 554.3894 31600 0.7105 18235552
0.0 557.9027 31800 0.7039 18352024
0.0 561.4071 32000 0.7028 18466080
0.0 564.9204 32200 0.7278 18581584
0.0 568.4248 32400 0.7146 18697408
0.0 571.9381 32600 0.7316 18811608
0.0 575.4425 32800 0.7118 18927640
0.0 578.9558 33000 0.7212 19043672
0.0 582.4602 33200 0.7319 19157776
0.0 585.9735 33400 0.7414 19272744
0.0 589.4779 33600 0.7204 19388520
0.0 592.9912 33800 0.7302 19504472
0.0 596.4956 34000 0.7266 19618408
0.0 600.0 34200 0.7246 19734128
0.0 603.5133 34400 0.7342 19849608
0.0 607.0177 34600 0.7368 19964704
0.0 610.5310 34800 0.7222 20080968
0.0 614.0354 35000 0.7323 20195624
0.0 617.5487 35200 0.7243 20311640
0.0 621.0531 35400 0.7232 20426832
0.0 624.5664 35600 0.7158 20541816
0.0 628.0708 35800 0.7036 20656416
0.0 631.5841 36000 0.7245 20771136
0.0 635.0885 36200 0.7198 20886272
0.0 638.6018 36400 0.7242 21001560
0.0 642.1062 36600 0.7353 21115320
0.0 645.6195 36800 0.7314 21230216
0.0 649.1239 37000 0.7328 21344656
0.0 652.6372 37200 0.7173 21461664
0.0 656.1416 37400 0.7265 21576216
0.0 659.6549 37600 0.7067 21692088
0.0 663.1593 37800 0.7190 21807184
0.0 666.6726 38000 0.7271 21923192
0.0 670.1770 38200 0.7206 22037928
0.0 673.6903 38400 0.7207 22153968
0.0 677.1947 38600 0.7273 22269648
0.0 680.7080 38800 0.7390 22385640
0.0 684.2124 39000 0.7272 22502040
0.0 687.7257 39200 0.7393 22616408
0.0 691.2301 39400 0.7210 22732496
0.0 694.7434 39600 0.7333 22846704
0.0 698.2478 39800 0.7207 22962016
0.0 701.7611 40000 0.7207 23078128
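
The reported evaluation loss of 0.1614 corresponds to the minimum validation loss in the log above (step 1000), not the final step. Selecting that checkpoint programmatically is straightforward; a small sketch, using a handful of (step, validation loss) pairs copied from the table:

```python
# Hypothetical sketch: pick the best checkpoint from an eval-loss log.
# The pairs below are a small sample taken from the table above.
eval_log = [
    (200, 0.2454),
    (1000, 0.1614),
    (2000, 0.3711),
    (40000, 0.7207),
]

# The best checkpoint is the one with the lowest validation loss.
best_step, best_loss = min(eval_log, key=lambda row: row[1])
print(best_step, best_loss)  # 1000 0.1614
```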

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
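
Since this model is a PEFT adapter rather than a full checkpoint, it must be loaded on top of the base model. A hedged usage sketch (not an official snippet): it assumes the adapter repo id rbelanec/train_cb_1745950318 and that you have access to the gated mistralai/Mistral-7B-Instruct-v0.3 base model.

```python
# Hedged sketch: load the LoRA/PEFT adapter on top of the base model.
# Requires the transformers and peft packages listed under
# "Framework versions", plus access to the gated base checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3"
)
model = PeftModel.from_pretrained(base_model, "rbelanec/train_cb_1745950318")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
```

(No inline test is given for this block: running it downloads a multi-gigabyte gated checkpoint.)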