train_cb_1745950312

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1586
  • Num Input Tokens Seen: 22164464
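
Because this is a PEFT adapter rather than a full checkpoint, it has to be loaded on top of the base model. A minimal usage sketch, assuming the adapter is hosted at rbelanec/train_cb_1745950312 and that you have access to the gated Llama 3 weights; the prompt format is a placeholder, not the format used during training:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER = "rbelanec/train_cb_1745950312"  # this repo

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto")
model = PeftModel.from_pretrained(model, ADAPTER)  # attach the fine-tuned adapter

# Placeholder prompt; the exact prompt template is not documented in this card.
prompt = "Premise: It rained all day. Hypothesis: The ground is wet. Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```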

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
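
If `cb` refers to the SuperGLUE CommitmentBank task (an assumption; the card does not say which `cb` dataset this is), the splits can be inspected with `datasets`:

```python
from datasets import load_dataset

# Assumption: "cb" is SuperGLUE CommitmentBank (premise/hypothesis/label triples).
cb = load_dataset("super_glue", "cb")
print(cb)              # train / validation / test splits
print(cb["train"][0])  # fields: premise, hypothesis, idx, label
```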

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
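
Mapped onto transformers' TrainingArguments, the list above corresponds roughly to the following. This is a hypothetical reconstruction, not the actual training script; the output_dir and anything not listed above are assumptions:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_cb_1745950312",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,  # 2 x 2 = total train batch size of 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```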

Training results

Validation loss reaches its minimum of 0.1586 at step 600 and climbs steadily afterwards while the training loss collapses to zero, a classic overfitting pattern; the evaluation loss reported at the top of this card matches this step-600 value rather than the final checkpoint's.

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.284 3.5133 200 0.1743 111736
0.0782 7.0177 400 0.1610 223024
0.1338 10.5310 600 0.1586 332984
0.0725 14.0354 800 0.1596 444576
0.0814 17.5487 1000 0.1621 555960
0.0691 21.0531 1200 0.1672 665952
0.0118 24.5664 1400 0.1699 777608
0.133 28.0708 1600 0.1807 887904
0.0241 31.5841 1800 0.1871 999464
0.0245 35.0885 2000 0.2026 1110640
0.0097 38.6018 2200 0.2195 1222144
0.0193 42.1062 2400 0.2402 1332096
0.0101 45.6195 2600 0.2672 1443792
0.0153 49.1239 2800 0.2882 1553600
0.0024 52.6372 3000 0.3065 1664296
0.0035 56.1416 3200 0.3406 1775264
0.0014 59.6549 3400 0.3585 1885968
0.0002 63.1593 3600 0.3739 1996440
0.0011 66.6726 3800 0.3880 2107400
0.0002 70.1770 4000 0.3887 2218352
0.0005 73.6903 4200 0.3966 2330072
0.0006 77.1947 4400 0.4150 2440176
0.0002 80.7080 4600 0.3956 2551216
0.0002 84.2124 4800 0.4218 2662848
0.0001 87.7257 5000 0.4170 2774160
0.0001 91.2301 5200 0.4206 2885448
0.0001 94.7434 5400 0.4394 2995680
0.0001 98.2478 5600 0.4445 3106768
0.0002 101.7611 5800 0.4561 3218248
0.0001 105.2655 6000 0.4435 3329176
0.0002 108.7788 6200 0.4605 3440344
0.0001 112.2832 6400 0.4850 3550560
0.0001 115.7965 6600 0.4710 3661824
0.0 119.3009 6800 0.4757 3771856
0.0001 122.8142 7000 0.4788 3883176
0.0001 126.3186 7200 0.4710 3994264
0.0 129.8319 7400 0.4824 4105440
0.0001 133.3363 7600 0.4898 4216208
0.0 136.8496 7800 0.4831 4326832
0.0 140.3540 8000 0.4945 4437792
0.0 143.8673 8200 0.4983 4549512
0.0 147.3717 8400 0.4865 4658800
0.0 150.8850 8600 0.4894 4769400
0.0 154.3894 8800 0.5232 4881880
0.0 157.9027 9000 0.5032 4992488
0.0 161.4071 9200 0.5058 5103032
0.0 164.9204 9400 0.5299 5214280
0.0 168.4248 9600 0.5226 5323664
0.0 171.9381 9800 0.5231 5436384
0.0 175.4425 10000 0.5379 5547152
0.0 178.9558 10200 0.5326 5658656
0.0 182.4602 10400 0.5466 5768616
0.0 185.9735 10600 0.5473 5879304
0.0 189.4779 10800 0.5319 5990296
0.0 192.9912 11000 0.5413 6101128
0.0 196.4956 11200 0.5279 6212008
0.0 200.0 11400 0.5467 6321568
0.0 203.5133 11600 0.5459 6432384
0.0 207.0177 11800 0.5572 6542352
0.0 210.5310 12000 0.5527 6654160
0.0 214.0354 12200 0.5457 6765224
0.0 217.5487 12400 0.5507 6874936
0.0 221.0531 12600 0.5711 6986248
0.0 224.5664 12800 0.5727 7097808
0.0 228.0708 13000 0.5716 7208392
0.0 231.5841 13200 0.5790 7318456
0.0 235.0885 13400 0.5775 7430160
0.0 238.6018 13600 0.5793 7540344
0.0 242.1062 13800 0.5663 7650824
0.0 245.6195 14000 0.5732 7761968
0.0 249.1239 14200 0.5944 7872968
0.0 252.6372 14400 0.6055 7983464
0.0 256.1416 14600 0.5987 8093616
0.0 259.6549 14800 0.5991 8204560
0.0 263.1593 15000 0.5862 8315912
0.0 266.6726 15200 0.5794 8426448
0.0 270.1770 15400 0.5985 8536288
0.0 273.6903 15600 0.6050 8648256
0.0 277.1947 15800 0.6189 8758760
0.0 280.7080 16000 0.6261 8868600
0.0 284.2124 16200 0.6282 8981000
0.0 287.7257 16400 0.6583 9091424
0.0 291.2301 16600 0.6430 9202432
0.0 294.7434 16800 0.6544 9312888
0.0 298.2478 17000 0.6434 9423320
0.0 301.7611 17200 0.6714 9533896
0.0 305.2655 17400 0.6431 9644952
0.0 308.7788 17600 0.6493 9754832
0.0 312.2832 17800 0.6749 9866256
0.0 315.7965 18000 0.6496 9975768
0.0 319.3009 18200 0.6726 10086392
0.0 322.8142 18400 0.6718 10197432
0.0 326.3186 18600 0.6865 10307224
0.0 329.8319 18800 0.6698 10419256
0.0 333.3363 19000 0.6498 10529488
0.0 336.8496 19200 0.6796 10640296
0.0 340.3540 19400 0.6784 10750776
0.0 343.8673 19600 0.6566 10861648
0.0 347.3717 19800 0.6681 10972808
0.0 350.8850 20000 0.6887 11083136
0.0 354.3894 20200 0.7147 11193448
0.0 357.9027 20400 0.6921 11305168
0.0 361.4071 20600 0.7121 11416112
0.0 364.9204 20800 0.6977 11527424
0.0 368.4248 21000 0.7004 11637784
0.0 371.9381 21200 0.7117 11748768
0.0 375.4425 21400 0.7038 11857872
0.0 378.9558 21600 0.6942 11969696
0.0 382.4602 21800 0.7161 12080592
0.0 385.9735 22000 0.7295 12190608
0.0 389.4779 22200 0.7190 12301648
0.0 392.9912 22400 0.7184 12412384
0.0 396.4956 22600 0.7380 12523264
0.0 400.0 22800 0.7235 12633656
0.0 403.5133 23000 0.7182 12743928
0.0 407.0177 23200 0.7180 12855568
0.0 410.5310 23400 0.7378 12966544
0.0 414.0354 23600 0.7213 13077752
0.0 417.5487 23800 0.7396 13189592
0.0 421.0531 24000 0.7409 13299920
0.0 424.5664 24200 0.7202 13410872
0.0 428.0708 24400 0.7344 13522656
0.0 431.5841 24600 0.7564 13632696
0.0 435.0885 24800 0.6867 13743576
0.0 438.6018 25000 0.7655 13856080
0.0 442.1062 25200 0.7144 13966552
0.0 445.6195 25400 0.7624 14076912
0.0 449.1239 25600 0.7328 14187144
0.0 452.6372 25800 0.7431 14298896
0.0 456.1416 26000 0.7328 14408592
0.0 459.6549 26200 0.7600 14519672
0.0 463.1593 26400 0.7228 14630736
0.0 466.6726 26600 0.7296 14741472
0.0 470.1770 26800 0.7222 14852816
0.0 473.6903 27000 0.7612 14964568
0.0 477.1947 27200 0.7532 15074912
0.0 480.7080 27400 0.7368 15186488
0.0 484.2124 27600 0.7430 15297600
0.0 487.7257 27800 0.7272 15407784
0.0 491.2301 28000 0.7539 15518800
0.0 494.7434 28200 0.7698 15629392
0.0 498.2478 28400 0.7498 15740552
0.0 501.7611 28600 0.7707 15852112
0.0 505.2655 28800 0.7634 15962600
0.0 508.7788 29000 0.7678 16073896
0.0 512.2832 29200 0.7427 16184680
0.0 515.7965 29400 0.7719 16295584
0.0 519.3009 29600 0.7325 16406536
0.0 522.8142 29800 0.7953 16516648
0.0 526.3186 30000 0.7460 16628144
0.0 529.8319 30200 0.7134 16738416
0.0 533.3363 30400 0.7632 16848080
0.0 536.8496 30600 0.7161 16960312
0.0 540.3540 30800 0.7365 17069536
0.0 543.8673 31000 0.7271 17180696
0.0 547.3717 31200 0.7417 17291896
0.0 550.8850 31400 0.7391 17402176
0.0 554.3894 31600 0.7218 17512704
0.0 557.9027 31800 0.7414 17624600
0.0 561.4071 32000 0.7245 17734208
0.0 564.9204 32200 0.7525 17845224
0.0 568.4248 32400 0.7680 17956288
0.0 571.9381 32600 0.7673 18066176
0.0 575.4425 32800 0.7447 18177520
0.0 578.9558 33000 0.7571 18289064
0.0 582.4602 33200 0.7178 18398888
0.0 585.9735 33400 0.7572 18509416
0.0 589.4779 33600 0.7605 18620544
0.0 592.9912 33800 0.7580 18731712
0.0 596.4956 34000 0.7632 18841128
0.0 600.0 34200 0.7505 18952336
0.0 603.5133 34400 0.7474 19063208
0.0 607.0177 34600 0.7527 19173736
0.0 610.5310 34800 0.7446 19285352
0.0 614.0354 35000 0.7091 19395536
0.0 617.5487 35200 0.7482 19506864
0.0 621.0531 35400 0.7423 19617648
0.0 624.5664 35600 0.7325 19728144
0.0 628.0708 35800 0.7527 19838296
0.0 631.5841 36000 0.7241 19948392
0.0 635.0885 36200 0.7680 20059232
0.0 638.6018 36400 0.7430 20170032
0.0 642.1062 36600 0.7420 20279560
0.0 645.6195 36800 0.7323 20389936
0.0 649.1239 37000 0.7757 20499984
0.0 652.6372 37200 0.7163 20612176
0.0 656.1416 37400 0.7300 20722344
0.0 659.6549 37600 0.7375 20833640
0.0 663.1593 37800 0.7191 20944256
0.0 666.6726 38000 0.7308 21055624
0.0 670.1770 38200 0.7359 21165744
0.0 673.6903 38400 0.7463 21277072
0.0 677.1947 38600 0.7771 21388128
0.0 680.7080 38800 0.7464 21499624
0.0 684.2124 39000 0.7472 21611240
0.0 687.7257 39200 0.7426 21721232
0.0 691.2301 39400 0.7426 21832720
0.0 694.7434 39600 0.7426 21942280
0.0 698.2478 39800 0.7426 22053128
0.0 701.7611 40000 0.7426 22164464

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1