train_cb_1745950316

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0173
  • Num Input Tokens Seen: 22164464
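
As a rough usage sketch (not part of the original card): the checkpoint is a PEFT adapter, so it is loaded on top of the base model rather than on its own. This assumes the adapter is published as rbelanec/train_cb_1745950316 and that transformers and peft are installed; the prompt is a placeholder.

```python
# Minimal sketch: load the base model and apply this PEFT adapter on top of it.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Adapter repository id assumed from this card's name.
model = PeftModel.from_pretrained(base_model, "rbelanec/train_cb_1745950316")

inputs = tokenizer("Example prompt", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```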

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
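
For orientation, a hedged sketch of how these settings could be expressed as Hugging Face TrainingArguments; the field names below are assumptions mapped from the values above, not the exact training script used for this run.

```python
from transformers import TrainingArguments

# Sketch only: maps the listed hyperparameters onto TrainingArguments fields.
args = TrainingArguments(
    output_dir="train_cb_1745950316",   # assumed output directory
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,      # effective train batch size: 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```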

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
1.3945 3.5133 200 1.1587 111736
1.0181 7.0177 400 1.0771 223024
1.237 10.5310 600 1.0586 332984
1.1874 14.0354 800 1.0447 444576
1.0634 17.5487 1000 1.0482 555960
1.0207 21.0531 1200 1.0313 665952
0.8781 24.5664 1400 1.0357 777608
1.1403 28.0708 1600 1.0395 887904
1.2085 31.5841 1800 1.0357 999464
1.01 35.0885 2000 1.0369 1110640
1.0684 38.6018 2200 1.0346 1222144
1.1009 42.1062 2400 1.0411 1332096
1.0705 45.6195 2600 1.0365 1443792
1.0641 49.1239 2800 1.0337 1553600
1.0718 52.6372 3000 1.0374 1664296
1.0614 56.1416 3200 1.0266 1775264
1.0873 59.6549 3400 1.0374 1885968
0.9122 63.1593 3600 1.0292 1996440
1.1853 66.6726 3800 1.0413 2107400
0.9562 70.1770 4000 1.0335 2218352
0.9934 73.6903 4200 1.0307 2330072
1.2125 77.1947 4400 1.0342 2440176
1.0047 80.7080 4600 1.0330 2551216
1.0866 84.2124 4800 1.0258 2662848
0.9433 87.7257 5000 1.0384 2774160
1.1497 91.2301 5200 1.0299 2885448
1.0695 94.7434 5400 1.0259 2995680
1.1854 98.2478 5600 1.0193 3106768
1.1601 101.7611 5800 1.0228 3218248
0.9279 105.2655 6000 1.0301 3329176
1.193 108.7788 6200 1.0312 3440344
1.0355 112.2832 6400 1.0390 3550560
0.9656 115.7965 6600 1.0291 3661824
0.8823 119.3009 6800 1.0280 3771856
1.2007 122.8142 7000 1.0319 3883176
1.1485 126.3186 7200 1.0297 3994264
1.0637 129.8319 7400 1.0336 4105440
1.1949 133.3363 7600 1.0211 4216208
1.1562 136.8496 7800 1.0274 4326832
1.137 140.3540 8000 1.0349 4437792
1.1652 143.8673 8200 1.0363 4549512
1.1128 147.3717 8400 1.0290 4658800
1.0173 150.8850 8600 1.0292 4769400
0.9819 154.3894 8800 1.0317 4881880
1.1285 157.9027 9000 1.0267 4992488
1.2546 161.4071 9200 1.0334 5103032
0.9652 164.9204 9400 1.0268 5214280
1.0387 168.4248 9600 1.0264 5323664
1.0908 171.9381 9800 1.0306 5436384
1.0437 175.4425 10000 1.0295 5547152
0.9347 178.9558 10200 1.0254 5658656
1.1187 182.4602 10400 1.0260 5768616
1.1472 185.9735 10600 1.0325 5879304
1.0755 189.4779 10800 1.0416 5990296
1.1533 192.9912 11000 1.0258 6101128
1.1898 196.4956 11200 1.0284 6212008
1.0763 200.0 11400 1.0305 6321568
1.0221 203.5133 11600 1.0273 6432384
0.8903 207.0177 11800 1.0349 6542352
1.0844 210.5310 12000 1.0294 6654160
1.1491 214.0354 12200 1.0173 6765224
0.9916 217.5487 12400 1.0221 6874936
1.0089 221.0531 12600 1.0246 6986248
0.9634 224.5664 12800 1.0277 7097808
1.348 228.0708 13000 1.0306 7208392
1.1051 231.5841 13200 1.0327 7318456
1.0819 235.0885 13400 1.0266 7430160
1.094 238.6018 13600 1.0254 7540344
1.0909 242.1062 13800 1.0268 7650824
1.1066 245.6195 14000 1.0255 7761968
0.8353 249.1239 14200 1.0351 7872968
0.9608 252.6372 14400 1.0306 7983464
0.9857 256.1416 14600 1.0303 8093616
1.0436 259.6549 14800 1.0242 8204560
1.0435 263.1593 15000 1.0297 8315912
1.0667 266.6726 15200 1.0315 8426448
1.046 270.1770 15400 1.0311 8536288
0.9525 273.6903 15600 1.0307 8648256
1.1752 277.1947 15800 1.0300 8758760
0.9125 280.7080 16000 1.0274 8868600
1.0887 284.2124 16200 1.0315 8981000
1.1229 287.7257 16400 1.0302 9091424
0.9959 291.2301 16600 1.0355 9202432
1.1444 294.7434 16800 1.0290 9312888
1.1921 298.2478 17000 1.0300 9423320
1.1624 301.7611 17200 1.0266 9533896
1.1738 305.2655 17400 1.0265 9644952
0.8876 308.7788 17600 1.0250 9754832
0.9924 312.2832 17800 1.0315 9866256
0.8758 315.7965 18000 1.0255 9975768
0.9882 319.3009 18200 1.0311 10086392
1.0167 322.8142 18400 1.0330 10197432
0.849 326.3186 18600 1.0196 10307224
0.9223 329.8319 18800 1.0275 10419256
0.9825 333.3363 19000 1.0258 10529488
1.0664 336.8496 19200 1.0257 10640296
1.022 340.3540 19400 1.0266 10750776
0.9859 343.8673 19600 1.0315 10861648
1.1735 347.3717 19800 1.0318 10972808
0.9339 350.8850 20000 1.0205 11083136
1.0463 354.3894 20200 1.0252 11193448
1.0393 357.9027 20400 1.0223 11305168
0.8809 361.4071 20600 1.0300 11416112
1.1824 364.9204 20800 1.0258 11527424
0.8426 368.4248 21000 1.0242 11637784
1.1234 371.9381 21200 1.0326 11748768
1.1642 375.4425 21400 1.0289 11857872
1.027 378.9558 21600 1.0271 11969696
1.0781 382.4602 21800 1.0212 12080592
1.1651 385.9735 22000 1.0315 12190608
1.0961 389.4779 22200 1.0217 12301648
1.0122 392.9912 22400 1.0289 12412384
1.0517 396.4956 22600 1.0287 12523264
0.9128 400.0 22800 1.0313 12633656
1.134 403.5133 23000 1.0304 12743928
0.6801 407.0177 23200 1.0217 12855568
1.2031 410.5310 23400 1.0217 12966544
0.9253 414.0354 23600 1.0304 13077752
1.0901 417.5487 23800 1.0249 13189592
0.8616 421.0531 24000 1.0304 13299920
0.9668 424.5664 24200 1.0241 13410872
1.0 428.0708 24400 1.0349 13522656
1.1078 431.5841 24600 1.0263 13632696
1.2902 435.0885 24800 1.0263 13743576
1.107 438.6018 25000 1.0241 13856080
1.0065 442.1062 25200 1.0304 13966552
0.8437 445.6195 25400 1.0311 14076912
1.005 449.1239 25600 1.0320 14187144
0.9998 452.6372 25800 1.0320 14298896
0.9302 456.1416 26000 1.0287 14408592
1.1934 459.6549 26200 1.0287 14519672
1.0052 463.1593 26400 1.0287 14630736
0.9753 466.6726 26600 1.0287 14741472
1.019 470.1770 26800 1.0287 14852816
1.0584 473.6903 27000 1.0287 14964568
0.9989 477.1947 27200 1.0287 15074912
1.1818 480.7080 27400 1.0287 15186488
1.2094 484.2124 27600 1.0287 15297600
0.9714 487.7257 27800 1.0287 15407784
0.9373 491.2301 28000 1.0287 15518800
1.0237 494.7434 28200 1.0311 15629392
0.997 498.2478 28400 1.0311 15740552
1.062 501.7611 28600 1.0311 15852112
1.2497 505.2655 28800 1.0311 15962600
0.85 508.7788 29000 1.0311 16073896
1.0928 512.2832 29200 1.0311 16184680
1.1411 515.7965 29400 1.0311 16295584
1.1284 519.3009 29600 1.0311 16406536
0.8901 522.8142 29800 1.0311 16516648
1.0365 526.3186 30000 1.0311 16628144
1.0249 529.8319 30200 1.0311 16738416
1.0122 533.3363 30400 1.0311 16848080
1.2131 536.8496 30600 1.0311 16960312
0.9375 540.3540 30800 1.0311 17069536
0.8701 543.8673 31000 1.0311 17180696
1.0842 547.3717 31200 1.0311 17291896
1.0571 550.8850 31400 1.0311 17402176
0.9075 554.3894 31600 1.0311 17512704
0.9215 557.9027 31800 1.0311 17624600
1.0647 561.4071 32000 1.0311 17734208
1.1348 564.9204 32200 1.0311 17845224
1.1134 568.4248 32400 1.0311 17956288
1.0489 571.9381 32600 1.0311 18066176
1.0341 575.4425 32800 1.0311 18177520
1.2816 578.9558 33000 1.0311 18289064
1.219 582.4602 33200 1.0311 18398888
1.1362 585.9735 33400 1.0311 18509416
1.0518 589.4779 33600 1.0311 18620544
1.0665 592.9912 33800 1.0311 18731712
0.991 596.4956 34000 1.0311 18841128
0.9105 600.0 34200 1.0311 18952336
0.9019 603.5133 34400 1.0311 19063208
1.1367 607.0177 34600 1.0311 19173736
1.007 610.5310 34800 1.0311 19285352
1.0601 614.0354 35000 1.0311 19395536
0.8663 617.5487 35200 1.0311 19506864
0.9241 621.0531 35400 1.0311 19617648
0.9727 624.5664 35600 1.0311 19728144
1.1257 628.0708 35800 1.0311 19838296
0.9849 631.5841 36000 1.0311 19948392
0.921 635.0885 36200 1.0311 20059232
0.9354 638.6018 36400 1.0311 20170032
1.1421 642.1062 36600 1.0311 20279560
0.9622 645.6195 36800 1.0311 20389936
1.0777 649.1239 37000 1.0311 20499984
0.9475 652.6372 37200 1.0311 20612176
1.3035 656.1416 37400 1.0311 20722344
0.9743 659.6549 37600 1.0311 20833640
1.2551 663.1593 37800 1.0311 20944256
0.7373 666.6726 38000 1.0311 21055624
1.0018 670.1770 38200 1.0311 21165744
0.9958 673.6903 38400 1.0311 21277072
1.1756 677.1947 38600 1.0311 21388128
1.0102 680.7080 38800 1.0311 21499624
0.9521 684.2124 39000 1.0311 21611240
1.0948 687.7257 39200 1.0311 21721232
0.9525 691.2301 39400 1.0311 21832720
1.0582 694.7434 39600 1.0311 21942280
1.0302 698.2478 39800 1.0311 22053128
0.9108 701.7611 40000 1.0311 22164464

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
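
When reproducing this environment, a quick sanity check (a sketch, assuming the packages are already installed) is to compare the installed versions against the list above:

```python
# Sketch: print installed versions to compare against the pinned list above.
import peft, transformers, torch, datasets, tokenizers

print(peft.__version__)          # expected ~0.15.2.dev0
print(transformers.__version__)  # expected 4.51.3
print(torch.__version__)         # expected 2.6.0+cu124
print(datasets.__version__)      # expected 3.5.0
print(tokenizers.__version__)    # expected 0.21.1
```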