train_boolq_1745950277

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the boolq dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1300
  • Num Input Tokens Seen: 34078960
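
The framework versions below list PEFT, and the Hub page lists this repository as an adapter, so the checkpoint is presumably a parameter-efficient adapter on top of the base model rather than a full set of weights. A minimal loading sketch under that assumption (repo id taken from the page title):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)

# Attach the fine-tuned adapter weights from this repository.
model = PeftModel.from_pretrained(base, "rbelanec/train_boolq_1745950277")
```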

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
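
The card does not describe the data beyond naming BoolQ, a yes/no question-answering dataset of (question, passage, answer) triples. A minimal sketch for loading it with the datasets library (the "google/boolq" Hub id and split names come from the public BoolQ release, not from this card):

```python
from datasets import load_dataset

# BoolQ ships with "train" and "validation" splits.
boolq = load_dataset("google/boolq")
example = boolq["train"][0]
print(example["question"], example["passage"], example["answer"])
```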

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
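
A minimal sketch of how these settings map onto transformers TrainingArguments (argument names are from the transformers API; the dataset plumbing and PEFT wrapping are omitted):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_boolq_1745950277",
    learning_rate=5e-05,
    per_device_train_batch_size=2,   # train_batch_size
    per_device_eval_batch_size=2,    # eval_batch_size
    seed=123,
    gradient_accumulation_steps=2,   # total_train_batch_size = 2 * 2 = 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,                 # training_steps
)
```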

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.2949 0.0943 200 0.3250 171472
0.3203 0.1886 400 0.2047 339520
0.213 0.2829 600 0.1600 509632
0.1715 0.3772 800 0.1526 685120
0.0514 0.4715 1000 0.1652 856144
0.2019 0.5658 1200 0.1425 1024448
0.2173 0.6601 1400 0.1650 1192560
0.0799 0.7544 1600 0.1300 1360304
0.041 0.8487 1800 0.1564 1535088
0.3322 0.9430 2000 0.1519 1708224
0.0089 1.0372 2200 0.1838 1880336
0.2081 1.1315 2400 0.2094 2048144
0.38 1.2258 2600 0.2108 2220048
0.1835 1.3201 2800 0.1721 2388128
0.0032 1.4144 3000 0.2218 2559696
0.0865 1.5087 3200 0.1540 2730224
0.0565 1.6030 3400 0.1570 2897744
0.1569 1.6973 3600 0.1477 3067728
0.1551 1.7916 3800 0.1567 3235856
0.1202 1.8859 4000 0.2155 3409536
0.0059 1.9802 4200 0.1554 3581600
0.0025 2.0745 4400 0.2246 3753552
0.001 2.1688 4600 0.2449 3924224
0.006 2.2631 4800 0.2078 4093360
0.1844 2.3574 5000 0.2490 4261312
0.0018 2.4517 5200 0.2213 4438224
0.096 2.5460 5400 0.2142 4608992
0.0021 2.6403 5600 0.2055 4780976
0.0011 2.7346 5800 0.2560 4946848
0.0591 2.8289 6000 0.2815 5120704
0.1611 2.9231 6200 0.2788 5292816
0.0008 3.0174 6400 0.2730 5463728
0.0023 3.1117 6600 0.2917 5634528
0.1115 3.2060 6800 0.2372 5805344
0.1566 3.3003 7000 0.2747 5976160
0.0014 3.3946 7200 0.2818 6147200
0.0018 3.4889 7400 0.2642 6315984
0.0003 3.5832 7600 0.3310 6484512
0.0006 3.6775 7800 0.3094 6653536
0.166 3.7718 8000 0.3492 6823184
0.0006 3.8661 8200 0.2910 6991232
0.0014 3.9604 8400 0.2436 7161488
0.0043 4.0547 8600 0.3267 7330416
0.0002 4.1490 8800 0.4208 7502960
0.0002 4.2433 9000 0.3043 7675776
0.0008 4.3376 9200 0.3460 7847200
0.0001 4.4319 9400 0.4261 8016080
0.1008 4.5262 9600 0.2292 8189040
0.0001 4.6205 9800 0.3800 8355360
0.0002 4.7148 10000 0.3418 8527824
0.0004 4.8091 10200 0.2841 8697376
0.0012 4.9033 10400 0.2540 8867872
0.0013 4.9976 10600 0.3265 9039888
0.1067 5.0919 10800 0.4539 9209232
0.0 5.1862 11000 0.5187 9384064
0.0023 5.2805 11200 0.3918 9555168
0.0 5.3748 11400 0.4749 9724832
0.0001 5.4691 11600 0.3953 9894608
0.1221 5.5634 11800 0.3914 10067376
0.0015 5.6577 12000 0.6157 10239648
0.0001 5.7520 12200 0.4068 10406400
0.0001 5.8463 12400 0.3948 10578176
0.0009 5.9406 12600 0.4626 10745200
0.0 6.0349 12800 0.4054 10917056
0.0 6.1292 13000 0.4441 11091376
0.0038 6.2235 13200 0.3147 11260160
0.0001 6.3178 13400 0.3441 11430512
0.0 6.4121 13600 0.5056 11598992
0.0 6.5064 13800 0.4643 11771312
0.0 6.6007 14000 0.4942 11940256
0.0 6.6950 14200 0.5800 12108896
0.0 6.7893 14400 0.5291 12277824
0.0 6.8835 14600 0.4069 12450224
0.0004 6.9778 14800 0.3468 12618544
0.4969 7.0721 15000 0.4180 12791104
0.0 7.1664 15200 0.4630 12964976
0.0 7.2607 15400 0.4689 13132848
0.0 7.3550 15600 0.4131 13302528
0.0 7.4493 15800 0.5349 13471696
0.0 7.5436 16000 0.4500 13643264
0.0 7.6379 16200 0.3913 13810336
0.0002 7.7322 16400 0.4377 13980096
0.0 7.8265 16600 0.5126 14150688
0.0001 7.9208 16800 0.4151 14320928
0.0 8.0151 17000 0.4575 14497120
0.0001 8.1094 17200 0.4715 14667440
0.0001 8.2037 17400 0.3730 14839920
0.0001 8.2980 17600 0.4248 15013152
0.0 8.3923 17800 0.5033 15178480
0.0 8.4866 18000 0.5056 15349216
0.0 8.5809 18200 0.5075 15518736
0.0 8.6752 18400 0.4976 15689568
0.0011 8.7694 18600 0.3954 15859632
0.0 8.8637 18800 0.4307 16025920
0.0001 8.9580 19000 0.3383 16196560
0.0 9.0523 19200 0.5251 16368144
0.0 9.1466 19400 0.6275 16539952
0.0 9.2409 19600 0.6473 16710384
0.0 9.3352 19800 0.4718 16878624
0.0 9.4295 20000 0.5149 17046992
0.0 9.5238 20200 0.5534 17218176
0.0 9.6181 20400 0.6049 17390384
0.0 9.7124 20600 0.6236 17560512
0.0 9.8067 20800 0.4179 17726048
0.0 9.9010 21000 0.4773 17897376
0.0001 9.9953 21200 0.4315 18068096
0.0 10.0896 21400 0.4717 18244704
0.0 10.1839 21600 0.4507 18420464
0.0 10.2782 21800 0.5002 18588480
0.0 10.3725 22000 0.4912 18758992
0.0 10.4668 22200 0.5160 18930688
0.0 10.5611 22400 0.5553 19096400
0.0 10.6554 22600 0.4440 19263456
0.0 10.7496 22800 0.5174 19430576
0.0001 10.8439 23000 0.4635 19599200
0.0 10.9382 23200 0.4561 19770608
0.0 11.0325 23400 0.5011 19942144
0.0 11.1268 23600 0.5298 20112704
0.0 11.2211 23800 0.5707 20282048
0.0 11.3154 24000 0.5005 20456384
0.0 11.4097 24200 0.4854 20624672
0.0 11.5040 24400 0.5311 20796640
0.0 11.5983 24600 0.5520 20964288
0.0 11.6926 24800 0.6044 21132896
0.0 11.7869 25000 0.5289 21304080
0.0 11.8812 25200 0.5572 21471264
0.0 11.9755 25400 0.5636 21642432
0.0 12.0698 25600 0.5867 21811200
0.0 12.1641 25800 0.5944 21983552
0.0 12.2584 26000 0.5938 22155824
0.0 12.3527 26200 0.6187 22329984
0.0 12.4470 26400 0.4904 22499472
0.0 12.5413 26600 0.5263 22669904
0.0 12.6355 26800 0.5404 22837440
0.0 12.7298 27000 0.5808 23008384
0.0 12.8241 27200 0.5923 23177264
0.0001 12.9184 27400 0.6002 23344128
0.0 13.0127 27600 0.6057 23512272
0.0 13.1070 27800 0.6309 23680080
0.0 13.2013 28000 0.6362 23850944
0.0 13.2956 28200 0.6412 24022624
0.0 13.3899 28400 0.6448 24192144
0.0 13.4842 28600 0.6528 24364768
0.0 13.5785 28800 0.5868 24538352
0.0 13.6728 29000 0.5990 24710416
0.0 13.7671 29200 0.6152 24881936
0.0 13.8614 29400 0.6230 25050656
0.0 13.9557 29600 0.6278 25222752
0.0 14.0500 29800 0.6313 25389344
0.0 14.1443 30000 0.6579 25563712
0.0 14.2386 30200 0.6643 25738992
0.0 14.3329 30400 0.6687 25909744
0.0 14.4272 30600 0.6975 26079120
0.0 14.5215 30800 0.7041 26245936
0.0 14.6157 31000 0.7109 26416944
0.0 14.7100 31200 0.7532 26586032
0.0 14.8043 31400 0.7551 26756496
0.0 14.8986 31600 0.7506 26924320
0.0 14.9929 31800 0.7554 27096688
0.0 15.0872 32000 0.7591 27264640
0.0 15.1815 32200 0.7616 27440592
0.0 15.2758 32400 0.7617 27613248
0.0 15.3701 32600 0.7620 27781552
0.0 15.4644 32800 0.7669 27956496
0.0 15.5587 33000 0.7672 28125456
0.0 15.6530 33200 0.7705 28295136
0.0 15.7473 33400 0.7677 28462640
0.0 15.8416 33600 0.7741 28631360
0.0 15.9359 33800 0.7735 28799488
0.0 16.0302 34000 0.7718 28964832
0.0 16.1245 34200 0.7772 29137792
0.0 16.2188 34400 0.7773 29306192
0.0 16.3131 34600 0.7795 29481760
0.0 16.4074 34800 0.7860 29654256
0.0 16.5017 35000 0.7843 29821840
0.0 16.5959 35200 0.7870 29992016
0.0 16.6902 35400 0.7885 30157952
0.0 16.7845 35600 0.7890 30329792
0.0 16.8788 35800 0.7849 30500240
0.0 16.9731 36000 0.7907 30668944
0.0 17.0674 36200 0.7922 30840688
0.0 17.1617 36400 0.7952 31012176
0.0 17.2560 36600 0.7901 31184160
0.0 17.3503 36800 0.7919 31359648
0.0 17.4446 37000 0.7899 31529872
0.0 17.5389 37200 0.7934 31699104
0.0 17.6332 37400 0.7909 31870016
0.0 17.7275 37600 0.7962 32036672
0.0 17.8218 37800 0.7975 32206048
0.0 17.9161 38000 0.7980 32377104
0.0 18.0104 38200 0.7977 32548208
0.0 18.1047 38400 0.8014 32716560
0.0 18.1990 38600 0.7976 32885504
0.0 18.2933 38800 0.7980 33056256
0.0 18.3876 39000 0.7969 33225648
0.0 18.4818 39200 0.8011 33393952
0.0 18.5761 39400 0.8007 33564304
0.0 18.6704 39600 0.8016 33735024
0.0 18.7647 39800 0.8004 33907088
0.0 18.8590 40000 0.7987 34078960
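
The validation loss bottoms out at 0.1300 around step 1600 and climbs steadily afterwards while the training loss collapses to zero, the usual overfitting signature; the evaluation loss reported at the top of this card matches that step-1600 low point. A hedged sketch of how the best checkpoint can be kept automatically with the transformers Trainer (these flags and the callback are standard Trainer features, not something this card confirms was used; model and the dataset splits are hypothetical placeholders):

```python
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="train_boolq_1745950277",
    eval_strategy="steps",
    eval_steps=200,
    save_strategy="steps",
    save_steps=200,
    load_best_model_at_end=True,      # restore the lowest-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,                      # hypothetical: the PEFT-wrapped model
    args=args,
    train_dataset=train_ds,           # hypothetical dataset splits
    eval_dataset=eval_ds,
    # Stopping after a few non-improving evals would have halted near step 1600.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)],
)
```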

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1