train_boolq_1745950278

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the boolq dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4582
  • Num Input Tokens Seen: 34,078,960
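
Since the framework versions below include PEFT, this checkpoint is an adapter on top of the base model rather than a full set of weights. A minimal loading sketch, assuming the adapter is hosted at rbelanec/train_boolq_1745950278 (matching the card's name) and that you have access to the gated base checkpoint; the BoolQ prompt template used in training is not documented here, so the format below is an assumption:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_boolq_1745950278"  # assumed repo id, matching the card's name

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach this card's PEFT adapter to the base model.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# BoolQ-style yes/no question; this prompt format is an assumption,
# not the documented training template.
prompt = (
    "Passage: The Eiffel Tower is located in Paris, France.\n"
    "Question: Is the Eiffel Tower in France?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```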

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
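
A hedged sketch of how these values map onto transformers.TrainingArguments; the output directory and the 200-step eval/logging cadence (visible in the results table below) are assumptions, not values stated in the list:

```python
from transformers import TrainingArguments

# Sketch only: maps the hyperparameters listed above onto TrainingArguments.
training_args = TrainingArguments(
    output_dir="train_boolq_1745950278",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,  # total train batch size: 2 * 2 = 4
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40000,
    eval_strategy="steps",  # assumed; matches the 200-step cadence in the table below
    eval_steps=200,
    logging_steps=200,
)
```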

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.4557 0.0943 200 0.4984 171472
0.3184 0.1886 400 0.4857 339520
0.4821 0.2829 600 0.4770 509632
0.6649 0.3772 800 0.4744 685120
0.1655 0.4715 1000 0.4742 856144
0.7329 0.5658 1200 0.4678 1024448
0.2518 0.6601 1400 0.4703 1192560
0.5007 0.7544 1600 0.4664 1360304
0.4601 0.8487 1800 0.4668 1535088
0.3674 0.9430 2000 0.4674 1708224
0.2673 1.0372 2200 0.4673 1880336
0.6585 1.1315 2400 0.4638 2048144
0.8909 1.2258 2600 0.4659 2220048
0.2237 1.3201 2800 0.4659 2388128
0.2765 1.4144 3000 0.4645 2559696
0.3239 1.5087 3200 0.4640 2730224
0.3023 1.6030 3400 0.4650 2897744
0.3181 1.6973 3600 0.4661 3067728
0.4192 1.7916 3800 0.4628 3235856
0.6724 1.8859 4000 0.4614 3409536
0.2948 1.9802 4200 0.4632 3581600
0.8057 2.0745 4400 0.4633 3753552
0.5526 2.1688 4600 0.4639 3924224
0.3203 2.2631 4800 0.4626 4093360
0.6014 2.3574 5000 0.4661 4261312
0.9379 2.4517 5200 0.4648 4438224
0.6379 2.5460 5400 0.4632 4608992
0.2241 2.6403 5600 0.4623 4780976
0.4247 2.7346 5800 0.4626 4946848
0.3434 2.8289 6000 0.4623 5120704
0.6418 2.9231 6200 0.4622 5292816
0.3682 3.0174 6400 0.4642 5463728
0.7239 3.1117 6600 0.4632 5634528
0.3848 3.2060 6800 0.4631 5805344
0.7214 3.3003 7000 0.4591 5976160
0.2775 3.3946 7200 0.4626 6147200
0.252 3.4889 7400 0.4627 6315984
0.388 3.5832 7600 0.4622 6484512
0.5885 3.6775 7800 0.4638 6653536
0.5795 3.7718 8000 0.4646 6823184
0.2884 3.8661 8200 0.4637 6991232
0.2278 3.9604 8400 0.4651 7161488
0.727 4.0547 8600 0.4615 7330416
0.5219 4.1490 8800 0.4626 7502960
0.3851 4.2433 9000 0.4636 7675776
0.5972 4.3376 9200 0.4629 7847200
0.3878 4.4319 9400 0.4632 8016080
0.6179 4.5262 9600 0.4582 8189040
0.4354 4.6205 9800 0.4645 8355360
0.2836 4.7148 10000 0.4603 8527824
0.3939 4.8091 10200 0.4631 8697376
0.3165 4.9033 10400 0.4635 8867872
0.3666 4.9976 10600 0.4620 9039888
0.5206 5.0919 10800 0.4621 9209232
0.498 5.1862 11000 0.4610 9384064
0.9534 5.2805 11200 0.4624 9555168
0.592 5.3748 11400 0.4621 9724832
0.1457 5.4691 11600 0.4631 9894608
0.3453 5.5634 11800 0.4642 10067376
0.4376 5.6577 12000 0.4625 10239648
0.6131 5.7520 12200 0.4611 10406400
0.6455 5.8463 12400 0.4635 10578176
0.4769 5.9406 12600 0.4617 10745200
0.2867 6.0349 12800 0.4614 10917056
0.2948 6.1292 13000 0.4625 11091376
0.3291 6.2235 13200 0.4619 11260160
0.4555 6.3178 13400 0.4627 11430512
0.5634 6.4121 13600 0.4641 11598992
0.5669 6.5064 13800 0.4617 11771312
0.6347 6.6007 14000 0.4616 11940256
0.2713 6.6950 14200 0.4621 12108896
0.7112 6.7893 14400 0.4642 12277824
0.6397 6.8835 14600 0.4648 12450224
0.3727 6.9778 14800 0.4654 12618544
0.4762 7.0721 15000 0.4638 12791104
0.8657 7.1664 15200 0.4635 12964976
0.553 7.2607 15400 0.4615 13132848
0.469 7.3550 15600 0.4648 13302528
0.5783 7.4493 15800 0.4605 13471696
0.669 7.5436 16000 0.4617 13643264
0.2304 7.6379 16200 0.4626 13810336
0.5105 7.7322 16400 0.4618 13980096
0.3843 7.8265 16600 0.4615 14150688
0.5705 7.9208 16800 0.4634 14320928
0.2881 8.0151 17000 0.4633 14497120
0.3272 8.1094 17200 0.4622 14667440
0.6435 8.2037 17400 0.4623 14839920
0.6967 8.2980 17600 0.4635 15013152
0.4326 8.3923 17800 0.4615 15178480
0.1335 8.4866 18000 0.4648 15349216
0.8222 8.5809 18200 0.4655 15518736
0.588 8.6752 18400 0.4618 15689568
0.6896 8.7694 18600 0.4611 15859632
0.4868 8.8637 18800 0.4645 16025920
0.3364 8.9580 19000 0.4641 16196560
0.2577 9.0523 19200 0.4617 16368144
0.2998 9.1466 19400 0.4624 16539952
0.2438 9.2409 19600 0.4632 16710384
0.4018 9.3352 19800 0.4629 16878624
0.4669 9.4295 20000 0.4601 17046992
0.536 9.5238 20200 0.4628 17218176
0.346 9.6181 20400 0.4652 17390384
0.6091 9.7124 20600 0.4649 17560512
0.5173 9.8067 20800 0.4618 17726048
0.3656 9.9010 21000 0.4649 17897376
0.3738 9.9953 21200 0.4632 18068096
0.4799 10.0896 21400 0.4635 18244704
0.4816 10.1839 21600 0.4618 18420464
0.4667 10.2782 21800 0.4636 18588480
0.2609 10.3725 22000 0.4619 18758992
0.3969 10.4668 22200 0.4612 18930688
0.444 10.5611 22400 0.4632 19096400
0.4167 10.6554 22600 0.4621 19263456
0.2856 10.7496 22800 0.4645 19430576
0.5746 10.8439 23000 0.4634 19599200
0.4028 10.9382 23200 0.4611 19770608
0.397 11.0325 23400 0.4633 19942144
1.079 11.1268 23600 0.4653 20112704
0.4449 11.2211 23800 0.4622 20282048
0.53 11.3154 24000 0.4628 20456384
0.2677 11.4097 24200 0.4596 20624672
0.4282 11.5040 24400 0.4603 20796640
0.2601 11.5983 24600 0.4639 20964288
0.3788 11.6926 24800 0.4618 21132896
0.6241 11.7869 25000 0.4627 21304080
0.4434 11.8812 25200 0.4627 21471264
0.5039 11.9755 25400 0.4653 21642432
0.8613 12.0698 25600 0.4636 21811200
1.1598 12.1641 25800 0.4654 21983552
0.5584 12.2584 26000 0.4646 22155824
0.8694 12.3527 26200 0.4675 22329984
0.2481 12.4470 26400 0.4657 22499472
0.3135 12.5413 26600 0.4616 22669904
0.5644 12.6355 26800 0.4649 22837440
0.5779 12.7298 27000 0.4621 23008384
0.7482 12.8241 27200 0.4633 23177264
0.5542 12.9184 27400 0.4646 23344128
0.6394 13.0127 27600 0.4654 23512272
0.6867 13.1070 27800 0.4623 23680080
0.3736 13.2013 28000 0.4623 23850944
0.3404 13.2956 28200 0.4623 24022624
0.8621 13.3899 28400 0.4623 24192144
0.3994 13.4842 28600 0.4623 24364768
0.4172 13.5785 28800 0.4623 24538352
0.3462 13.6728 29000 0.4646 24710416
0.7374 13.7671 29200 0.4646 24881936
0.2591 13.8614 29400 0.4646 25050656
0.3487 13.9557 29600 0.4646 25222752
0.6304 14.0500 29800 0.4646 25389344
0.3926 14.1443 30000 0.4646 25563712
0.4027 14.2386 30200 0.4646 25738992
0.597 14.3329 30400 0.4646 25909744
0.6683 14.4272 30600 0.4646 26079120
0.3154 14.5215 30800 0.4646 26245936
0.1877 14.6157 31000 0.4646 26416944
0.5388 14.7100 31200 0.4646 26586032
0.5175 14.8043 31400 0.4646 26756496
0.2724 14.8986 31600 0.4646 26924320
0.7219 14.9929 31800 0.4646 27096688
0.4891 15.0872 32000 0.4646 27264640
0.531 15.1815 32200 0.4646 27440592
0.7268 15.2758 32400 0.4646 27613248
0.8139 15.3701 32600 0.4646 27781552
0.4931 15.4644 32800 0.4646 27956496
0.3387 15.5587 33000 0.4646 28125456
0.2053 15.6530 33200 0.4646 28295136
0.3154 15.7473 33400 0.4646 28462640
0.3329 15.8416 33600 0.4646 28631360
0.706 15.9359 33800 0.4646 28799488
0.9518 16.0302 34000 0.4646 28964832
0.5958 16.1245 34200 0.4646 29137792
0.7452 16.2188 34400 0.4646 29306192
0.574 16.3131 34600 0.4646 29481760
0.6025 16.4074 34800 0.4646 29654256
0.9419 16.5017 35000 0.4646 29821840
0.3235 16.5959 35200 0.4646 29992016
0.42 16.6902 35400 0.4646 30157952
0.0703 16.7845 35600 0.4646 30329792
0.7282 16.8788 35800 0.4646 30500240
0.5623 16.9731 36000 0.4646 30668944
0.8665 17.0674 36200 0.4646 30840688
0.447 17.1617 36400 0.4646 31012176
0.8911 17.2560 36600 0.4646 31184160
0.3361 17.3503 36800 0.4646 31359648
0.4889 17.4446 37000 0.4646 31529872
0.3732 17.5389 37200 0.4646 31699104
1.0273 17.6332 37400 0.4646 31870016
0.4019 17.7275 37600 0.4646 32036672
0.3667 17.8218 37800 0.4646 32206048
0.2637 17.9161 38000 0.4646 32377104
0.3267 18.0104 38200 0.4646 32548208
0.4919 18.1047 38400 0.4646 32716560
0.5915 18.1990 38600 0.4646 32885504
0.8913 18.2933 38800 0.4646 33056256
0.4376 18.3876 39000 0.4646 33225648
0.5159 18.4818 39200 0.4646 33393952
0.5483 18.5761 39400 0.4646 33564304
0.4287 18.6704 39600 0.4646 33735024
0.2049 18.7647 39800 0.4646 33907088
0.2385 18.8590 40000 0.4646 34078960
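
The best validation loss in this log, 0.4582 at step 9600, matches the figure reported at the top of the card. A small sketch for recovering this from trainer_state.json, the log file the Hugging Face Trainer writes alongside checkpoints (its presence in this repo is an assumption):

```python
import json

# Read the eval-loss history logged by the Trainer.
with open("trainer_state.json") as f:
    state = json.load(f)

evals = [e for e in state["log_history"] if "eval_loss" in e]
best = min(evals, key=lambda e: e["eval_loss"])
print(f"best eval_loss {best['eval_loss']:.4f} at step {best['step']}")
# With the log above, this reports 0.4582 at step 9600.
```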

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1