train_openbookqa_456_1760637799

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set (a loading and inference sketch follows the results below):

  • Loss: 0.1821
  • Num Input Tokens Seen: 8491296
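
Since this is a PEFT adapter rather than a full model, it has to be loaded on top of the base model. Below is a minimal sketch, assuming the adapter lives at rbelanec/train_openbookqa_456_1760637799 (the repo this card belongs to); the question text is an illustrative OpenBookQA-style prompt, and the exact prompt template used during training is not documented on this card:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_openbookqa_456_1760637799"  # this repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter

# Illustrative multiple-choice question; the training prompt format is an assumption.
question = (
    "Which of these would let the most heat travel through?\n"
    "(A) a new pair of jeans (B) a steel spoon in a cafeteria\n"
    "(C) a cotton candy at a store (D) a calvin klein cotton hat\n"
    "Answer:"
)
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": question}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
output = model.generate(inputs, max_new_tokens=8)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```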

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
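
For reference, here is a sketch of how these values map onto transformers' TrainingArguments. The output_dir is a placeholder; the LoRA/PEFT configuration and dataset preprocessing are not specified on this card, so they are omitted:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_openbookqa_456_1760637799",  # placeholder name
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,       # 10% of total steps used for warmup
    num_train_epochs=20,
)
```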

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.2752        | 1.0   | 1116  | 0.1821          | 424016            |
| 0.0268        | 2.0   | 2232  | 0.1870          | 848912            |
| 0.3734        | 3.0   | 3348  | 0.2193          | 1273224           |
| 0.0005        | 4.0   | 4464  | 0.2685          | 1698344           |
| 0.0002        | 5.0   | 5580  | 0.2791          | 2122168           |
| 0.0001        | 6.0   | 6696  | 0.3841          | 2547696           |
| 0.0           | 7.0   | 7812  | 0.5529          | 2971624           |
| 0.0           | 8.0   | 8928  | 0.4719          | 3395952           |
| 0.0002        | 9.0   | 10044 | 0.3130          | 3821456           |
| 0.0           | 10.0  | 11160 | 0.5053          | 4245816           |
| 0.0           | 11.0  | 12276 | 0.6839          | 4670528           |
| 0.0           | 12.0  | 13392 | 0.5244          | 5093960           |
| 0.0           | 13.0  | 14508 | 0.5028          | 5519312           |
| 0.0           | 14.0  | 15624 | 0.5612          | 5944728           |
| 0.0           | 15.0  | 16740 | 0.6089          | 6369088           |
| 0.0           | 16.0  | 17856 | 0.6385          | 6792536           |
| 0.0           | 17.0  | 18972 | 0.6398          | 7217384           |
| 0.0           | 18.0  | 20088 | 0.6500          | 7642368           |
| 0.0           | 19.0  | 21204 | 0.6589          | 8066544           |
| 0.0           | 20.0  | 22320 | 0.6591          | 8491296           |

Validation loss reaches its minimum at epoch 1 (0.1821, the value reported above) and climbs steadily afterwards while training loss collapses toward zero, a classic overfitting pattern; the reported result therefore appears to correspond to the epoch-1 checkpoint rather than the final one. A sketch of keeping the best checkpoint automatically follows.
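
The following flags are standard TrainingArguments options for retaining the best checkpoint, not confirmed settings from this run:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_openbookqa_456_1760637799",  # placeholder name
    eval_strategy="epoch",              # evaluate once per epoch, as in the table above
    save_strategy="epoch",              # must match eval_strategy for best-model tracking
    load_best_model_at_end=True,        # reload the lowest-validation-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```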

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4