train_openbookqa_123_1760637685

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1698
  • Num Input Tokens Seen: 8496984

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
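The learning-rate schedule implied by these settings (base LR 5e-05, cosine decay, 10% linear warmup) can be sketched as follows. This is a minimal illustration, not the exact scheduler implementation; the step counts are taken from the training-results table below (1116 optimizer steps per epoch × 20 epochs = 22320 total steps).

```python
import math

# Hyperparameters from the model card; TOTAL_STEPS is read off the
# training-results table (step 22320 at epoch 20.0).
BASE_LR = 5e-05
TOTAL_STEPS = 22320
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # lr_scheduler_warmup_ratio: 0.1 -> 2232 steps

def lr_at(step: int) -> float:
    """LR after `step` optimizer steps: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))             # 0.0 at the first step
print(lr_at(WARMUP_STEPS))  # peak LR 5e-05 at the end of warmup
```

The rate climbs linearly to the peak over the first ~2232 steps (roughly the first two epochs), then follows a half-cosine down to zero at step 22320.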

Training results

Training Loss   Epoch   Step    Validation Loss   Input Tokens Seen
0.1254          1.0     1116    0.1698            424840
0.1343          2.0     2232    0.2142            849600
0.0788          3.0     3348    0.2310            1274392
0.0002          4.0     4464    0.3704            1699872
0.0             5.0     5580    0.3948            2125480
0.0             6.0     6696    0.5229            2551008
0.0001          7.0     7812    0.4940            2975440
0.0             8.0     8928    0.5870            3400064
0.0             9.0     10044   0.6079            3825032
0.0             10.0    11160   0.5292            4249648
0.0             11.0    12276   0.4422            4673640
0.0             12.0    13392   0.5514            5097992
0.0             13.0    14508   0.6270            5522784
0.0             14.0    15624   0.6641            5948008
0.0             15.0    16740   0.6882            6372656
0.0             16.0    17856   0.7049            6797720
0.0             17.0    18972   0.7146            7222896
0.0             18.0    20088   0.7345            7646872
0.0             19.0    21204   0.7327            8071488
0.0             20.0    22320   0.7492            8496984
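In the table above the training loss collapses to ~0 by epoch 4 while the validation loss rises steadily after epoch 1, so the reported evaluation loss of 0.1698 corresponds to the epoch-1 checkpoint. A small sketch that selects the best checkpoint by validation loss, with the (epoch, loss) pairs copied from the table:

```python
# Validation loss per epoch, copied from the training-results table.
val_loss = {
    1: 0.1698, 2: 0.2142, 3: 0.2310, 4: 0.3704, 5: 0.3948,
    6: 0.5229, 7: 0.4940, 8: 0.5870, 9: 0.6079, 10: 0.5292,
    11: 0.4422, 12: 0.5514, 13: 0.6270, 14: 0.6641, 15: 0.6882,
    16: 0.7049, 17: 0.7146, 18: 0.7345, 19: 0.7327, 20: 0.7492,
}

# Pick the epoch with the lowest validation loss.
best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])  # 1 0.1698 -- matches the reported eval loss
```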

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4