train_openbookqa_789_1760637913

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 2.8641
  • Num Input Tokens Seen: 8499784

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
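
The cosine schedule with a 0.1 warmup ratio listed above can be sketched as follows. This is a minimal stand-in for the scheduler behavior (linear warmup to the peak learning rate, then cosine decay to zero), assuming a total step count of 22,320 taken from the training-results table; it is illustrative, not the exact Transformers implementation.

```python
import math

LEARNING_RATE = 0.001
TOTAL_STEPS = 22320                     # 20 epochs x 1116 steps per epoch (see table below)
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # lr_scheduler_warmup_ratio: 0.1

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, `lr_at(WARMUP_STEPS)` returns the peak value of 0.001, and the rate falls back toward zero as the step count approaches 22,320.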

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.6713        | 1.0   | 1116  | 0.6946          | 425448            |
| 0.7055        | 2.0   | 2232  | 0.6944          | 849792            |
| 0.7899        | 3.0   | 3348  | 0.6607          | 1274096           |
| 0.4958        | 4.0   | 4464  | 0.5607          | 1699224           |
| 0.4422        | 5.0   | 5580  | 0.5507          | 2123824           |
| 0.4713        | 6.0   | 6696  | 0.5406          | 2548008           |
| 0.4467        | 7.0   | 7812  | 0.5164          | 2972312           |
| 0.5415        | 8.0   | 8928  | 0.5217          | 3396920           |
| 0.485         | 9.0   | 10044 | 0.5183          | 3822320           |
| 0.5508        | 10.0  | 11160 | 0.5251          | 4247968           |
| 0.5128        | 11.0  | 12276 | 0.5607          | 4672584           |
| 0.3743        | 12.0  | 13392 | 0.5557          | 5097552           |
| 0.197         | 13.0  | 14508 | 0.5707          | 5522440           |
| 0.4676        | 14.0  | 15624 | 0.5918          | 5947656           |
| 0.4764        | 15.0  | 16740 | 0.6311          | 6373520           |
| 0.2852        | 16.0  | 17856 | 0.6932          | 6798816           |
| 0.3744        | 17.0  | 18972 | 0.7357          | 7223992           |
| 0.4521        | 18.0  | 20088 | 0.7638          | 7649768           |
| 0.5299        | 19.0  | 21204 | 0.8047          | 8074496           |
| 0.2345        | 20.0  | 22320 | 0.8125          | 8499784           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4