train_openbookqa_789_1760637914

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1975
  • Num Input Tokens Seen: 8499784

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
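
With a cosine schedule and a warmup ratio of 0.1, the learning rate rises linearly over the first 10% of training steps and then decays along a half-cosine to zero. A minimal sketch of that schedule in plain Python (mirroring the usual warmup-then-cosine behavior; the step count of 22320 is taken from the results table below, everything else from the hyperparameters above):

```python
import math

def cosine_lr_with_warmup(step, total_steps, base_lr=5e-5, warmup_ratio=0.1):
    """Linear warmup for the first warmup_ratio of steps, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 22320  # 20 epochs x 1116 steps per epoch
print(cosine_lr_with_warmup(0, total))      # 0.0 at the start of warmup
print(cosine_lr_with_warmup(2232, total))   # 5e-05 at the peak (end of warmup)
```

The peak learning rate (5e-05) is reached exactly at step 2232, i.e. the end of the first epoch, and the rate returns to zero at the final step.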

Training results

Training Loss  Epoch  Step   Validation Loss  Input Tokens Seen
0.5986         1.0    1116   0.2197           425448
0.0952         2.0    2232   0.1975           849792
0.0028         3.0    3348   0.2483           1274096
0.002          4.0    4464   0.2404           1699224
0.0001         5.0    5580   0.4181           2123824
0.0003         6.0    6696   0.3511           2548008
0.0002         7.0    7812   0.3808           2972312
0.0            8.0    8928   0.5220           3396920
0.0023         9.0    10044  0.4886           3822320
0.0            10.0   11160  0.6215           4247968
0.0            11.0   12276  0.4707           4672584
0.0            12.0   13392  0.5171           5097552
0.0            13.0   14508  0.5367           5522440
0.0            14.0   15624  0.5588           5947656
0.0            15.0   16740  0.5737           6373520
0.0            16.0   17856  0.5839           6798816
0.0            17.0   18972  0.5960           7223992
0.0            18.0   20088  0.6117           7649768
0.0            19.0   21204  0.6109           8074496
0.0            20.0   22320  0.6127           8499784
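
Validation loss bottoms out at epoch 2 (0.1975, matching the reported evaluation loss) and climbs thereafter while training loss collapses to zero, the usual overfitting pattern. Selecting the best checkpoint from the table amounts to taking the epoch with the minimum validation loss, as in this small sketch (epoch/loss pairs copied from the first ten rows above):

```python
# (epoch, validation_loss) pairs from the results table above
history = [
    (1, 0.2197), (2, 0.1975), (3, 0.2483), (4, 0.2404), (5, 0.4181),
    (6, 0.3511), (7, 0.3808), (8, 0.5220), (9, 0.4886), (10, 0.6215),
]

# Manual early stopping: keep the checkpoint with the lowest validation loss
best_epoch, best_loss = min(history, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # 2 0.1975
```

In practice the same selection is what `load_best_model_at_end`-style options automate during training.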

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4