train_openbookqa_789_1760637915

This model is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5495
  • Num Input Tokens Seen: 8499784

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
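The interaction of `lr_scheduler_type: cosine` and `lr_scheduler_warmup_ratio: 0.1` can be sketched with a minimal re-implementation (the actual run would have used the Transformers scheduler; the step counts below are taken from the training results, and the function name is illustrative):

```python
import math

def lr_at_step(step, total_steps=22320, warmup_ratio=0.1, peak_lr=5e-5):
    """Linear warmup followed by cosine decay to zero, mirroring the
    usual behavior of a cosine schedule with warmup in Transformers."""
    warmup_steps = int(total_steps * warmup_ratio)  # 2232 steps for this run
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(f"{lr_at_step(2232):.1e}")   # → 5.0e-05 (peak LR at the end of warmup)
print(f"{lr_at_step(22320):.1e}")  # → 0.0e+00 (decayed to zero at the final step)
```

With 22320 total optimizer steps, the 0.1 warmup ratio means the learning rate climbs for the first 2232 steps (roughly the first two epochs) before the cosine decay begins.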

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.9688        | 1.0   | 1116  | 0.5587          | 425448            |
| 0.435         | 2.0   | 2232  | 0.5571          | 849792            |
| 0.168         | 3.0   | 3348  | 0.5513          | 1274096           |
| 0.1365        | 4.0   | 4464  | 0.5498          | 1699224           |
| 0.3142        | 5.0   | 5580  | 0.5508          | 2123824           |
| 0.7252        | 6.0   | 6696  | 0.5502          | 2548008           |
| 0.5191        | 7.0   | 7812  | 0.5496          | 2972312           |
| 1.3642        | 8.0   | 8928  | 0.5503          | 3396920           |
| 0.704         | 9.0   | 10044 | 0.5495          | 3822320           |
| 0.0703        | 10.0  | 11160 | 0.5497          | 4247968           |
| 0.2012        | 11.0  | 12276 | 0.5533          | 4672584           |
| 0.3404        | 12.0  | 13392 | 0.5516          | 5097552           |
| 0.6072        | 13.0  | 14508 | 0.5498          | 5522440           |
| 0.8546        | 14.0  | 15624 | 0.5522          | 5947656           |
| 0.5271        | 15.0  | 16740 | 0.5497          | 6373520           |
| 1.0104        | 16.0  | 17856 | 0.5496          | 6798816           |
| 0.0885        | 17.0  | 18972 | 0.5497          | 7223992           |
| 0.4305        | 18.0  | 20088 | 0.5537          | 7649768           |
| 0.2571        | 19.0  | 21204 | 0.5516          | 8074496           |
| 0.2121        | 20.0  | 22320 | 0.5516          | 8499784           |
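Validation loss bottoms out at epoch 9 (0.5495), which matches the reported evaluation loss and suggests the best checkpoint was kept. Picking the best epoch from the logged history can be sketched as follows (the (epoch, validation loss) pairs are copied from the table above; the variable names are illustrative):

```python
# (epoch, validation loss) pairs copied from the training results above.
history = [
    (1, 0.5587), (2, 0.5571), (3, 0.5513), (4, 0.5498), (5, 0.5508),
    (6, 0.5502), (7, 0.5496), (8, 0.5503), (9, 0.5495), (10, 0.5497),
    (11, 0.5533), (12, 0.5516), (13, 0.5498), (14, 0.5522), (15, 0.5497),
    (16, 0.5496), (17, 0.5497), (18, 0.5537), (19, 0.5516), (20, 0.5516),
]

# Select the epoch with the lowest validation loss.
best_epoch, best_loss = min(history, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # → 9 0.5495
```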

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
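Since this is a PEFT adapter, it is loaded on top of the base model rather than standalone. A minimal usage sketch, assuming access to the gated Meta-Llama-3 base weights (untested here; device placement and dtype are left at library defaults):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Load the base model, then attach this adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, "rbelanec/train_openbookqa_789_1760637915")

# The tokenizer is the base model's tokenizer.
tokenizer = AutoTokenizer.from_pretrained(base_id)
```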