train_openbookqa_456_1760637798

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5575
  • Num Input Tokens Seen: 8491296

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
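The cosine schedule with a 0.1 warmup ratio means the learning rate ramps linearly from 0 to 0.001 over the first 10% of steps, then decays along a cosine curve to 0. A minimal sketch of this behavior (the step counts are taken from the results table below: 22,320 total steps; this is an illustration, not the exact Transformers implementation):

```python
import math

def lr_at(step, total_steps=22320, base_lr=1e-3, warmup_ratio=0.1):
    """Cosine decay with linear warmup, as configured above (sketch)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over the first 10% of training.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))      # start of warmup: 0.0
print(lr_at(2232))   # peak after warmup: 0.001
print(lr_at(22320))  # fully decayed: ~0.0
```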

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.6949        | 1.0   | 1116  | 0.6979          | 424016            |
| 0.6875        | 2.0   | 2232  | 0.6954          | 848912            |
| 0.675         | 3.0   | 3348  | 0.6899          | 1273224           |
| 0.6944        | 4.0   | 4464  | 0.6573          | 1698344           |
| 0.6382        | 5.0   | 5580  | 0.6229          | 2122168           |
| 0.5662        | 6.0   | 6696  | 0.5699          | 2547696           |
| 0.4895        | 7.0   | 7812  | 0.5629          | 2971624           |
| 0.5742        | 8.0   | 8928  | 0.5518          | 3395952           |
| 0.5811        | 9.0   | 10044 | 0.5386          | 3821456           |
| 0.4552        | 10.0  | 11160 | 0.5187          | 4245816           |
| 0.4953        | 11.0  | 12276 | 0.5347          | 4670528           |
| 0.3379        | 12.0  | 13392 | 0.5403          | 5093960           |
| 0.4609        | 13.0  | 14508 | 0.5208          | 5519312           |
| 0.5553        | 14.0  | 15624 | 0.5298          | 5944728           |
| 0.348         | 15.0  | 16740 | 0.5542          | 6369088           |
| 0.4613        | 16.0  | 17856 | 0.5514          | 6792536           |
| 0.4023        | 17.0  | 18972 | 0.5767          | 7217384           |
| 0.4078        | 18.0  | 20088 | 0.5874          | 7642368           |
| 0.4064        | 19.0  | 21204 | 0.5990          | 8066544           |
| 0.6209        | 20.0  | 22320 | 0.5977          | 8491296           |
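Validation loss in the table bottoms out at epoch 10 and drifts upward afterwards, which suggests overfitting in the later epochs. A quick sketch to locate that minimum from the table's validation-loss column:

```python
# Validation losses per epoch, copied from the results table above.
val_loss = [0.6979, 0.6954, 0.6899, 0.6573, 0.6229, 0.5699, 0.5629,
            0.5518, 0.5386, 0.5187, 0.5347, 0.5403, 0.5208, 0.5298,
            0.5542, 0.5514, 0.5767, 0.5874, 0.5990, 0.5977]

# Epochs are 1-indexed, so shift the argmin by one.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print(best_epoch, val_loss[best_epoch - 1])  # → 10 0.5187
```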

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
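The card does not include usage instructions, but since this is a PEFT adapter on a gated base model, a typical (unverified) way to load it with the versions above would be:

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Downloads the adapter and applies it on top of the base model
# (meta-llama/Meta-Llama-3-8B-Instruct); requires access to the
# gated base weights on the Hugging Face Hub.
model = AutoPeftModelForCausalLM.from_pretrained(
    "rbelanec/train_openbookqa_456_1760637798",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct"
)
```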
Model tree for rbelanec/train_openbookqa_456_1760637798

Adapter
(2105)
this model

Evaluation results