train_openbookqa_1754507498

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2460
  • Num Input Tokens Seen: 4204168

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
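As a rough illustration of how the cosine schedule with a 0.1 warmup ratio shapes the learning rate over the 11,160 training steps shown below, here is a minimal pure-Python sketch (the function name and the closed-form schedule are illustrative; the actual run used the scheduler built into the Trainer):

```python
import math

def lr_at_step(step, total_steps=11160, peak_lr=5e-05, warmup_ratio=0.1):
    """Cosine decay with linear warmup, mirroring a standard
    warmup+cosine schedule: ramp linearly to peak_lr over the first
    warmup_ratio fraction of steps, then decay to 0 along a half cosine."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the rate peaks at 5e-05 around step 1,116 (10% of training) and decays to roughly zero by the final step.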

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.3495        | 0.5   | 558   | 0.4794          | 210048            |
| 0.2307        | 1.0   | 1116  | 0.2934          | 420520            |
| 0.2229        | 1.5   | 1674  | 0.2863          | 630888            |
| 0.1181        | 2.0   | 2232  | 0.3010          | 841024            |
| 0.0775        | 2.5   | 2790  | 0.3032          | 1051168           |
| 0.3557        | 3.0   | 3348  | 0.2460          | 1261304           |
| 0.1723        | 3.5   | 3906  | 0.2663          | 1472152           |
| 0.0172        | 4.0   | 4464  | 0.2600          | 1682016           |
| 0.3535        | 4.5   | 5022  | 0.2604          | 1892160           |
| 0.0257        | 5.0   | 5580  | 0.2711          | 2102920           |
| 0.2185        | 5.5   | 6138  | 0.2833          | 2311976           |
| 0.1686        | 6.0   | 6696  | 0.2713          | 2523672           |
| 0.3082        | 6.5   | 7254  | 0.2828          | 2732440           |
| 0.002         | 7.0   | 7812  | 0.2848          | 2943688           |
| 0.2526        | 7.5   | 8370  | 0.2857          | 3153640           |
| 0.3551        | 8.0   | 8928  | 0.2850          | 3363864           |
| 0.3767        | 8.5   | 9486  | 0.2826          | 3574616           |
| 0.0792        | 9.0   | 10044 | 0.2863          | 3783840           |
| 0.0945        | 9.5   | 10602 | 0.2853          | 3994976           |
| 0.9347        | 10.0  | 11160 | 0.2851          | 4204168           |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
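Given the framework versions above, a minimal sketch for attaching this PEFT adapter to the base model might look like the following. The adapter repo id `rbelanec/train_openbookqa_1754507498` is taken from this card; the helper function name is illustrative, and the imports are deferred so the sketch reads without `transformers`/`peft` installed:

```python
# Assumed identifiers: base model from this card, adapter repo id from the card's title.
BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER = "rbelanec/train_openbookqa_1754507498"

def load_adapter_model():
    """Load the base model and attach the LoRA adapter (illustrative helper)."""
    # Deferred imports: only needed when actually loading the weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
    model = PeftModel.from_pretrained(base, ADAPTER)  # attaches adapter weights
    return tokenizer, model

if __name__ == "__main__":
    # Downloads ~16 GB of base weights; requires access to the gated Llama-3 repo.
    tokenizer, model = load_adapter_model()
```

Note that the base Llama-3 repository is gated, so an authenticated Hugging Face token with access is required before the weights can be downloaded.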