train_openbookqa_1756729615

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1681
  • Num Input Tokens Seen: 3935016

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
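The cosine schedule with a 0.1 warmup ratio can be sketched as below. This is a minimal re-implementation, not the exact Transformers scheduler; the total step count of 22320 is an assumption inferred from the results table (2232 steps per epoch × 10 epochs).

```python
import math

def lr_at(step, total_steps=22320, base_lr=5e-5, warmup_ratio=0.1):
    """Linear warmup followed by cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over the first 10% of steps.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under this sketch the learning rate peaks at 5e-05 around step 2232 (end of epoch 1) and decays to zero by the final step.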

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:-----:|:---------------:|:-----------------:|
| 0.733         | 0.5002 | 1116  | 0.7069          | 196496            |
| 0.7148        | 1.0004 | 2232  | 0.6999          | 393464            |
| 0.6822        | 1.5007 | 3348  | 0.7518          | 589992            |
| 0.5996        | 2.0009 | 4464  | 0.6226          | 787056            |
| 0.5202        | 2.5011 | 5580  | 0.6730          | 984096            |
| 0.5372        | 3.0013 | 6696  | 0.6052          | 1180920           |
| 0.584         | 3.5016 | 7812  | 0.6233          | 1378312           |
| 0.5433        | 4.0018 | 8928  | 0.6500          | 1574976           |
| 0.7427        | 4.5020 | 10044 | 0.5413          | 1772096           |
| 0.2283        | 5.0022 | 11160 | 0.6870          | 1969288           |
| 0.4506        | 5.5025 | 12276 | 0.7398          | 2165240           |
| 0.877         | 6.0027 | 13392 | 0.6110          | 2362584           |
| 0.3696        | 6.5029 | 14508 | 0.7791          | 2558168           |
| 0.5637        | 7.0031 | 15624 | 0.8726          | 2756072           |
| 0.62          | 7.5034 | 16740 | 0.9139          | 2952520           |
| 0.0061        | 8.0036 | 17856 | 0.8685          | 3149560           |
| 0.7385        | 8.5038 | 18972 | 1.0582          | 3347080           |
| 0.0037        | 9.0040 | 20088 | 1.0473          | 3543488           |
| 0.0546        | 9.5043 | 21204 | 1.1628          | 3741120           |
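Validation loss bottoms out at 0.5413 around epoch 4.5 and rises afterwards, a common sign of overfitting under a fixed 10-epoch budget. If checkpoints were saved at each eval step, one way to pick the best one is simply to minimize validation loss over the logged results; the snippet below does that over the (epoch, step, val_loss) rows of the table above.

```python
# (epoch, step, validation_loss) rows copied from the training results table.
results = [
    (0.5002, 1116, 0.7069),  (1.0004, 2232, 0.6999),
    (1.5007, 3348, 0.7518),  (2.0009, 4464, 0.6226),
    (2.5011, 5580, 0.6730),  (3.0013, 6696, 0.6052),
    (3.5016, 7812, 0.6233),  (4.0018, 8928, 0.6500),
    (4.5020, 10044, 0.5413), (5.0022, 11160, 0.6870),
    (5.5025, 12276, 0.7398), (6.0027, 13392, 0.6110),
    (6.5029, 14508, 0.7791), (7.0031, 15624, 0.8726),
    (7.5034, 16740, 0.9139), (8.0036, 17856, 0.8685),
    (8.5038, 18972, 1.0582), (9.0040, 20088, 1.0473),
    (9.5043, 21204, 1.1628),
]

# Checkpoint with the lowest validation loss.
best_epoch, best_step, best_loss = min(results, key=lambda r: r[2])
```

This selects step 10044 (epoch 4.5020, loss 0.5413); note the final reported loss of 1.1681 reflects the end of training, not this best checkpoint.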

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
