train_openbookqa_1755694507

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7263
  • Num Input Tokens Seen: 3935016

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
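The learning-rate schedule implied by these settings can be sketched in plain Python. This is a minimal sketch, assuming the standard linear-warmup + cosine-decay shape and a total of 22320 optimizer steps (2232 steps per epoch × 10 epochs, inferred from the results table below):

```python
import math

# Values taken from the hyperparameters above; TOTAL_STEPS is an assumption
# derived from the training-results table (2232 steps per epoch x 10 epochs).
LEARNING_RATE = 5e-5
WARMUP_RATIO = 0.1
TOTAL_STEPS = 22320
WARMUP_STEPS = int(TOTAL_STEPS * WARMUP_RATIO)  # 2232 steps of linear warmup

def lr_at(step: int) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))             # 0.0 at the first step
print(lr_at(WARMUP_STEPS))  # peak of 5e-05 at the end of warmup
print(lr_at(TOTAL_STEPS))   # decays to ~0 by the final step
```

With lr_scheduler_warmup_ratio set to 0.1, roughly the first half of epoch 1 is spent warming up before the cosine decay begins.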

Training results

Training Loss   Epoch    Step    Validation Loss   Input Tokens Seen
0.7046          0.5002    1116   0.7015              196496
0.7045          1.0004    2232   0.7008              393464
0.6817          1.5007    3348   0.7119              589992
0.6974          2.0009    4464   0.6949              787056
0.6984          2.5011    5580   0.6978              984096
0.6421          3.0013    6696   0.7007             1180920
0.6968          3.5016    7812   0.6950             1378312
0.6728          4.0018    8928   0.6948             1574976
0.6908          4.5020   10044   0.9289             1772096
0.6442          5.0022   11160   0.6616             1969288
0.5868          5.5025   12276   0.6543             2165240
0.6737          6.0027   13392   0.5839             2362584
0.4501          6.5029   14508   0.5840             2558168
0.5469          7.0031   15624   0.5781             2756072
0.5315          7.5034   16740   0.6050             2952520
0.4052          8.0036   17856   0.5918             3149560
0.9231          8.5038   18972   0.6392             3347080
0.1328          9.0040   20088   0.6744             3543488
0.7252          9.5043   21204   0.7036             3741120
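As a sanity check, the per-step token throughput can be derived from the figures above. This assumes 2232 optimizer steps per epoch (from the Step column) over 10 epochs:

```python
# Figures reported above; total_steps (2232 x 10) is inferred from the table.
total_tokens = 3935016    # "Num Input Tokens Seen"
steps_per_epoch = 2232
num_epochs = 10
total_steps = steps_per_epoch * num_epochs

tokens_per_step = total_tokens / total_steps
tokens_per_example = tokens_per_step / 2  # train_batch_size = 2

print(f"{tokens_per_step:.1f} tokens/step")  # ~176 tokens per optimizer step
```

That works out to roughly 88 input tokens per example, which is plausible for short multiple-choice prompts like OpenBookQA.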

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1