train_openbookqa_123_1760637687

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2438
  • Num Input Tokens Seen: 8496984
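Since this is a PEFT adapter rather than a full model, it has to be attached to the base model at load time. A minimal inference sketch, assuming the adapter is published under the repo id used elsewhere on this card and that you have access to the gated Meta-Llama-3-8B-Instruct weights (the example prompt is hypothetical, in the multiple-choice style of openbookqa):

```python
# Minimal sketch: load the LoRA adapter on top of the base model with PEFT.
# Requires access to the gated base weights and GPU-scale resources;
# the prompt below is an illustrative openbookqa-style question, not
# taken from the dataset.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_openbookqa_123_1760637687"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach adapter weights

prompt = ("Which of these conducts electricity best? "
          "(A) wood (B) copper (C) glass (D) rubber")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=16)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```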

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
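The schedule above warms the learning rate up linearly over the first 10% of training, then decays it along a cosine curve. With 1,116 optimizer steps per epoch for 20 epochs (22,320 total, matching the results table), that means 2,232 warmup steps. A pure-Python sketch of the resulting curve, assuming the standard Transformers cosine-with-warmup shape:

```python
import math

# Sketch of cosine decay with linear warmup (assumed to match
# Transformers' get_cosine_schedule_with_warmup behavior).
# 22,320 total steps x warmup_ratio 0.1 = 2,232 warmup steps.
PEAK_LR = 5e-5
TOTAL_STEPS = 22320
WARMUP_STEPS = int(TOTAL_STEPS * 0.1)  # 2232

def lr_at(step: int) -> float:
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(WARMUP_STEPS)         # 2232
print(lr_at(WARMUP_STEPS))  # 5e-05 (peak, reached at the end of warmup)
```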

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|--------------:|------:|------:|----------------:|------------------:|
| 0.1819        | 1.0   | 1116  | 0.4116          | 424840            |
| 0.264         | 2.0   | 2232  | 0.3184          | 849600            |
| 0.2676        | 3.0   | 3348  | 0.2911          | 1274392           |
| 0.1714        | 4.0   | 4464  | 0.2719          | 1699872           |
| 0.3075        | 5.0   | 5580  | 0.2650          | 2125480           |
| 0.4522        | 6.0   | 6696  | 0.2579          | 2551008           |
| 0.029         | 7.0   | 7812  | 0.2514          | 2975440           |
| 0.2584        | 8.0   | 8928  | 0.2474          | 3400064           |
| 0.0924        | 9.0   | 10044 | 0.2464          | 3825032           |
| 0.8255        | 10.0  | 11160 | 0.2451          | 4249648           |
| 0.0929        | 11.0  | 12276 | 0.2449          | 4673640           |
| 0.3646        | 12.0  | 13392 | 0.2470          | 5097992           |
| 0.1922        | 13.0  | 14508 | 0.2465          | 5522784           |
| 0.238         | 14.0  | 15624 | 0.2456          | 5948008           |
| 0.3382        | 15.0  | 16740 | 0.2457          | 6372656           |
| 0.0967        | 16.0  | 17856 | 0.2438          | 6797720           |
| 0.1922        | 17.0  | 18972 | 0.2469          | 7222896           |
| 0.025         | 18.0  | 20088 | 0.2474          | 7646872           |
| 0.2137        | 19.0  | 21204 | 0.2476          | 8071488           |
| 0.2164        | 20.0  | 22320 | 0.2472          | 8496984           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Model tree for rbelanec/train_openbookqa_123_1760637687

This model is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct.