train_openbookqa_456_1760637800

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4463
  • Num Input Tokens Seen: 8491296

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
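With `lr_scheduler_type: cosine` and `lr_scheduler_warmup_ratio: 0.1`, the per-step learning rate follows a linear warmup into a cosine decay. A minimal pure-Python sketch of that schedule (the 22,320 total steps come from the training log below, i.e. 20 epochs × 1,116 steps; this mirrors the shape of `transformers`' cosine-with-warmup scheduler, not the exact library code):

```python
import math

def cosine_lr_with_warmup(step, total_steps=22320, warmup_ratio=0.1, base_lr=5e-5):
    """Linear warmup for the first warmup_ratio of steps, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # ramp linearly from 0 up to base_lr
        return base_lr * step / max(1, warmup_steps)
    # cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, the learning rate peaks at 5e-05 when warmup ends (step 2,232) and decays to ~0 by the final step.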

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.5578        | 1.0   | 1116  | 0.4590          | 424016            |
| 0.4861        | 2.0   | 2232  | 0.4513          | 848912            |
| 0.8579        | 3.0   | 3348  | 0.4525          | 1273224           |
| 0.4859        | 4.0   | 4464  | 0.4496          | 1698344           |
| 0.5608        | 5.0   | 5580  | 0.4504          | 2122168           |
| 0.3947        | 6.0   | 6696  | 0.4495          | 2547696           |
| 0.6609        | 7.0   | 7812  | 0.4505          | 2971624           |
| 0.8257        | 8.0   | 8928  | 0.4511          | 3395952           |
| 0.8134        | 9.0   | 10044 | 0.4520          | 3821456           |
| 0.3343        | 10.0  | 11160 | 0.4507          | 4245816           |
| 0.1934        | 11.0  | 12276 | 0.4511          | 4670528           |
| 0.8753        | 12.0  | 13392 | 0.4468          | 5093960           |
| 0.6569        | 13.0  | 14508 | 0.4491          | 5519312           |
| 0.7334        | 14.0  | 15624 | 0.4489          | 5944728           |
| 0.526         | 15.0  | 16740 | 0.4508          | 6369088           |
| 0.638         | 16.0  | 17856 | 0.4495          | 6792536           |
| 0.5613        | 17.0  | 18972 | 0.4485          | 7217384           |
| 0.712         | 18.0  | 20088 | 0.4463          | 7642368           |
| 0.7204        | 19.0  | 21204 | 0.4488          | 8066544           |
| 0.2551        | 20.0  | 22320 | 0.4520          | 8491296           |
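A quick sanity check on the log above, using only numbers reported in this card (the tokens-per-example figure assumes the input-token counter is cumulative across all optimizer steps and that tokens are counted after batching):

```python
# All constants below are taken directly from the table and hyperparameters above.
total_steps = 22320
epochs = 20
tokens_seen = 8491296
train_batch_size = 4

steps_per_epoch = total_steps // epochs                      # matches the per-epoch step column (1116)
avg_tokens_per_step = tokens_seen / total_steps              # average tokens per optimizer step
avg_tokens_per_example = avg_tokens_per_step / train_batch_size

print(steps_per_epoch, round(avg_tokens_per_step, 1), round(avg_tokens_per_example, 1))
```

This works out to roughly 380 tokens per step, i.e. on the order of 95 tokens per example at batch size 4, which is consistent with openbookqa's short multiple-choice prompts.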

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4