train_openbookqa_42_1760637567

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6899
  • Num Input Tokens Seen: 8500696
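If the reported loss is a mean cross-entropy in nats (the transformers Trainer default for causal LM evaluation), it maps to a perplexity of roughly 2, as a quick sanity check shows. This is an assumption about the loss definition, not stated in the card:

```python
import math

eval_loss = 0.6899  # reported validation loss at the best checkpoint

# Perplexity is exp(mean cross-entropy loss in nats).
perplexity = math.exp(eval_loss)
print(round(perplexity, 4))  # ~1.9935
```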

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
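The cosine schedule with linear warmup implied by these settings can be sketched in plain Python. This is an illustrative reimplementation (mirroring the behavior of transformers' `get_cosine_schedule_with_warmup`), not the training code itself; the step counts are taken from the results table below (1116 steps/epoch × 20 epochs):

```python
import math

PEAK_LR = 0.03               # learning_rate
TOTAL_STEPS = 22320          # 1116 steps/epoch * 20 epochs
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # lr_scheduler_warmup_ratio = 0.1 -> 2232

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))             # 0.0
print(lr_at(WARMUP_STEPS))  # 0.03 (peak, reached at end of warmup)
print(round(lr_at(TOTAL_STEPS), 6))  # 0.0 (fully decayed)
```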

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.736         | 1.0   | 1116  | 0.7030          | 425624            |
| 0.7591        | 2.0   | 2232  | 0.7114          | 851112            |
| 0.7271        | 3.0   | 3348  | 0.6941          | 1276656           |
| 0.6952        | 4.0   | 4464  | 0.6906          | 1701464           |
| 0.7029        | 5.0   | 5580  | 0.6903          | 2126656           |
| 0.7276        | 6.0   | 6696  | 0.6975          | 2551992           |
| 0.6915        | 7.0   | 7812  | 0.6939          | 2977616           |
| 0.7208        | 8.0   | 8928  | 0.6904          | 3402128           |
| 0.6811        | 9.0   | 10044 | 0.7024          | 3826952           |
| 0.7336        | 10.0  | 11160 | 0.7047          | 4252632           |
| 0.6846        | 11.0  | 12276 | 0.6924          | 4677504           |
| 0.6998        | 12.0  | 13392 | 0.7031          | 5103104           |
| 0.7029        | 13.0  | 14508 | 0.6899          | 5527832           |
| 0.7001        | 14.0  | 15624 | 0.6942          | 5952336           |
| 0.659         | 15.0  | 16740 | 0.6903          | 6376760           |
| 0.6999        | 16.0  | 17856 | 0.6912          | 6801440           |
| 0.6772        | 17.0  | 18972 | 0.6912          | 7226000           |
| 0.686         | 18.0  | 20088 | 0.6917          | 7651232           |
| 0.6879        | 19.0  | 21204 | 0.6906          | 8076208           |
| 0.6987        | 20.0  | 22320 | 0.6921          | 8500696           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4