train_openbookqa_123_1760637683

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6917
  • Num Input Tokens Seen: 8496984

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
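Given the scheduler settings above, the learning-rate curve can be sketched in plain Python. This is an illustrative sketch, not the training code: the step counts (1116 steps/epoch × 20 epochs, warmup = 10% of total) are derived from the results table below, and the function mirrors the shape of transformers' `get_cosine_schedule_with_warmup`.

```python
import math

# Values taken from the hyperparameters and results table in this card.
base_lr = 0.03
total_steps = 22320                     # 20 epochs x 1116 steps/epoch
warmup_steps = int(0.1 * total_steps)   # lr_scheduler_warmup_ratio: 0.1 -> 2232 steps

def lr_at(step: int) -> float:
    """Cosine decay with linear warmup (shape of a cosine scheduler with warmup)."""
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over the first 10% of steps.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The peak learning rate is reached at step 2232, i.e. exactly at the end of epoch 2 in the table below.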

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.7155 | 1.0 | 1116 | 0.7065 | 424840 |
| 0.7062 | 2.0 | 2232 | 0.6954 | 849600 |
| 0.714 | 3.0 | 3348 | 0.6993 | 1274392 |
| 0.693 | 4.0 | 4464 | 0.6938 | 1699872 |
| 0.705 | 5.0 | 5580 | 0.6962 | 2125480 |
| 0.6951 | 6.0 | 6696 | 0.6937 | 2551008 |
| 0.6938 | 7.0 | 7812 | 0.6934 | 2975440 |
| 0.7005 | 8.0 | 8928 | 0.6936 | 3400064 |
| 0.7089 | 9.0 | 10044 | 0.6948 | 3825032 |
| 0.6879 | 10.0 | 11160 | 0.6942 | 4249648 |
| 0.6751 | 11.0 | 12276 | 0.6928 | 4673640 |
| 0.6862 | 12.0 | 13392 | 0.6945 | 5097992 |
| 0.6936 | 13.0 | 14508 | 0.6917 | 5522784 |
| 0.6822 | 14.0 | 15624 | 0.6940 | 5948008 |
| 0.6823 | 15.0 | 16740 | 0.6946 | 6372656 |
| 0.6923 | 16.0 | 17856 | 0.6930 | 6797720 |
| 0.6892 | 17.0 | 18972 | 0.6937 | 7222896 |
| 0.6913 | 18.0 | 20088 | 0.6924 | 7646872 |
| 0.6989 | 19.0 | 21204 | 0.6943 | 8071488 |
| 0.6913 | 20.0 | 22320 | 0.6944 | 8496984 |
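As a quick sanity check on the table, a short snippet can recover the best checkpoint; the values below are copied from the validation-loss column, and the minimum matches the reported evaluation loss of 0.6917 at epoch 13.

```python
# Validation losses per epoch, copied from the training results table above.
val_losses = [0.7065, 0.6954, 0.6993, 0.6938, 0.6962, 0.6937, 0.6934,
              0.6936, 0.6948, 0.6942, 0.6928, 0.6945, 0.6917, 0.6940,
              0.6946, 0.6930, 0.6937, 0.6924, 0.6943, 0.6944]

# Epochs are 1-indexed in the table, so shift the argmin by one.
best_epoch = min(range(len(val_losses)), key=lambda i: val_losses[i]) + 1
best_loss = val_losses[best_epoch - 1]
```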

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
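The card itself includes no usage code. With the framework versions listed above, loading this adapter on top of its base model might look like the following minimal, untested sketch; `device_map="auto"` is an assumption that requires accelerate to be installed, and the repository id is taken from this card's title.

```python
def load_adapter(device_map: str = "auto"):
    """Sketch: load the base model and attach this card's PEFT adapter.

    Not executed here -- downloading the 8B base model requires access to
    the gated meta-llama repository and substantial memory.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
    adapter_id = "rbelanec/train_openbookqa_123_1760637683"

    # Load the frozen base model, then wrap it with the trained adapter weights.
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map=device_map)
    model = PeftModel.from_pretrained(base, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```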