train_openbookqa_456_1760637797

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6920
  • Num Input Tokens Seen: 8491296

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
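The learning-rate schedule implied by these settings (cosine decay with a 10% linear warmup over 20 epochs × 1116 steps = 22320 total steps, per the results table below) can be sketched in pure Python. `lr_at_step` is a hypothetical helper for illustration, not part of the training code:

```python
import math

def lr_at_step(step, total_steps=22320, warmup_ratio=0.1, peak_lr=0.03):
    """Cosine schedule with linear warmup, mirroring
    lr_scheduler_type=cosine and lr_scheduler_warmup_ratio=0.1."""
    warmup_steps = int(total_steps * warmup_ratio)  # 2232 steps here
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))
```

For example, the rate peaks at 0.03 exactly when warmup ends (step 2232) and decays to 0 by the final step.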

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.6973        | 1.0   | 1116  | 0.6959          | 424016            |
| 0.6939        | 2.0   | 2232  | 0.6995          | 848912            |
| 0.6787        | 3.0   | 3348  | 0.6933          | 1273224           |
| 0.6940        | 4.0   | 4464  | 0.6934          | 1698344           |
| 0.6986        | 5.0   | 5580  | 0.6943          | 2122168           |
| 0.6934        | 6.0   | 6696  | 0.6953          | 2547696           |
| 0.7094        | 7.0   | 7812  | 0.6929          | 2971624           |
| 0.6981        | 8.0   | 8928  | 0.6943          | 3395952           |
| 0.6968        | 9.0   | 10044 | 0.6935          | 3821456           |
| 0.6939        | 10.0  | 11160 | 0.6920          | 4245816           |
| 0.6810        | 11.0  | 12276 | 0.6927          | 4670528           |
| 0.6882        | 12.0  | 13392 | 0.6929          | 5093960           |
| 0.6956        | 13.0  | 14508 | 0.6947          | 5519312           |
| 0.6973        | 14.0  | 15624 | 0.6932          | 5944728           |
| 0.6869        | 15.0  | 16740 | 0.6933          | 6369088           |
| 0.6871        | 16.0  | 17856 | 0.6949          | 6792536           |
| 0.6950        | 17.0  | 18972 | 0.6943          | 7217384           |
| 0.6958        | 18.0  | 20088 | 0.6935          | 7642368           |
| 0.6785        | 19.0  | 21204 | 0.6945          | 8066544           |
| 0.6982        | 20.0  | 22320 | 0.6932          | 8491296           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
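Since this model is a PEFT adapter (PEFT is listed above), it can be loaded for inference roughly as follows. This is a hedged sketch using the standard `peft`/`transformers` loading APIs; it assumes the adapter repo id `rbelanec/train_openbookqa_456_1760637797` and requires access to the gated base model `meta-llama/Meta-Llama-3-8B-Instruct`:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "rbelanec/train_openbookqa_456_1760637797"

# Loads the base model recorded in the adapter config and applies the adapter weights.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Example OpenBookQA-style prompt (illustrative only).
prompt = "Which of the following best conducts electricity: wood, glass, copper, or rubber?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```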