train_openbookqa_42_1760637568

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 9.1137
  • Num Input Tokens Seen: 8500696

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
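The learning-rate schedule implied by these settings (cosine decay with a 0.1 warmup ratio over 20 epochs, i.e. 22320 optimizer steps per the results table below) can be sketched as follows. This is an illustrative re-implementation of the schedule shape, not the library code Transformers actually runs:

```python
import math

BASE_LR = 1e-3          # learning_rate from the card
TOTAL_STEPS = 22320     # 20 epochs x 1116 steps/epoch (from the results table)
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # lr_scheduler_warmup_ratio: 0.1

def lr_at(step: int) -> float:
    """Linear warmup followed by cosine decay to zero (a sketch of the
    schedule that lr_scheduler_type=cosine with warmup produces)."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The peak learning rate of 1e-3 is reached at step 2232 (end of warmup) and decays smoothly to zero by the final step.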

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.7414        | 1.0   | 1116  | 0.6949          | 425624            |
| 0.7487        | 2.0   | 2232  | 0.6915          | 851112            |
| 0.7102        | 3.0   | 3348  | 0.6817          | 1276656           |
| 0.6306        | 4.0   | 4464  | 0.6128          | 1701464           |
| 0.6575        | 5.0   | 5580  | 0.6112          | 2126656           |
| 0.3895        | 6.0   | 6696  | 0.5720          | 2551992           |
| 0.4786        | 7.0   | 7812  | 0.5415          | 2977616           |
| 0.6486        | 8.0   | 8928  | 0.5206          | 3402128           |
| 0.5911        | 9.0   | 10044 | 0.5114          | 3826952           |
| 0.3184        | 10.0  | 11160 | 0.5518          | 4252632           |
| 0.4957        | 11.0  | 12276 | 0.5155          | 4677504           |
| 0.5967        | 12.0  | 13392 | 0.5680          | 5103104           |
| 0.4456        | 13.0  | 14508 | 0.5538          | 5527832           |
| 0.4545        | 14.0  | 15624 | 0.5796          | 5952336           |
| 0.3132        | 15.0  | 16740 | 0.5956          | 6376760           |
| 0.1878        | 16.0  | 17856 | 0.6835          | 6801440           |
| 0.4322        | 17.0  | 18972 | 0.7054          | 7226000           |
| 0.1899        | 18.0  | 20088 | 0.7452          | 7651232           |
| 0.1581        | 19.0  | 21204 | 0.7879          | 8024              |
| 0.2388        | 20.0  | 22320 | 0.8024          | 8500696           |
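The step counts in the table are internally consistent: each epoch adds exactly 1116 optimizer steps, for 22320 steps over 20 epochs. Assuming no gradient accumulation (the card does not state it), train_batch_size 4 would correspond to about 4464 examples per epoch:

```python
# Sanity-check the training-results table.
EPOCHS = 20
STEPS_PER_EPOCH = 1116  # constant per-epoch step increment in the table
TRAIN_BATCH_SIZE = 4    # from the hyperparameters

total_steps = EPOCHS * STEPS_PER_EPOCH  # should equal the final "Step" value

# Examples seen per epoch, under the (unstated) assumption of no
# gradient accumulation:
examples_per_epoch = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
```

Note also that validation loss bottoms out around epoch 9 (0.5114) and then rises steadily while training loss keeps falling, the usual signature of overfitting past that point.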

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4