train_openbookqa_101112_1760638026

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5633
  • Num Input Tokens Seen: 8474968

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
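As a sanity check on these settings, here is a minimal pure-Python sketch of the resulting learning-rate curve: linear warmup over the first 10% of the 22,320 optimizer steps (taken from the results table below), then cosine decay to zero. This mirrors the behavior of Transformers' cosine schedule with warmup; the helper name itself is illustrative, not part of the training code.

```python
import math

def lr_at_step(step, total_steps=22320, warmup_ratio=0.1, base_lr=1e-3):
    """Learning rate under a cosine schedule with linear warmup.

    Sketch of the schedule implied by lr_scheduler_type=cosine and
    lr_scheduler_warmup_ratio=0.1 above; not the actual training code.
    """
    warmup_steps = int(total_steps * warmup_ratio)  # 2232 steps here
    if step < warmup_steps:
        # Linear ramp from 0 up to the base learning rate.
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, the schedule peaks at 0.001 at step 2232 (end of epoch 2) and decays to zero by step 22320, which is consistent with the larger validation-loss improvements seen in the early epochs.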

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.6823        | 1.0   | 1116  | 0.6974          | 424568            |
| 0.6892        | 2.0   | 2232  | 0.6952          | 848552            |
| 0.6966        | 3.0   | 3348  | 0.6827          | 1271776           |
| 0.5206        | 4.0   | 4464  | 0.5995          | 1694896           |
| 0.7599        | 5.0   | 5580  | 0.5701          | 2118456           |
| 0.6584        | 6.0   | 6696  | 0.5472          | 2542752           |
| 0.5229        | 7.0   | 7812  | 0.5611          | 2966512           |
| 0.3532        | 8.0   | 8928  | 0.5250          | 3390048           |
| 0.4081        | 9.0   | 10044 | 0.5390          | 3814704           |
| 0.3609        | 10.0  | 11160 | 0.5244          | 4238440           |
| 0.4103        | 11.0  | 12276 | 0.5373          | 4662136           |
| 0.4956        | 12.0  | 13392 | 0.5254          | 5086336           |
| 0.3762        | 13.0  | 14508 | 0.5417          | 5510768           |
| 0.4191        | 14.0  | 15624 | 0.5420          | 5933936           |
| 0.3506        | 15.0  | 16740 | 0.5722          | 6357536           |
| 0.4310        | 16.0  | 17856 | 0.5834          | 6779872           |
| 0.1577        | 17.0  | 18972 | 0.6031          | 7203216           |
| 0.2304        | 18.0  | 20088 | 0.6346          | 7626944           |
| 0.2465        | 19.0  | 21204 | 0.6501          | 8051216           |
| 0.3756        | 20.0  | 22320 | 0.6509          | 8474968           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
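Since this model is a PEFT adapter rather than a full checkpoint, it would presumably be loaded on top of the base model roughly as follows. This is a hedged sketch, not an official usage snippet from the card: the repo id is rbelanec/train_openbookqa_101112_1760638026, and the code assumes access to the gated meta-llama/Meta-Llama-3-8B-Instruct base model.

```python
# Sketch only: loading this adapter with PEFT (assumes access to the
# gated meta-llama/Meta-Llama-3-8B-Instruct base model and enough GPU memory).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Apply the fine-tuned OpenBookQA adapter on top of the base weights.
model = PeftModel.from_pretrained(base, "rbelanec/train_openbookqa_101112_1760638026")
model.eval()
```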