train_openbookqa_42_1760637570

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5389
  • Num Input Tokens Seen: 8500696

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
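
The cosine scheduler with a 0.1 warmup ratio means the learning rate ramps linearly from 0 to 5e-05 over the first 10% of the 22,320 training steps, then decays along a cosine curve to 0. A minimal sketch of that schedule (a hypothetical re-implementation for illustration, not the exact transformers internals):

```python
import math

def lr_at_step(step, total_steps=22320, warmup_ratio=0.1, base_lr=5e-5):
    """Cosine decay with linear warmup, matching the hyperparameters above."""
    warmup_steps = int(total_steps * warmup_ratio)  # 2232 steps here
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The peak learning rate is reached at step 2232 (end of epoch 2) and reaches 0 at the final step.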

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|--------------:|------:|------:|----------------:|------------------:|
| 0.4076        | 1.0   | 1116  | 0.5516          | 425624            |
| 0.2729        | 2.0   | 2232  | 0.5427          | 851112            |
| 0.734         | 3.0   | 3348  | 0.5413          | 1276656           |
| 0.3725        | 4.0   | 4464  | 0.5444          | 1701464           |
| 1.0575        | 5.0   | 5580  | 0.5437          | 2126656           |
| 0.1351        | 6.0   | 6696  | 0.5425          | 2551992           |
| 0.3861        | 7.0   | 7812  | 0.5419          | 2977616           |
| 0.7384        | 8.0   | 8928  | 0.5389          | 3402128           |
| 0.8489        | 9.0   | 10044 | 0.5446          | 3826952           |
| 0.1875        | 10.0  | 11160 | 0.5424          | 4252632           |
| 0.3858        | 11.0  | 12276 | 0.5399          | 4677504           |
| 1.1837        | 12.0  | 13392 | 0.5398          | 5103104           |
| 0.812         | 13.0  | 14508 | 0.5428          | 5527832           |
| 0.2472        | 14.0  | 15624 | 0.5438          | 5952336           |
| 0.3142        | 15.0  | 16740 | 0.5403          | 6376760           |
| 0.3119        | 16.0  | 17856 | 0.5407          | 6801440           |
| 0.2963        | 17.0  | 18972 | 0.5415          | 7226000           |
| 0.582         | 18.0  | 20088 | 0.5455          | 7651232           |
| 0.2691        | 19.0  | 21204 | 0.5420          | 8076208           |
| 0.3204        | 20.0  | 22320 | 0.5420          | 8500696           |
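
The reported evaluation loss (0.5389) corresponds to the epoch-8 checkpoint, which is the minimum validation loss across all 20 epochs. A quick sanity check (values copied from the table above):

```python
# Per-epoch validation losses from the training results table
val_losses = {
    1: 0.5516, 2: 0.5427, 3: 0.5413, 4: 0.5444, 5: 0.5437,
    6: 0.5425, 7: 0.5419, 8: 0.5389, 9: 0.5446, 10: 0.5424,
    11: 0.5399, 12: 0.5398, 13: 0.5428, 14: 0.5438, 15: 0.5403,
    16: 0.5407, 17: 0.5415, 18: 0.5455, 19: 0.5420, 20: 0.5420,
}

# Epoch with the lowest validation loss
best_epoch = min(val_losses, key=val_losses.get)
print(best_epoch, val_losses[best_epoch])  # epoch 8, loss 0.5389
```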

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
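
Since this is a PEFT adapter rather than a full model, it is loaded on top of the base checkpoint. A minimal sketch (assumes access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights and that the adapter repo id below is correct; not a verified end-to-end script):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the fine-tuned adapter on top of it
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = PeftModel.from_pretrained(base, "rbelanec/train_openbookqa_42_1760637570")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```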
Model: rbelanec/train_openbookqa_42_1760637570 (adapter for meta-llama/Meta-Llama-3-8B-Instruct)