train_openbookqa_42_1767887008

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2301
  • Num Input Tokens Seen: 3981848
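Since this is a PEFT adapter trained on top of meta-llama/Meta-Llama-3-8B-Instruct, it is typically loaded by attaching the adapter to the base checkpoint. A minimal sketch (untested here; it assumes the adapter weights are published on the Hub under a repo id matching the model name above, and that you have access to the gated base model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER = "rbelanec/train_openbookqa_42_1767887008"  # assumed repo id

# Load the base model and tokenizer, then attach the fine-tuned adapter.
tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(BASE)
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()
```

For deployment, `model.merge_and_unload()` can fold the adapter weights into the base model so inference no longer depends on the peft package.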

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
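The cosine schedule with a 0.1 warmup ratio means the learning rate ramps linearly from 0 to 5e-05 over the first 10% of training steps, then decays to ~0 along a half-cosine. A self-contained sketch of that schedule (the step counts are illustrative, taken from the ~2232 steps per epoch visible in the results table below):

```python
import math

def lr_at_step(step, total_steps, base_lr=5e-5, warmup_ratio=0.1):
    """Cosine learning-rate schedule with linear warmup.

    Mirrors lr_scheduler_type=cosine with lr_scheduler_warmup_ratio=0.1.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 22320  # ~2232 steps/epoch x 10 epochs (illustrative)
print(lr_at_step(0, total))      # 0.0 at the start
print(lr_at_step(2232, total))   # peaks at ~5e-05 after warmup
print(lr_at_step(total, total))  # ~0.0 at the end
```

In practice this is what `transformers.get_cosine_schedule_with_warmup` produces; the sketch only makes the shape explicit.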

Training results

Training Loss   Epoch    Step    Validation Loss   Input Tokens Seen
0.0011          0.5002    1116   0.4504              200032
0.2827          1.0004    2232   0.3079              398392
0.2586          1.5007    3348   0.2610              597224
0.1303          2.0009    4464   0.2301              796792
0.0024          2.5011    5580   0.2492              996424
0.2453          3.0013    6696   0.2472             1195392
0.0013          3.5016    7812   0.2410             1394384
0.0017          4.0018    8928   0.2795             1593368
0.0002          4.5020   10044   0.3350             1791864
0.0020          5.0022   11160   0.3309             1991152
0.0014          5.5025   12276   0.3969             2190512
0.0006          6.0027   13392   0.3788             2389592
0.0004          6.5029   14508   0.4245             2588584
0.0001          7.0031   15624   0.4287             2788464
0.0001          7.5034   16740   0.4777             2987520
0.1216          8.0036   17856   0.4650             3186608
0.4378          8.5038   18972   0.5155             3386160
0.0002          9.0040   20088   0.5158             3584616
0.0009          9.5043   21204   0.5068             3783624
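The reported evaluation loss of 0.2301 is the minimum validation loss in this log, reached at step 4464 (epoch ~2.0); validation loss rises steadily afterwards, which suggests the final checkpoint was selected by best validation loss rather than taken from the last epoch. A minimal sketch of that selection, using step/validation-loss pairs copied from the first rows of the table:

```python
# (step, validation_loss) pairs copied from the training log above.
log = [
    (1116, 0.4504), (2232, 0.3079), (3348, 0.2610), (4464, 0.2301),
    (5580, 0.2492), (6696, 0.2472), (7812, 0.2410), (8928, 0.2795),
]

# Pick the checkpoint with the lowest validation loss.
best_step, best_loss = min(log, key=lambda row: row[1])
print(best_step, best_loss)  # 4464 0.2301 — the reported eval loss
```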

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4