train_openbookqa_789_1760637916

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2555
  • Num Input Tokens Seen: 8499784
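A minimal sketch of loading this adapter on top of the base model with PEFT. The repo id is taken from the model tree below; the OpenBookQA-style prompt format in `build_prompt` is an illustrative assumption, since the exact training prompt is not documented in this card.

```python
def build_prompt(question: str, choices: dict[str, str]) -> str:
    # Illustrative multiple-choice prompt; the actual training format may differ.
    lines = [question] + [f"{label}. {text}" for label, text in choices.items()]
    lines.append("Answer:")
    return "\n".join(lines)

if __name__ == "__main__":
    # Heavy imports kept here so the helper above stays importable without them.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    BASE = "meta-llama/Meta-Llama-3-8B-Instruct"
    ADAPTER = "rbelanec/train_openbookqa_789_1760637916"

    tokenizer = AutoTokenizer.from_pretrained(BASE)
    model = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto")
    model = PeftModel.from_pretrained(model, ADAPTER)  # attach the fine-tuned adapter

    prompt = build_prompt(
        "Which of these would let the most heat travel through?",  # example question, not from the eval set
        {"A": "a wool blanket", "B": "a steel spoon",
         "C": "a cotton shirt", "D": "a plastic cup"},
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=5)
    print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```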

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)

  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
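The cosine schedule with 10% warmup can be sketched in pure Python. This mirrors the usual linear-warmup-then-cosine-decay shape (as in `transformers.get_cosine_schedule_with_warmup`); the step counts are derived from the table below (1116 optimizer steps per epoch × 20 epochs).

```python
import math

BASE_LR = 5e-5
TOTAL_STEPS = 20 * 1116            # 20 epochs x 1116 steps/epoch = 22320
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup_ratio 0.1 -> 2232 steps

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The peak rate of 5e-05 is reached at step 2232 (end of epoch 2's warmup window) and decays back toward zero by step 22320.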

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.7023        | 1.0   | 1116  | 0.4176          | 425448            |
| 0.2557        | 2.0   | 2232  | 0.3163          | 849792            |
| 0.1306        | 3.0   | 3348  | 0.2903          | 1274096           |
| 0.0469        | 4.0   | 4464  | 0.2786          | 1699224           |
| 0.1251        | 5.0   | 5580  | 0.2710          | 2123824           |
| 0.3616        | 6.0   | 6696  | 0.2592          | 2548008           |
| 0.1611        | 7.0   | 7812  | 0.2601          | 2972312           |
| 0.4689        | 8.0   | 8928  | 0.2564          | 3396920           |
| 0.2417        | 9.0   | 10044 | 0.2555          | 3822320           |
| 0.021         | 10.0  | 11160 | 0.2589          | 4247968           |
| 0.0424        | 11.0  | 12276 | 0.2561          | 4672584           |
| 0.0655        | 12.0  | 13392 | 0.2585          | 5097552           |
| 0.2312        | 13.0  | 14508 | 0.2586          | 5522440           |
| 0.444         | 14.0  | 15624 | 0.2593          | 5947656           |
| 0.096         | 15.0  | 16740 | 0.2602          | 6373520           |
| 0.3051        | 16.0  | 17856 | 0.2615          | 6798816           |
| 0.0113        | 17.0  | 18972 | 0.2627          | 7223992           |
| 0.1173        | 18.0  | 20088 | 0.2618          | 7649768           |
| 0.0727        | 19.0  | 21204 | 0.2590          | 8074496           |
| 0.0544        | 20.0  | 22320 | 0.2628          | 8499784           |
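The reported evaluation loss of 0.2555 corresponds to the epoch-9 checkpoint, after which validation loss drifts slightly upward. A quick check over the values transcribed from the table above:

```python
# Validation loss per epoch, transcribed from the training-results table.
val_losses = {
    1: 0.4176, 2: 0.3163, 3: 0.2903, 4: 0.2786, 5: 0.2710,
    6: 0.2592, 7: 0.2601, 8: 0.2564, 9: 0.2555, 10: 0.2589,
    11: 0.2561, 12: 0.2585, 13: 0.2586, 14: 0.2593, 15: 0.2602,
    16: 0.2615, 17: 0.2627, 18: 0.2618, 19: 0.2590, 20: 0.2628,
}

best_epoch = min(val_losses, key=val_losses.get)
print(best_epoch, val_losses[best_epoch])  # epoch 9, loss 0.2555
```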

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4

Model tree for rbelanec/train_openbookqa_789_1760637916