train_math_qa_42_1767887015

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the math_qa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6802
  • Num Input Tokens Seen: 35942048

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
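
Assuming the schedule matches the standard `transformers` cosine schedule with warmup (linear warmup over the first 10% of steps, then cosine decay to zero), the per-step learning rate implied by these settings can be sketched as:

```python
import math

def lr_at_step(step, total_steps, base_lr=5e-5, warmup_ratio=0.1):
    """Learning rate under linear warmup followed by cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

This is a sketch of the schedule shape only; the exact step counts depend on the trainer's internals (e.g. gradient accumulation), which are not stated in this card.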

Training results

| Training Loss | Epoch  | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:------:|:---------------:|:-----------------:|
| 0.7755        | 0.5000 | 6714   | 0.7232          | 1799152           |
| 0.6544        | 1.0001 | 13428  | 0.6976          | 3596344           |
| 0.7932        | 1.5001 | 20142  | 0.7114          | 5396616           |
| 0.8886        | 2.0001 | 26856  | 0.6802          | 7190704           |
| 0.4694        | 2.5002 | 33570  | 0.7063          | 8991424           |
| 0.401         | 3.0002 | 40284  | 0.6956          | 10785144          |
| 0.8138        | 3.5003 | 46998  | 0.7084          | 12581096          |
| 0.6494        | 4.0003 | 53712  | 0.6876          | 14382064          |
| 0.8883        | 4.5003 | 60426  | 0.7328          | 16175088          |
| 0.4923        | 5.0004 | 67140  | 0.7217          | 17974672          |
| 0.2501        | 5.5004 | 73854  | 0.8163          | 19773264          |
| 0.7885        | 6.0004 | 80568  | 0.7569          | 21568248          |
| 0.7258        | 6.5005 | 87282  | 0.7879          | 23365128          |
| 0.5667        | 7.0005 | 93996  | 0.7922          | 25163976          |
| 0.4671        | 7.5006 | 100710 | 0.8532          | 26966472          |
| 0.1168        | 8.0006 | 107424 | 0.8266          | 28755776          |
| 0.632         | 8.5006 | 114138 | 0.8332          | 30555744          |
| 0.5144        | 9.0007 | 120852 | 0.8524          | 32349952          |
| 0.7312        | 9.5007 | 127566 | 0.8587          | 34145424          |
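
As a sanity check on the numbers above (assuming a single device and no gradient accumulation, neither of which is stated in this card), the step counts imply the size of the training split:

```python
# At epoch 1.0 the trainer had taken 13428 optimizer steps (see the results above).
steps_per_epoch = 13428
train_batch_size = 2  # from the hyperparameters

# With no gradient accumulation, each step consumes exactly one batch.
examples_per_epoch = steps_per_epoch * train_batch_size
print(examples_per_epoch)  # → 26856 training examples
```

The same arithmetic on the token counter (3596344 tokens after one epoch) gives an average of roughly 134 input tokens per example, which is plausible for math_qa-style questions.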

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4