train_math_qa_789_1760637952

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the math_qa dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0668
  • Num Input Tokens Seen: 77933776
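Since the framework versions below list PEFT, this appears to be a parameter-efficient adapter on top of the base model rather than a full fine-tune. Below is a minimal inference sketch; it assumes the adapter is published on the Hub under this card's name (rbelanec/train_math_qa_789_1760637952), that you have access to the gated base model, and the example prompt is illustrative only, since the training prompt format is not documented here.

```python
# Minimal sketch: load the base model, then apply the PEFT adapter on top.
# Assumptions: access to the gated meta-llama base model, and that the
# adapter repo id below (taken from this card's title) is correct.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_math_qa_789_1760637952"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Illustrative math_qa-style question; the exact prompt template used in
# training is not documented on this card.
messages = [
    {"role": "user", "content": "A train travels 60 km in 1.5 hours. What is its average speed in km/h?"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(base.device)
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```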

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
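The training script itself is not included on this card. As a reading aid, here is a sketch of how the values above map onto Hugging Face TrainingArguments; only the listed hyperparameters are taken from the card, while output_dir is a placeholder and the LoRA/PEFT configuration is omitted because the card does not document it.

```python
# Sketch: the hyperparameters listed above, expressed as Transformers
# TrainingArguments. Only the values from this card are real; output_dir
# is a placeholder, and the PEFT/LoRA setup is not shown since the card
# does not document it.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_math_qa_789_1760637952",  # placeholder path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=789,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```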

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 1.3165        | 1.0   | 6714   | 1.1278          | 3898224           |
| 1.2448        | 2.0   | 13428  | 1.0910          | 7796616           |
| 1.1044        | 3.0   | 20142  | 1.0688          | 11688128          |
| 1.3373        | 4.0   | 26856  | 1.0679          | 15585640          |
| 1.2471        | 5.0   | 33570  | 1.0713          | 19481256          |
| 0.9889        | 6.0   | 40284  | 1.0731          | 23379928          |
| 0.9515        | 7.0   | 46998  | 1.0732          | 27274992          |
| 1.2691        | 8.0   | 53712  | 1.0714          | 31169464          |
| 0.9785        | 9.0   | 60426  | 1.0689          | 35061680          |
| 0.9645        | 10.0  | 67140  | 1.0702          | 38957336          |
| 0.9054        | 11.0  | 73854  | 1.0668          | 42854488          |
| 1.1303        | 12.0  | 80568  | 1.0681          | 46754376          |
| 1.2466        | 13.0  | 87282  | 1.0726          | 50647376          |
| 0.9298        | 14.0  | 93996  | 1.0756          | 54543272          |
| 0.6031        | 15.0  | 100710 | 1.0683          | 58447368          |
| 0.7423        | 16.0  | 107424 | 1.0692          | 62343120          |
| 1.2567        | 17.0  | 114138 | 1.0692          | 66240072          |
| 0.9746        | 18.0  | 120852 | 1.0692          | 70140040          |
| 0.8613        | 19.0  | 127566 | 1.0692          | 74035080          |
| 1.1456        | 20.0  | 134280 | 1.0692          | 77933776          |

Validation loss reaches its minimum of 1.0668 at epoch 11, which matches the headline evaluation loss reported above.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4