train_svamp_456_1760637772

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0590
  • Num input tokens seen: 1,432,752
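As a minimal usage sketch (assuming this repository hosts the PEFT adapter weights and that you have access to the gated base model), the adapter can be loaded on top of meta-llama/Meta-Llama-3-8B-Instruct with PEFT; the example prompt is illustrative, not taken from the SVAMP evaluation set:

```python
# Load the base model and apply the PEFT adapter on top of it.
# Assumes access to the gated meta-llama base model and that this
# repository id hosts the adapter weights.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_svamp_456_1760637772"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Illustrative SVAMP-style word problem.
prompt = "Dan has 32 green marbles. Mike took 23 of them. How many green marbles does Dan have now?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```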

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
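The hyperparameters above correspond roughly to the following Transformers TrainingArguments (a sketch only; the actual training script and any PEFT/LoRA configuration are not included in this card, and the output_dir name is assumed from the model id):

```python
# Sketch of the training configuration implied by the listed
# hyperparameters; the real training script is not part of this card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_svamp_456_1760637772",  # assumed
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```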

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---------------|-------|------|-----------------|-------------------|
| 0.1892        | 1.0   | 158  | 0.1561          | 71728             |
| 0.083         | 2.0   | 316  | 0.0806          | 143392            |
| 0.0538        | 3.0   | 474  | 0.0792          | 214928            |
| 0.0465        | 4.0   | 632  | 0.0590          | 286624            |
| 0.0275        | 5.0   | 790  | 0.0763          | 358096            |
| 0.0382        | 6.0   | 948  | 0.0829          | 429680            |
| 0.0099        | 7.0   | 1106 | 0.0701          | 501168            |
| 0.0219        | 8.0   | 1264 | 0.0721          | 573152            |
| 0.0026        | 9.0   | 1422 | 0.0810          | 644672            |
| 0.0012        | 10.0  | 1580 | 0.0815          | 716272            |
| 0.0119        | 11.0  | 1738 | 0.0882          | 787952            |
| 0.0023        | 12.0  | 1896 | 0.0878          | 859584            |
| 0.0015        | 13.0  | 2054 | 0.0915          | 931280            |
| 0.0009        | 14.0  | 2212 | 0.0922          | 1002960           |
| 0.0007        | 15.0  | 2370 | 0.0926          | 1074528           |
| 0.0008        | 16.0  | 2528 | 0.0931          | 1146144           |
| 0.0007        | 17.0  | 2686 | 0.0953          | 1217760           |
| 0.0026        | 18.0  | 2844 | 0.0942          | 1289504           |
| 0.001         | 19.0  | 3002 | 0.0939          | 1361040           |
| 0.0005        | 20.0  | 3160 | 0.0941          | 1432752           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Model tree for rbelanec/train_svamp_456_1760637772