train_svamp_456_1760637777

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1096
  • Num Input Tokens Seen: 1432752

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
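As a sketch of what these settings imply (step counts inferred from the training-results table below, not stated explicitly in the card): with 158 optimizer steps per epoch over 20 epochs there are 3160 total steps, and a warmup ratio of 0.1 gives 316 warmup steps. A minimal, self-contained reproduction of the cosine-with-linear-warmup schedule:

```python
import math

LEARNING_RATE = 5e-05
WARMUP_RATIO = 0.1
STEPS_PER_EPOCH = 158   # inferred from the training-results table
NUM_EPOCHS = 20

total_steps = STEPS_PER_EPOCH * NUM_EPOCHS      # 3160
warmup_steps = int(WARMUP_RATIO * total_steps)  # 316

def lr_at(step: int) -> float:
    """Cosine schedule with linear warmup (lr_scheduler_type=cosine)."""
    if step < warmup_steps:
        # linear ramp from 0 up to the peak learning rate
        return LEARNING_RATE * step / warmup_steps
    # cosine decay from the peak down to 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))

print(total_steps, warmup_steps)  # 3160 316
print(lr_at(warmup_steps))        # peak learning rate: 5e-05
```

This mirrors the shape of the scheduler Transformers uses for `lr_scheduler_type: cosine`; the exact library implementation may differ in small details.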

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---------------|-------|------|-----------------|-------------------|
| 2.3123        | 1.0   | 158  | 2.3994          | 71728             |
| 1.5411        | 2.0   | 316  | 1.5345          | 143392            |
| 0.8907        | 3.0   | 474  | 0.8320          | 214928            |
| 0.3528        | 4.0   | 632  | 0.4215          | 286624            |
| 0.3593        | 5.0   | 790  | 0.2530          | 358096            |
| 0.1818        | 6.0   | 948  | 0.1880          | 429680            |
| 0.1685        | 7.0   | 1106 | 0.1586          | 501168            |
| 0.1126        | 8.0   | 1264 | 0.1422          | 573152            |
| 0.0887        | 9.0   | 1422 | 0.1328          | 644672            |
| 0.0529        | 10.0  | 1580 | 0.1258          | 716272            |
| 0.1042        | 11.0  | 1738 | 0.1216          | 787952            |
| 0.1087        | 12.0  | 1896 | 0.1192          | 859584            |
| 0.1096        | 13.0  | 2054 | 0.1160          | 931280            |
| 0.1042        | 14.0  | 2212 | 0.1131          | 1002960           |
| 0.1582        | 15.0  | 2370 | 0.1129          | 1074528           |
| 0.1888        | 16.0  | 2528 | 0.1114          | 1146144           |
| 0.0612        | 17.0  | 2686 | 0.1116          | 1217760           |
| 0.1321        | 18.0  | 2844 | 0.1116          | 1289504           |
| 0.0705        | 19.0  | 3002 | 0.1096          | 1361040           |
| 0.0414        | 20.0  | 3160 | 0.1112          | 1432752           |
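Reading the table: the validation loss bottoms out at epoch 19 (0.1096, the value reported at the top of this card) and ticks up slightly at epoch 20. A minimal sketch that scans the (epoch, validation-loss) pairs from the table above to locate the best epoch:

```python
# (epoch, validation_loss) pairs copied from the training-results table
val_losses = [
    (1, 2.3994), (2, 1.5345), (3, 0.8320), (4, 0.4215), (5, 0.2530),
    (6, 0.1880), (7, 0.1586), (8, 0.1422), (9, 0.1328), (10, 0.1258),
    (11, 0.1216), (12, 0.1192), (13, 0.1160), (14, 0.1131), (15, 0.1129),
    (16, 0.1114), (17, 0.1116), (18, 0.1116), (19, 0.1096), (20, 0.1112),
]

# smallest validation loss wins; min() returns the first minimum
best_epoch, best_loss = min(val_losses, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # 19 0.1096
```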

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4