train_svamp_42_1760637546

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1071
  • Num Input Tokens Seen: 1433520

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
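
The schedule above combines linear warmup over the first 10% of steps with cosine decay. A minimal sketch of that behavior, assuming 3160 total optimizer steps (158 steps/epoch × 20 epochs, per the training log below); the exact Transformers scheduler implementation may differ in small details:

```python
import math

PEAK_LR = 5e-05
TOTAL_STEPS = 3160              # 158 steps/epoch * 20 epochs
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup_ratio = 0.1 -> 316 steps

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, `lr_at(WARMUP_STEPS)` returns the peak rate of 5e-05, and the rate decays smoothly toward 0 by step 3160.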

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 2.2836        | 1.0   | 158  | 2.3716          | 71568             |
| 1.5155        | 2.0   | 316  | 1.5045          | 143232            |
| 0.8273        | 3.0   | 474  | 0.8055          | 214912            |
| 0.5098        | 4.0   | 632  | 0.3984          | 286448            |
| 0.1903        | 5.0   | 790  | 0.2359          | 358176            |
| 0.1498        | 6.0   | 948  | 0.1774          | 429728            |
| 0.1652        | 7.0   | 1106 | 0.1501          | 501504            |
| 0.2466        | 8.0   | 1264 | 0.1360          | 573120            |
| 0.0927        | 9.0   | 1422 | 0.1293          | 644944            |
| 0.1314        | 10.0  | 1580 | 0.1215          | 716448            |
| 0.1112        | 11.0  | 1738 | 0.1176          | 788256            |
| 0.1019        | 12.0  | 1896 | 0.1143          | 859808            |
| 0.2181        | 13.0  | 2054 | 0.1121          | 931472            |
| 0.0853        | 14.0  | 2212 | 0.1106          | 1003376           |
| 0.1288        | 15.0  | 2370 | 0.1095          | 1075088           |
| 0.0997        | 16.0  | 2528 | 0.1091          | 1146608           |
| 0.1128        | 17.0  | 2686 | 0.1073          | 1218368           |
| 0.0827        | 18.0  | 2844 | 0.1076          | 1290144           |
| 0.1023        | 19.0  | 3002 | 0.1075          | 1361984           |
| 0.0900        | 20.0  | 3160 | 0.1071          | 1433520           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4