train_svamp_1757340274

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1795
  • Num Input Tokens Seen: 704272
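
Since this is a PEFT adapter trained on top of meta-llama/Meta-Llama-3-8B-Instruct, it can be used by attaching the adapter to the base model. The snippet below is a minimal, hedged sketch: the chat-template prompt format, generation settings, and the example math word problem are assumptions and may not match the exact format used during fine-tuning.

```python
# Hedged usage sketch: load the base model and attach this LoRA/PEFT adapter.
# The prompt format below is an assumption, not the documented training format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_svamp_1757340274"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned adapter

# SVAMP-style math word problem (illustrative example only)
messages = [{"role": "user", "content": "Jenny had 12 apples. She gave 5 to her friend. How many apples does she have now?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```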

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
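
For reference, the hyperparameters above roughly correspond to a standard transformers TrainingArguments configuration such as the sketch below. This is an approximation only; the actual training script for this run is not included in this card.

```python
# Hedged sketch: approximates the listed hyperparameters with transformers TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_svamp_1757340274",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",           # betas=(0.9, 0.999) and epsilon=1e-08 are the adamw_torch defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```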

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---------------|-------|------|-----------------|-------------------|
| 2.1046        | 0.5   | 79   | 2.0502          | 35296             |
| 1.1999        | 1.0   | 158  | 1.2046          | 70400             |
| 0.3511        | 1.5   | 237  | 0.4055          | 106208            |
| 0.3125        | 2.0   | 316  | 0.2548          | 140736            |
| 0.1117        | 2.5   | 395  | 0.2282          | 176064            |
| 0.1093        | 3.0   | 474  | 0.2107          | 211024            |
| 0.0729        | 3.5   | 553  | 0.2023          | 246128            |
| 0.1345        | 4.0   | 632  | 0.1966          | 281616            |
| 0.1695        | 4.5   | 711  | 0.1919          | 316976            |
| 0.089         | 5.0   | 790  | 0.1873          | 352256            |
| 0.0812        | 5.5   | 869  | 0.1845          | 387360            |
| 0.0597        | 6.0   | 948  | 0.1834          | 422464            |
| 0.0819        | 6.5   | 1027 | 0.1836          | 457760            |
| 0.0442        | 7.0   | 1106 | 0.1805          | 492912            |
| 0.045         | 7.5   | 1185 | 0.1818          | 528336            |
| 0.0458        | 8.0   | 1264 | 0.1803          | 563600            |
| 0.0676        | 8.5   | 1343 | 0.1799          | 598992            |
| 0.0822        | 9.0   | 1422 | 0.1799          | 633984            |
| 0.0459        | 9.5   | 1501 | 0.1795          | 669152            |
| 0.0407        | 10.0  | 1580 | 0.1805          | 704272            |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1