train_svamp_1757340250

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2148
  • Num Input Tokens Seen: 704320

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
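The learning-rate schedule above (cosine with a 0.1 warmup ratio over 10 epochs, i.e. 1580 steps per the results table) can be sketched in plain Python. This mirrors the formula used by `transformers`' `get_cosine_schedule_with_warmup`; the helper name `lr_at` is our own, not part of the training script.

```python
import math

BASE_LR = 5e-05
TOTAL_STEPS = 1580                      # 10 epochs x 158 steps/epoch (from the results table)
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # lr_scheduler_warmup_ratio = 0.1 -> 158 steps

def lr_at(step):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The rate climbs linearly from 0 to 5e-05 during the first 158 steps, then follows a half-cosine down to 0 by step 1580.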

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 2.3048        | 0.5   | 79   | 2.3766          | 35392             |
| 1.9747        | 1.0   | 158  | 1.8685          | 70288             |
| 1.4421        | 1.5   | 237  | 1.4855          | 105936            |
| 1.1118        | 2.0   | 316  | 1.1059          | 140896            |
| 0.7634        | 2.5   | 395  | 0.8010          | 175840            |
| 0.5605        | 3.0   | 474  | 0.5889          | 211504            |
| 0.3797        | 3.5   | 553  | 0.4483          | 246864            |
| 0.3022        | 4.0   | 632  | 0.3616          | 281664            |
| 0.2145        | 4.5   | 711  | 0.3088          | 317152            |
| 0.2936        | 5.0   | 790  | 0.2746          | 352048            |
| 0.2790        | 5.5   | 869  | 0.2540          | 387600            |
| 0.1772        | 6.0   | 948  | 0.2424          | 422400            |
| 0.1456        | 6.5   | 1027 | 0.2319          | 457792            |
| 0.2089        | 7.0   | 1106 | 0.2250          | 492720            |
| 0.1536        | 7.5   | 1185 | 0.2211          | 528336            |
| 0.2200        | 8.0   | 1264 | 0.2197          | 563312            |
| 0.1681        | 8.5   | 1343 | 0.2174          | 598800            |
| 0.2119        | 9.0   | 1422 | 0.2165          | 633968            |
| 0.1630        | 9.5   | 1501 | 0.2148          | 669456            |
| 0.1187        | 10.0  | 1580 | 0.2152          | 704320            |
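From the table's final row, the average sequence volume per step can be back-computed. This is a rough sanity check derived only from the reported totals (704,320 input tokens over 1,580 steps at batch size 4), not a statistic logged by the trainer.

```python
TOTAL_TOKENS = 704_320   # final "Input Tokens Seen"
TOTAL_STEPS = 1580       # final "Step"
TRAIN_BATCH_SIZE = 4     # from the hyperparameters

tokens_per_step = TOTAL_TOKENS / TOTAL_STEPS            # ~446 tokens per optimizer step
tokens_per_example = tokens_per_step / TRAIN_BATCH_SIZE  # ~111 tokens per example
```

The short per-example length (~111 tokens) is consistent with SVAMP's brief word-problem prompts.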

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
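To reproduce the environment, the versions above can be pinned directly; a minimal requirements fragment (the PyTorch CUDA build is installed from the matching index, which is assumed here):

```text
# requirements.txt -- pinned to the versions listed above
peft==0.15.2
transformers==4.51.3
torch==2.8.0
datasets==3.6.0
tokenizers==0.21.1
```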
Model tree for rbelanec/train_svamp_1757340250