train_svamp_1757340246

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1158
  • Num Input Tokens Seen: 704320

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
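The learning-rate schedule above (cosine decay with a 10% linear warmup) can be sketched as a standalone function. This is an illustrative approximation, not the trainer's exact implementation; the total of 1580 steps is taken from the results table below, and per-step values may differ slightly from what the Transformers scheduler produces.

```python
import math

def lr_at(step, total_steps=1580, base_lr=5e-5, warmup_ratio=0.1):
    """Cosine schedule with linear warmup, matching the hyperparameters above.

    total_steps=1580 is read off the final row of the training results table.
    """
    warmup_steps = int(total_steps * warmup_ratio)  # 158 steps of linear warmup
    if step < warmup_steps:
        # Linear ramp from 0 up to the base learning rate.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, the rate is 0 at step 0, peaks at 5e-05 when warmup ends (step 158), and decays to 0 by the final step.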

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|--------------:|------:|-----:|----------------:|------------------:|
| 1.8634 | 0.5 | 79 | 1.8011 | 35392 |
| 0.0765 | 1.0 | 158 | 0.1944 | 70288 |
| 0.1491 | 1.5 | 237 | 0.1603 | 105936 |
| 0.0712 | 2.0 | 316 | 0.1453 | 140896 |
| 0.1168 | 2.5 | 395 | 0.1349 | 175840 |
| 0.0705 | 3.0 | 474 | 0.1190 | 211504 |
| 0.0402 | 3.5 | 553 | 0.1217 | 246864 |
| 0.0525 | 4.0 | 632 | 0.1158 | 281664 |
| 0.0505 | 4.5 | 711 | 0.1164 | 317152 |
| 0.0922 | 5.0 | 790 | 0.1252 | 352048 |
| 0.1237 | 5.5 | 869 | 0.1178 | 387600 |
| 0.0457 | 6.0 | 948 | 0.1160 | 422400 |
| 0.0258 | 6.5 | 1027 | 0.1174 | 457792 |
| 0.0827 | 7.0 | 1106 | 0.1175 | 492720 |
| 0.0349 | 7.5 | 1185 | 0.1170 | 528336 |
| 0.0764 | 8.0 | 1264 | 0.1172 | 563312 |
| 0.0595 | 8.5 | 1343 | 0.1183 | 598800 |
| 0.0781 | 9.0 | 1422 | 0.1178 | 633968 |
| 0.0322 | 9.5 | 1501 | 0.1182 | 669456 |
| 0.0147 | 10.0 | 1580 | 0.1187 | 704320 |
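The reported evaluation loss of 0.1158 corresponds to the minimum validation loss in the table, reached at epoch 4.0 (step 632), after which validation loss plateaus around 0.117–0.125 while training loss keeps falling. A small sketch to locate that best checkpoint from the transcribed losses:

```python
# Validation losses keyed by epoch, transcribed from the table above.
val_losses = {
    0.5: 1.8011, 1.0: 0.1944, 1.5: 0.1603, 2.0: 0.1453, 2.5: 0.1349,
    3.0: 0.1190, 3.5: 0.1217, 4.0: 0.1158, 4.5: 0.1164, 5.0: 0.1252,
    5.5: 0.1178, 6.0: 0.1160, 6.5: 0.1174, 7.0: 0.1175, 7.5: 0.1170,
    8.0: 0.1172, 8.5: 0.1183, 9.0: 0.1178, 9.5: 0.1182, 10.0: 0.1187,
}

# Epoch with the lowest validation loss.
best_epoch = min(val_losses, key=val_losses.get)
```

Here `best_epoch` is 4.0 with a loss of 0.1158, matching the headline evaluation result.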

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
Model tree for rbelanec/train_svamp_1757340246