train_svamp_1757340201

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1597
  • Num Input Tokens Seen: 705184
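For a rough sense of scale, the reported evaluation loss can be converted to perplexity (not reported on this card; computed here only for illustration):

```python
import math

eval_loss = 0.1597  # final validation loss from the evaluation set above

# Perplexity is exp(mean cross-entropy loss), so a loss of ~0.16
# corresponds to a perplexity of roughly 1.17.
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.4f}")
```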

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
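The schedule implied by these hyperparameters is linear warmup over the first 10% of steps followed by cosine decay to zero. A minimal sketch, assuming the 1580 total optimizer steps shown in the training log (158 steps/epoch × 10 epochs); this mirrors the shape of transformers' get_cosine_schedule_with_warmup rather than calling it:

```python
import math

BASE_LR = 5e-5
TOTAL_STEPS = 1580                      # 158 steps/epoch * 10 epochs (from the log)
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # lr_scheduler_warmup_ratio = 0.1 -> 158 steps

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        # Ramp linearly from 0 up to BASE_LR over the warmup steps.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Cosine decay from BASE_LR down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The peak learning rate (5e-05) is reached exactly at the end of warmup, then decays smoothly toward zero by the final step.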

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 2.3995        | 0.5   | 79   | 2.2946          | 35776             |
| 1.9914        | 1.0   | 158  | 1.7998          | 70672             |
| 1.4951        | 1.5   | 237  | 1.4120          | 105904            |
| 1.005         | 2.0   | 316  | 1.0429          | 141328            |
| 0.7404        | 2.5   | 395  | 0.7376          | 176752            |
| 0.557         | 3.0   | 474  | 0.5306          | 211808            |
| 0.4419        | 3.5   | 553  | 0.3952          | 247104            |
| 0.3074        | 4.0   | 632  | 0.3051          | 282048            |
| 0.2277        | 4.5   | 711  | 0.2493          | 317248            |
| 0.2723        | 5.0   | 790  | 0.2192          | 352592            |
| 0.2465        | 5.5   | 869  | 0.1984          | 388176            |
| 0.2696        | 6.0   | 948  | 0.1844          | 423184            |
| 0.2411        | 6.5   | 1027 | 0.1750          | 458640            |
| 0.2105        | 7.0   | 1106 | 0.1691          | 493440            |
| 0.3454        | 7.5   | 1185 | 0.1649          | 528768            |
| 0.188         | 8.0   | 1264 | 0.1626          | 563872            |
| 0.114         | 8.5   | 1343 | 0.1605          | 599232            |
| 0.2467        | 9.0   | 1422 | 0.1597          | 634544            |
| 0.225         | 9.5   | 1501 | 0.1608          | 670064            |
| 0.0899        | 10.0  | 1580 | 0.1605          | 705184            |
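A quick scan of the log shows that validation loss bottoms out at epoch 9.0 and essentially plateaus afterward, which is worth knowing when choosing a checkpoint. A small check over the logged values (copied from the table above):

```python
# Validation losses per half-epoch, in order, from the training log above.
val_losses = [2.2946, 1.7998, 1.4120, 1.0429, 0.7376, 0.5306, 0.3952,
              0.3051, 0.2493, 0.2192, 0.1984, 0.1844, 0.1750, 0.1691,
              0.1649, 0.1626, 0.1605, 0.1597, 0.1608, 0.1605]

# Each entry i corresponds to epoch 0.5 * (i + 1).
best_idx = min(range(len(val_losses)), key=val_losses.__getitem__)
best_epoch = 0.5 * (best_idx + 1)
print(best_epoch, val_losses[best_idx])  # minimum is 0.1597 at epoch 9.0
```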

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
Model tree for rbelanec/train_svamp_1757340201

  • Adapter of meta-llama/Meta-Llama-3-8B-Instruct