train_svamp_1757340174

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7598
  • Num Input Tokens Seen: 704336

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
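The learning-rate schedule implied by these hyperparameters (linear warmup over the first 10% of steps, then cosine decay to zero; 1,580 total steps taken from the results table below) can be sketched as follows. This is an illustrative standalone version of the schedule, not the exact training code, which would use transformers' `get_cosine_schedule_with_warmup`:

```python
import math

# Values from this card; total_steps comes from the training-results table
peak_lr = 5e-5
total_steps = 1580
warmup_steps = int(0.1 * total_steps)  # warmup_ratio 0.1 -> 158 steps

def lr_at(step):
    """Linear warmup followed by cosine decay to zero (sketch)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))             # 0.0 (start of warmup)
print(lr_at(warmup_steps))  # 5e-05 (peak, end of warmup)
print(lr_at(total_steps))   # ≈ 0 (end of training)
```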

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.2188        | 0.5   | 79   | 0.2228          | 35680             |
| 0.159         | 1.0   | 158  | 0.1447          | 70512             |
| 0.0537        | 1.5   | 237  | 0.0907          | 105904            |
| 0.0444        | 2.0   | 316  | 0.0732          | 140960            |
| 0.0265        | 2.5   | 395  | 0.0597          | 176096            |
| 0.0588        | 3.0   | 474  | 0.0751          | 211424            |
| 0.0143        | 3.5   | 553  | 0.0512          | 246784            |
| 0.1025        | 4.0   | 632  | 0.0555          | 281968            |
| 0.0381        | 4.5   | 711  | 0.0541          | 317232            |
| 0.0096        | 5.0   | 790  | 0.0656          | 352368            |
| 0.0442        | 5.5   | 869  | 0.0483          | 387824            |
| 0.0288        | 6.0   | 948  | 0.0446          | 422704            |
| 0.002         | 6.5   | 1027 | 0.0500          | 457744            |
| 0.005         | 7.0   | 1106 | 0.0583          | 493200            |
| 0.0016        | 7.5   | 1185 | 0.0564          | 528304            |
| 0.0351        | 8.0   | 1264 | 0.0584          | 563520            |
| 0.0252        | 8.5   | 1343 | 0.0623          | 599072            |
| 0.0032        | 9.0   | 1422 | 0.0627          | 634176            |
| 0.0402        | 9.5   | 1501 | 0.0643          | 669440            |
| 0.001         | 10.0  | 1580 | 0.0661          | 704336            |
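The table's step and token counts are internally consistent with the hyperparameters above, assuming no gradient accumulation (an assumption, since the card does not report it). A quick arithmetic check:

```python
# Sanity checks derived from the training-results table and hyperparameters
total_steps = 1580
num_epochs = 10
train_batch_size = 4
total_tokens = 704_336  # final "Input Tokens Seen"

steps_per_epoch = total_steps // num_epochs              # 158
# Assumes gradient_accumulation_steps = 1 (not stated in the card)
examples_per_epoch = steps_per_epoch * train_batch_size  # 632
avg_tokens_per_step = total_tokens / total_steps         # ~445.8 tokens per step

print(steps_per_epoch, examples_per_epoch)
```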

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
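Since this is a PEFT adapter rather than full model weights, it is loaded on top of the base model. A minimal loading sketch (assumes access to the gated meta-llama/Meta-Llama-3-8B-Instruct checkpoint and sufficient GPU memory; the example prompt is an SVAMP-style word problem, not taken from the evaluation set):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_svamp_1757340174"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
# Attach the fine-tuned adapter weights to the base model
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "A pack of DVDs costs 76 dollars. With a 25 dollar discount per pack, how much does one pack cost?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```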