train_svamp_1757340245

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2030
  • Num Input Tokens Seen: 1349424
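
For reference, here is a minimal sketch of loading this adapter on top of the base model with transformers and PEFT. It assumes you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights; the prompt, dtype, and device placement below are illustrative only.

```python
# Minimal sketch: load the adapter on top of the base model with PEFT.
# Assumes access to the gated meta-llama base weights; device_map="auto"
# requires the accelerate package. Prompt and generation settings are examples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_svamp_1757340245"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "A farmer has 12 apples and gives away 5. How many apples are left?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```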

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
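
The original training script is not part of this card, so the following is only a rough sketch of how these settings would map onto transformers TrainingArguments; the output directory, evaluation/logging strategies, and the omitted dataset and PEFT setup are assumptions.

```python
# Illustrative mapping of the listed hyperparameters onto TrainingArguments.
# Dataset preprocessing and the PEFT/LoRA configuration used for this run are
# not documented in the card and are therefore omitted here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_svamp_1757340245",  # assumed output directory
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=789,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
    eval_strategy="epoch",    # assumed: the table below reports per-epoch validation loss
    logging_strategy="epoch", # assumed
)
```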

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.6612        | 1.0   | 315  | 0.6388          | 67504             |
| 0.2233        | 2.0   | 630  | 0.3123          | 135040            |
| 0.1283        | 3.0   | 945  | 0.1092          | 202528            |
| 0.1397        | 4.0   | 1260 | 0.1053          | 269840            |
| 0.0351        | 5.0   | 1575 | 0.1104          | 337408            |
| 0.0005        | 6.0   | 1890 | 0.1176          | 404880            |
| 0.0416        | 7.0   | 2205 | 0.2016          | 472240            |
| 0.0112        | 8.0   | 2520 | 0.1367          | 539744            |
| 0.0002        | 9.0   | 2835 | 0.1511          | 607456            |
| 0.0           | 10.0  | 3150 | 0.1816          | 674784            |
| 0.0001        | 11.0  | 3465 | 0.1978          | 742336            |
| 0.0           | 12.0  | 3780 | 0.1775          | 809792            |
| 0.0004        | 13.0  | 4095 | 0.1955          | 877248            |
| 0.0           | 14.0  | 4410 | 0.1969          | 944752            |
| 0.0           | 15.0  | 4725 | 0.1982          | 1012272           |
| 0.0           | 16.0  | 5040 | 0.2010          | 1079744           |
| 0.0           | 17.0  | 5355 | 0.2036          | 1147088           |
| 0.0           | 18.0  | 5670 | 0.2038          | 1214432           |
| 0.0           | 19.0  | 5985 | 0.2038          | 1282000           |
| 0.0           | 20.0  | 6300 | 0.2030          | 1349424           |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
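
Matching these versions can matter for reproducing the results above. The snippet below is a small optional check (not part of the original card) that compares locally installed packages against the listed versions.

```python
# Optional sanity check: compare installed package versions against those
# listed in this card. Strings may differ slightly (e.g. the CUDA suffix on torch).
from importlib.metadata import version, PackageNotFoundError

expected = {
    "peft": "0.15.2",
    "transformers": "4.51.3",
    "torch": "2.8.0+cu128",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}
for package, trained_with in expected.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        installed = "not installed"
    print(f"{package}: installed {installed}, card trained with {trained_with}")
```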