train_svamp_456_1760637775

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3574
  • Num Input Tokens Seen: 1432752

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
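The learning-rate schedule implied by the hyperparameters above (cosine decay with a 0.1 warmup ratio) can be sketched in plain Python. The step counts come from the results table below: 3160 optimizer steps over 20 epochs. This is a reconstruction of the standard linear-warmup/cosine-decay shape, not the exact Transformers scheduler code.

```python
import math

BASE_LR = 5e-5        # learning_rate
TOTAL_STEPS = 3160    # final step in the results table
WARMUP_STEPS = int(TOTAL_STEPS * 0.1)  # lr_scheduler_warmup_ratio 0.1 -> 316 steps

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS  # ramp from 0 to BASE_LR
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))  # decay to 0

print(lr_at(WARMUP_STEPS))  # peak LR at the end of warmup
```

The rate peaks at 5e-05 after step 316 and decays to zero by step 3160.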

Training results

Training Loss   Epoch   Step   Validation Loss   Input Tokens Seen
2.3114           1.0     158   2.3960              71728
2.3629           2.0     316   2.3916             143392
2.4440           3.0     474   2.3753             214928
2.2586           4.0     632   2.3644             286624
2.4396           5.0     790   2.3658             358096
2.2935           6.0     948   2.3612             429680
2.2293           7.0    1106   2.3599             501168
2.3159           8.0    1264   2.3636             573152
2.3399           9.0    1422   2.3619             644672
2.3443          10.0    1580   2.3611             716272
2.3384          11.0    1738   2.3580             787952
2.4792          12.0    1896   2.3615             859584
2.3733          13.0    2054   2.3607             931280
2.3321          14.0    2212   2.3581            1002960
2.2736          15.0    2370   2.3597            1074528
2.4557          16.0    2528   2.3618            1146144
2.2923          17.0    2686   2.3616            1217760
2.3658          18.0    2844   2.3588            1289504
2.3829          19.0    3002   2.3574            1361040
2.3663          20.0    3160   2.3574            1432752
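As a quick sanity check, the table is internally consistent with the hyperparameters: at 158 steps per epoch and a train batch size of 4, each epoch covers roughly 632 training examples (assuming no gradient accumulation, which the card does not mention), and the token counter advances by about 453 input tokens per optimizer step.

```python
# Quantities implied by the results table and hyperparameters above.
steps_per_epoch = 158      # step count at epoch 1.0
train_batch_size = 4
total_steps = 3160         # final step
total_tokens = 1432752     # final "Input Tokens Seen"

examples_per_epoch = steps_per_epoch * train_batch_size  # -> 632
avg_tokens_per_step = total_tokens / total_steps         # roughly 453

print(examples_per_epoch, round(avg_tokens_per_step, 1))
```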

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
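The card has no usage snippet; a minimal loading sketch follows, assuming the adapter is published under the repo id in the title and that you have `transformers`, `peft`, and access to the gated Llama 3 weights. The function name `load_adapter` is illustrative, not part of any library.

```python
def load_adapter(
    base_id: str = "meta-llama/Meta-Llama-3-8B-Instruct",
    adapter_id: str = "rbelanec/train_svamp_456_1760637775",
):
    """Load the base model and attach this PEFT adapter on top of it.

    Imports are deferred so the function can be defined without the
    (heavy, optional) dependencies installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(model, adapter_id)  # attach LoRA weights
    return tokenizer, model
```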

Model tree for rbelanec/train_svamp_456_1760637775
