train_svamp_456_1760637773

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1674
  • Num Input Tokens Seen: 1432752

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
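With lr_scheduler_type cosine, a warmup ratio of 0.1, and 3160 total training steps (from the results table), the learning rate ramps up linearly to 0.001 over the first 316 steps and then decays along a cosine curve to zero. A minimal sketch of that schedule, assuming the standard linear-warmup-plus-cosine shape (the helper name `lr_at_step` is hypothetical):

```python
import math

def lr_at_step(step, total_steps=3160, warmup_ratio=0.1, peak_lr=1e-3):
    """Learning rate at a given optimizer step for a cosine schedule
    with linear warmup, using the hyperparameters listed above."""
    warmup_steps = int(total_steps * warmup_ratio)  # 316 steps here
    if step < warmup_steps:
        # Linear warmup from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, `lr_at_step(316)` returns the peak of 0.001, and the rate decays to 0 by step 3160.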

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.1886        | 1.0   | 158  | 0.1601          | 71728             |
| 0.0963        | 2.0   | 316  | 0.0951          | 143392            |
| 0.0606        | 3.0   | 474  | 0.1081          | 214928            |
| 0.0531        | 4.0   | 632  | 0.0753          | 286624            |
| 0.048         | 5.0   | 790  | 0.0818          | 358096            |
| 0.0354        | 6.0   | 948  | 0.0833          | 429680            |
| 0.0218        | 7.0   | 1106 | 0.0843          | 501168            |
| 0.0228        | 8.0   | 1264 | 0.0884          | 573152            |
| 0.0277        | 9.0   | 1422 | 0.0969          | 644672            |
| 0.0074        | 10.0  | 1580 | 0.1022          | 716272            |
| 0.0143        | 11.0  | 1738 | 0.1302          | 787952            |
| 0.0118        | 12.0  | 1896 | 0.1231          | 859584            |
| 0.0033        | 13.0  | 2054 | 0.1430          | 931280            |
| 0.0032        | 14.0  | 2212 | 0.1486          | 1002960           |
| 0.0003        | 15.0  | 2370 | 0.1587          | 1074528           |
| 0.0011        | 16.0  | 2528 | 0.1645          | 1146144           |
| 0.0005        | 17.0  | 2686 | 0.1668          | 1217760           |
| 0.0007        | 18.0  | 2844 | 0.1684          | 1289504           |
| 0.0005        | 19.0  | 3002 | 0.1684          | 1361040           |
| 0.0001        | 20.0  | 3160 | 0.1691          | 1432752           |
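Validation loss bottoms out at epoch 4 (0.0753) and rises steadily afterward while training loss keeps falling, which suggests the model begins overfitting well before epoch 20. A quick way to identify the best epoch directly from the numbers above:

```python
# Validation loss per epoch, copied from the training results table.
val_loss = [0.1601, 0.0951, 0.1081, 0.0753, 0.0818, 0.0833, 0.0843,
            0.0884, 0.0969, 0.1022, 0.1302, 0.1231, 0.1430, 0.1486,
            0.1587, 0.1645, 0.1668, 0.1684, 0.1684, 0.1691]

# Epochs are 1-indexed; pick the index with the minimum validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print(best_epoch, val_loss[best_epoch - 1])  # → 4 0.0753
```

If intermediate checkpoints were saved, the epoch-4 checkpoint would likely generalize better than the final one reported here.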

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Model: rbelanec/train_svamp_456_1760637773 (PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct)