train_mrpc_456_1760637791

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the mrpc dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1311
  • Num Input Tokens Seen: 6773216
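Since this repository ships a PEFT adapter rather than full model weights, it is presumably loaded on top of the base checkpoint. Below is a minimal loading sketch, assuming the adapter is published under the Hub repo id rbelanec/train_mrpc_456_1760637791 (taken from this card's title and author):

```python
# Minimal loading sketch (not from the card): attaches the adapter to the
# base model named above. Requires transformers, peft, torch, accelerate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_mrpc_456_1760637791"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach fine-tuned adapter
model.eval()
```

The prompt template used during fine-tuning is not documented on this card, so the inference prompt format is left to the user.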

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
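While this section is blank, the dataset name suggests the MRPC paraphrase corpus from GLUE. A hedged loading sketch, assuming the standard glue/mrpc dataset on the Hugging Face Hub (the card does not document any preprocessing):

```python
# Shows only how the MRPC splits are commonly obtained; the actual
# preprocessing used for this model is not documented on the card.
from datasets import load_dataset

mrpc = load_dataset("glue", "mrpc")
print(mrpc["train"][0])  # {'sentence1': ..., 'sentence2': ..., 'label': ..., 'idx': ...}
```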

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
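As a rough reconstruction, the listed hyperparameters map onto transformers.TrainingArguments as sketched below; the dataset preparation and PEFT setup are not documented on this card and are omitted:

```python
# Configuration sketch only: reproduces the listed hyperparameters, nothing more.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_mrpc_456_1760637791",
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",        # AdamW, betas=(0.9, 0.999), eps=1e-08 (defaults)
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```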

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.2465        | 1.0   | 826   | 0.2048          | 338864            |
| 0.1927        | 2.0   | 1652  | 0.1857          | 676984            |
| 0.1895        | 3.0   | 2478  | 0.1846          | 1016176           |
| 0.25          | 4.0   | 3304  | 0.1879          | 1354632           |
| 0.1154        | 5.0   | 4130  | 0.1810          | 1692816           |
| 0.1836        | 6.0   | 4956  | 0.1711          | 2031320           |
| 0.1226        | 7.0   | 5782  | 0.1825          | 2369768           |
| 0.1663        | 8.0   | 6608  | 0.1495          | 2708688           |
| 0.0502        | 9.0   | 7434  | 0.1644          | 3047376           |
| 0.1111        | 10.0  | 8260  | 0.1470          | 3386408           |
| 0.0932        | 11.0  | 9086  | 0.1311          | 3724296           |
| 0.1617        | 12.0  | 9912  | 0.1385          | 4063352           |
| 0.0904        | 13.0  | 10738 | 0.1451          | 4402032           |
| 0.0115        | 14.0  | 11564 | 0.1562          | 4740464           |
| 0.0464        | 15.0  | 12390 | 0.1838          | 5079384           |
| 0.0072        | 16.0  | 13216 | 0.2151          | 5418192           |
| 0.0094        | 17.0  | 14042 | 0.2423          | 5757208           |
| 0.01          | 18.0  | 14868 | 0.2462          | 6095648           |
| 0.0102        | 19.0  | 15694 | 0.2460          | 6434448           |
| 0.008         | 20.0  | 16520 | 0.2470          | 6773216           |

Validation loss reaches its minimum of 0.1311 at epoch 11 (matching the evaluation loss reported above) and climbs steadily afterward, so the later checkpoints appear to overfit.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4