train_mrpc_101112_1760638019

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the mrpc dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0895
  • Num Input Tokens Seen: 6767120
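
Since this checkpoint is a PEFT adapter rather than a full set of weights, it has to be loaded on top of the base model. Below is a minimal loading sketch, assuming the adapter is published under rbelanec/train_mrpc_101112_1760638019 (the model name above) and that you have access to the gated Meta-Llama-3 base weights.

```python
# Minimal sketch: loading this PEFT adapter on top of its base model.
# Assumes the adapter repo id matches the model name in this card and that
# you have accepted the Meta-Llama-3 license to download the base weights.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_mrpc_101112_1760638019"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```

The prompt template used during fine-tuning is not documented in this card, so inference inputs should follow the Llama-3-Instruct chat format that the base model expects.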

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto transformers.TrainingArguments follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
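
The settings above correspond directly to standard transformers.TrainingArguments fields. The sketch below shows that mapping only; dataset preprocessing and the actual training loop are not documented in this card and are omitted, and output_dir is an assumption taken from the model name.

```python
# Sketch: the listed hyperparameters expressed as transformers.TrainingArguments.
# Only settings named in the card are filled in; everything else is left at
# library defaults, and output_dir is assumed from the model name.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_mrpc_101112_1760638019",  # assumption, not stated in card
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```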

Training results

Training Loss   Epoch   Step    Validation Loss   Input Tokens Seen
0.1908          1.0     826     0.1935            339144
0.1340          2.0     1652    0.1853            677000
0.1533          3.0     2478    0.1576            1015816
0.1368          4.0     3304    0.1423            1354072
0.1276          5.0     4130    0.1720            1692608
0.1424          6.0     4956    0.1549            2030416
0.0926          7.0     5782    0.1291            2369136
0.0879          8.0     6608    0.1318            2707920
0.1170          9.0     7434    0.1414            3046392
0.0643          10.0    8260    0.1400            3383592
0.0849          11.0    9086    0.1440            3722064
0.1276          12.0    9912    0.1365            4060512
0.0126          13.0    10738   0.1848            4398224
0.0166          14.0    11564   0.1805            4737184
0.0226          15.0    12390   0.2113            5075136
0.0040          16.0    13216   0.2952            5413384
0.0033          17.0    14042   0.3230            5751600
0.0018          18.0    14868   0.3796            6090384
0.0013          19.0    15694   0.3771            6429416
0.0005          20.0    16520   0.3815            6767120

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
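
To reproduce results against this exact stack, it can be worth checking the versions above at runtime. The following is a small sanity-check sketch; it assumes the import names match the package names listed.

```python
# Sketch: check that the installed packages match the versions listed above.
import importlib

expected = {
    "peft": "0.17.1",
    "transformers": "4.51.3",
    "torch": "2.9.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}
for name, want in expected.items():
    got = importlib.import_module(name).__version__
    status = "ok" if got == want else f"expected {want}"
    print(f"{name}=={got} ({status})")
```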