train_mrpc_789_1760637905

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the MRPC dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1123
  • Num Input Tokens Seen: 6772448

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
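The learning-rate schedule implied by these hyperparameters (linear warmup over the first 10% of steps, then cosine decay) can be sketched in plain Python. This is an illustrative reconstruction, not the training code; the function name and the decay-to-zero floor are assumptions.

```python
import math

def lr_at_step(step, total_steps, peak_lr=0.03, warmup_ratio=0.1):
    """Linear warmup to peak_lr, then cosine decay toward zero.

    Illustrative sketch of lr_scheduler_type=cosine with
    lr_scheduler_warmup_ratio=0.1; not taken from the actual run.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 to peak_lr over the warmup window.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr at the end of warmup to 0 at total_steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 16520  # 20 epochs x 826 steps/epoch, per the results table
print(lr_at_step(1652, total_steps))  # peak of 0.03 right after warmup
```

With a 0.1 warmup ratio over 16,520 total steps, the peak learning rate of 0.03 is reached at step 1,652 (the end of epoch 2) and decays to roughly zero by the final step.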

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.1766        | 1.0   | 826   | 0.1895          | 338792            |
| 0.1337        | 2.0   | 1652  | 0.1952          | 677792            |
| 0.1921        | 3.0   | 2478  | 0.1855          | 1016968           |
| 0.2368        | 4.0   | 3304  | 0.1845          | 1355576           |
| 0.1671        | 5.0   | 4130  | 0.1697          | 1694240           |
| 0.22          | 6.0   | 4956  | 0.1718          | 2032216           |
| 0.1148        | 7.0   | 5782  | 0.1442          | 2370752           |
| 0.2036        | 8.0   | 6608  | 0.1424          | 2708952           |
| 0.1358        | 9.0   | 7434  | 0.1230          | 3047656           |
| 0.0751        | 10.0  | 8260  | 0.1168          | 3386008           |
| 0.1282        | 11.0  | 9086  | 0.1139          | 3724536           |
| 0.1584        | 12.0  | 9912  | 0.1241          | 4063536           |
| 0.0377        | 13.0  | 10738 | 0.1123          | 4402704           |
| 0.0462        | 14.0  | 11564 | 0.1237          | 4740960           |
| 0.0099        | 15.0  | 12390 | 0.1264          | 5078920           |
| 0.0484        | 16.0  | 13216 | 0.1548          | 5417832           |
| 0.0032        | 17.0  | 14042 | 0.1888          | 5755672           |
| 0.0076        | 18.0  | 14868 | 0.2001          | 6094000           |
| 0.0093        | 19.0  | 15694 | 0.2003          | 6433272           |
| 0.0049        | 20.0  | 16520 | 0.2022          | 6772448           |
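Validation loss bottoms out mid-run and then climbs while training loss keeps falling, a typical overfitting pattern. A quick sanity check that re-derives the best checkpoint from the table (the rows below are copied from the table; only the selection logic is new):

```python
# (epoch, validation_loss) pairs taken from the training results table.
rows = [
    (1, 0.1895), (2, 0.1952), (3, 0.1855), (4, 0.1845), (5, 0.1697),
    (6, 0.1718), (7, 0.1442), (8, 0.1424), (9, 0.1230), (10, 0.1168),
    (11, 0.1139), (12, 0.1241), (13, 0.1123), (14, 0.1237), (15, 0.1264),
    (16, 0.1548), (17, 0.1888), (18, 0.2001), (19, 0.2003), (20, 0.2022),
]

# Select the epoch with the lowest validation loss.
best_epoch, best_loss = min(rows, key=lambda r: r[1])
print(best_epoch, best_loss)  # prints "13 0.1123"
```

The minimum validation loss of 0.1123 at epoch 13 matches the headline evaluation result above; the last seven epochs only degrade the model.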

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Model tree for rbelanec/train_mrpc_789_1760637905

Adapter
(2185)
this model