train_mrpc_42_1767887007

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the mrpc dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1703
  • Num Input Tokens Seen: 3176720
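Because this repository contains a PEFT adapter rather than full model weights, the base model must be loaded first and the adapter applied on top. The snippet below is a minimal loading sketch, assuming the adapter is hosted at rbelanec/train_mrpc_42_1767887007 (this repository's id) and that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct base model; dtype and device placement are illustrative choices, not taken from the card.

```python
# Minimal sketch: load the base model, then attach this PEFT adapter.
# Assumptions: adapter repo id rbelanec/train_mrpc_42_1767887007,
# access to the gated Llama-3 base weights, and bfloat16 inference.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_mrpc_42_1767887007"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; not stated in the card
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```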

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
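As a rough illustration, these hyperparameters map onto Hugging Face TrainingArguments as sketched below. The output_dir value is an assumption, and the dataset preparation, LoRA/PEFT configuration, and Trainer wiring are omitted since the card does not specify them.

```python
# A hedged sketch of TrainingArguments mirroring the hyperparameters above.
# Only the values listed in the card are reproduced; everything else
# (output_dir, dataset, PEFT config) is assumed or omitted.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_mrpc_42_1767887007",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10,
)
```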

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Input Tokens Seen |
|--------------:|-------:|------:|----------------:|------------------:|
| 0.1063        | 0.5003 | 826   | 0.1869          | 159184            |
| 0.2039        | 1.0006 | 1652  | 0.2682          | 317472            |
| 0.2737        | 1.5009 | 2478  | 0.2038          | 476064            |
| 0.31          | 2.0012 | 3304  | 0.1862          | 635320            |
| 0.0298        | 2.5015 | 4130  | 0.1837          | 794168            |
| 1.0274        | 3.0018 | 4956  | 0.2195          | 953048            |
| 0.315         | 3.5021 | 5782  | 0.1703          | 1110936           |
| 0.1443        | 4.0024 | 6608  | 0.2054          | 1270600           |
| 0.1766        | 4.5027 | 7434  | 0.1833          | 1428808           |
| 0.0008        | 5.0030 | 8260  | 0.2010          | 1588656           |
| 0.237         | 5.5033 | 9086  | 0.2213          | 1748592           |
| 0.0343        | 6.0036 | 9912  | 0.2101          | 1906808           |
| 0.0314        | 6.5039 | 10738 | 0.2135          | 2065864           |
| 0.0067        | 7.0042 | 11564 | 0.2120          | 2224048           |
| 0.211         | 7.5045 | 12390 | 0.2255          | 2382896           |
| 0.1316        | 8.0048 | 13216 | 0.2202          | 2542008           |
| 0.1317        | 8.5051 | 14042 | 0.2261          | 2701288           |
| 0.4379        | 9.0055 | 14868 | 0.2267          | 2860216           |
| 0.0027        | 9.5058 | 15694 | 0.2267          | 3020392           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4