train_mrpc_456_1760637792

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the MRPC dataset (Microsoft Research Paraphrase Corpus). It achieves the following results on the evaluation set (a loading sketch follows the results):

  • Loss: 0.2346
  • Num Input Tokens Seen: 6773216
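
A minimal loading sketch, not taken from the card: it assumes the adapter is published under rbelanec/train_mrpc_456_1760637792 (this repo's id) and uses the standard peft/transformers loading path. The paraphrase prompt at the end is purely hypothetical, since the training template is not documented here.

```python
# Attach the PEFT adapter to the base model. Repo ids are from this card;
# the prompt format is an assumption, not documented in the card.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_mrpc_456_1760637792"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

# Hypothetical MRPC-style prompt: paraphrase detection as yes/no generation.
prompt = (
    "Are these two sentences paraphrases? Answer yes or no.\n"
    'Sentence 1: "He said yes."\n'
    'Sentence 2: "He agreed."\n'
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```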

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
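
Here is a sketch of the corresponding transformers TrainingArguments. Only the listed values are from the card; output_dir and the surrounding training script (presumably a PEFT/LoRA pipeline, given the framework versions below) are assumptions.

```python
# TrainingArguments mirroring the hyperparameters listed in this card.
# output_dir is assumed; everything else maps 1:1 to the list above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_mrpc_456_1760637792",  # assumed name
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```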

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.2321        | 1.0   | 826   | 0.2240          | 338864            |
| 0.2171        | 2.0   | 1652  | 0.1761          | 676984            |
| 0.1427        | 3.0   | 2478  | 0.1641          | 1016176           |
| 0.2379        | 4.0   | 3304  | 0.1589          | 1354632           |
| 0.1037        | 5.0   | 4130  | 0.1515          | 1692816           |
| 0.1423        | 6.0   | 4956  | 0.1513          | 2031320           |
| 0.0681        | 7.0   | 5782  | 0.1670          | 2369768           |
| 0.1080        | 8.0   | 6608  | 0.1435          | 2708688           |
| 0.0179        | 9.0   | 7434  | 0.1553          | 3047376           |
| 0.0706        | 10.0  | 8260  | 0.1431          | 3386408           |
| 0.0573        | 11.0  | 9086  | 0.1348          | 3724296           |
| 0.1185        | 12.0  | 9912  | 0.1581          | 4063352           |
| 0.0091        | 13.0  | 10738 | 0.1886          | 4402032           |
| 0.0026        | 14.0  | 11564 | 0.2011          | 4740464           |
| 0.0182        | 15.0  | 12390 | 0.2893          | 5079384           |
| 0.0029        | 16.0  | 13216 | 0.3326          | 5418192           |
| 0.0006        | 17.0  | 14042 | 0.3731          | 5757208           |
| 0.0005        | 18.0  | 14868 | 0.3943          | 6095648           |
| 0.0208        | 19.0  | 15694 | 0.4036          | 6434448           |
| 0.0008        | 20.0  | 16520 | 0.4064          | 6773216           |
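
Validation loss reaches its minimum at epoch 11 (0.1348) and climbs steadily afterwards while training loss keeps falling, a typical overfitting pattern. If retraining, stopping near that point is cheap to automate; below is a hedged sketch using the standard transformers EarlyStoppingCallback (the Trainer wiring itself is assumed, not from the card).

```python
# Early stopping on eval loss; evaluate/save once per epoch, as in the table above.
from transformers import EarlyStoppingCallback, TrainingArguments

args = TrainingArguments(
    output_dir="train_mrpc_456_1760637792",  # assumed name
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,   # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    num_train_epochs=20,
)

# Stop if eval_loss fails to improve for 3 consecutive evaluations;
# pass callbacks=[early_stop] to the Trainer alongside args.
early_stop = EarlyStoppingCallback(early_stopping_patience=3)
```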

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4