train_mrpc_456_1760637795

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the mrpc dataset (MRPC, the Microsoft Research Paraphrase Corpus from the GLUE benchmark). It achieves the following results on the evaluation set:

  • Loss: 0.1376
  • Num Input Tokens Seen: 6773216
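This is a PEFT adapter rather than a full model checkpoint (see the framework versions below), so it is loaded on top of the base model. A minimal loading sketch, assuming the adapter repo id rbelanec/train_mrpc_456_1760637795 (matching the run name) and a standard PEFT setup:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER = "rbelanec/train_mrpc_456_1760637795"  # assumed repo id for this adapter

# Load the frozen base model, then attach the fine-tuned adapter weights.
base_model = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto")
model = PeftModel.from_pretrained(base_model, ADAPTER)
tokenizer = AutoTokenizer.from_pretrained(BASE)

# Optional: merge the adapter into the base weights for standalone inference.
# model = model.merge_and_unload()
```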

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
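
As a hedged illustration, here is how these values map onto Hugging Face TrainingArguments. This is a hypothetical reconstruction, not the actual training script (which is not published); in particular, output_dir and the surrounding Trainer setup are assumptions.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
# The actual training script for this run is not published.
training_args = TrainingArguments(
    output_dir="train_mrpc_456_1760637795",  # assumed; matches the run name
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```

With 826 optimizer steps per epoch and 20 epochs (16,520 steps total, per the table below), warmup_ratio=0.1 corresponds to roughly 1,652 warmup steps before the cosine decay begins.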

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.2116        | 1.0   | 826   | 0.1892          | 338864            |
| 0.2104        | 2.0   | 1652  | 0.1653          | 676984            |
| 0.1469        | 3.0   | 2478  | 0.1562          | 1016176           |
| 0.2344        | 4.0   | 3304  | 0.1572          | 1354632           |
| 0.1149        | 5.0   | 4130  | 0.1477          | 1692816           |
| 0.0964        | 6.0   | 4956  | 0.1442          | 2031320           |
| 0.1655        | 7.0   | 5782  | 0.1421          | 2369768           |
| 0.1812        | 8.0   | 6608  | 0.1400          | 2708688           |
| 0.071         | 9.0   | 7434  | 0.1403          | 3047376           |
| 0.2013        | 10.0  | 8260  | 0.1443          | 3386408           |
| 0.108         | 11.0  | 9086  | 0.1416          | 3724296           |
| 0.1691        | 12.0  | 9912  | 0.1376          | 4063352           |
| 0.1233        | 13.0  | 10738 | 0.1437          | 4402032           |
| 0.0452        | 14.0  | 11564 | 0.1417          | 4740464           |
| 0.1797        | 15.0  | 12390 | 0.1423          | 5079384           |
| 0.086         | 16.0  | 13216 | 0.1424          | 5418192           |
| 0.0743        | 17.0  | 14042 | 0.1386          | 5757208           |
| 0.1652        | 18.0  | 14868 | 0.1406          | 6095648           |
| 0.1607        | 19.0  | 15694 | 0.1428          | 6434448           |
| 0.0791        | 20.0  | 16520 | 0.1405          | 6773216           |

The best validation loss, 0.1376, was reached at epoch 12 (step 9912); this checkpoint's loss is the evaluation result reported above.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4