train_mrpc_42_1760637564

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the mrpc dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2464
  • Num Input Tokens Seen: 6769320

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
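
To make the scheduler settings above concrete, the following is a minimal sketch of a cosine schedule with linear warmup, written in plain Python. It assumes the schedule spans the 16520 optimizer steps reported for epoch 20 and that warmup lasts for the rounded product of lr_scheduler_warmup_ratio and that step count; the helper name lr_at is hypothetical, and the training library's exact rounding of the warmup length may differ by a step.

```python
import math

# Values taken from the hyperparameter list and the final step count on this card.
PEAK_LR = 5e-05
WARMUP_RATIO = 0.1
TOTAL_STEPS = 16520  # assumption: the scheduler ran over all reported steps

# Assumption: warmup length is the rounded product of ratio and total steps.
WARMUP_STEPS = round(WARMUP_RATIO * TOTAL_STEPS)


def lr_at(step: int) -> float:
    """Approximate learning rate at an optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Half-cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 826, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, lr_at(step))
```

Under these assumptions, warmup covers the first two epochs (1652 steps), after which the rate decays toward zero by the final step.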

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.2013        | 1.0   | 826   | 0.2597          | 337344            |
| 0.2932        | 2.0   | 1652  | 0.2618          | 675368            |
| 0.1325        | 3.0   | 2478  | 0.2486          | 1014008           |
| 0.252         | 4.0   | 3304  | 0.2479          | 1353224           |
| 0.2865        | 5.0   | 4130  | 0.2481          | 1692320           |
| 0.1921        | 6.0   | 4956  | 0.2472          | 2030488           |
| 0.1717        | 7.0   | 5782  | 0.2499          | 2368032           |
| 0.3535        | 8.0   | 6608  | 0.2467          | 2706552           |
| 0.2347        | 9.0   | 7434  | 0.2485          | 3044984           |
| 0.1887        | 10.0  | 8260  | 0.2487          | 3384288           |
| 0.2093        | 11.0  | 9086  | 0.2496          | 3722552           |
| 0.1651        | 12.0  | 9912  | 0.2478          | 4061216           |
| 0.1871        | 13.0  | 10738 | 0.2486          | 4399064           |
| 0.129         | 14.0  | 11564 | 0.2512          | 4737648           |
| 0.2764        | 15.0  | 12390 | 0.2492          | 5075144           |
| 0.226         | 16.0  | 13216 | 0.2471          | 5414032           |
| 0.1859        | 17.0  | 14042 | 0.2486          | 5752528           |
| 0.2208        | 18.0  | 14868 | 0.2491          | 6091472           |
| 0.271         | 19.0  | 15694 | 0.2464          | 6430144           |
| 0.1882        | 20.0  | 16520 | 0.2464          | 6769320           |
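
The figures in the results above support a few quick sanity checks, sketched here in Python. The numbers are copied from this card; the examples-per-epoch estimate assumes a single device and no gradient accumulation, neither of which the card states, so treat it as a rough estimate rather than a confirmed dataset size.

```python
# Values copied from the hyperparameters and the training results on this card.
TRAIN_BATCH_SIZE = 4
STEPS_PER_EPOCH = 826
TOKENS_AFTER_EPOCH_1 = 337_344

# Validation loss at the end of epochs 1 through 20.
VAL_LOSS = [
    0.2597, 0.2618, 0.2486, 0.2479, 0.2481, 0.2472, 0.2499, 0.2467, 0.2485, 0.2487,
    0.2496, 0.2478, 0.2486, 0.2512, 0.2492, 0.2471, 0.2486, 0.2491, 0.2464, 0.2464,
]

# Assumption: one device and no gradient accumulation, so one step = one batch.
examples_per_epoch = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

# Average input tokens per training example during the first epoch.
tokens_per_example = TOKENS_AFTER_EPOCH_1 / examples_per_epoch

# Epochs (1-indexed) that reach the lowest validation loss.
best_loss = min(VAL_LOSS)
best_epochs = [i + 1 for i, loss in enumerate(VAL_LOSS) if loss == best_loss]

# Spread of validation loss from epoch 3 onward.
late_spread = max(VAL_LOSS[2:]) - min(VAL_LOSS[2:])

if __name__ == "__main__":
    print("examples per epoch:", examples_per_epoch)
    print("tokens per example:", round(tokens_per_example, 1))
    print("best validation loss:", best_loss, "at epochs", best_epochs)
    print("validation-loss spread after epoch 2:", round(late_spread, 4))
```

The lowest validation loss (0.2464) is reached at epochs 19 and 20, and from epoch 3 onward the validation loss varies by less than 0.005, so most of the improvement happens in the first three epochs.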

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
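
Because the card lists PEFT among the framework versions, the checkpoint is presumably an adapter that must be applied on top of the base model. The sketch below shows one way this might be done; it assumes a causal-LM adapter published under the repository ID rbelanec/train_mrpc_42_1760637564, that you have access to the gated base model, and the helper name load_adapter_model is hypothetical.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_mrpc_42_1760637564"  # assumption: adapter repo ID


def load_adapter_model(base_model_id: str = BASE_MODEL_ID,
                       adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the PEFT adapter (hypothetical helper)."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # Assumption: the adapter was trained for causal language modeling.
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


if __name__ == "__main__":
    # Downloads several GB of weights; requires access to the base model.
    tokenizer, model = load_adapter_model()
    print(type(model).__name__)
```

The card does not document the prompt format used during fine-tuning, so inputs for inference would need to be checked against the training setup.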
