train_mrpc_123_1760637678

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the mrpc dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1104
  • Num Input Tokens Seen: 6774288

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
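
For reference, the settings above map onto a Hugging Face TrainingArguments configuration roughly like the following. This is a minimal sketch, not the actual training script (which is not included in this card); the output_dir is a placeholder, and the AdamW betas/epsilon shown are the adamw_torch defaults.

```python
# Sketch of the training configuration implied by the hyperparameters above.
# Assumes the Hugging Face Trainer API; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_mrpc_123_1760637678",  # placeholder output path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",         # betas=(0.9, 0.999) and epsilon=1e-08 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```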

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.1032        | 1.0   | 826   | 0.1915          | 339568            |
| 0.0404        | 2.0   | 1652  | 0.1104          | 678688            |
| 0.0144        | 3.0   | 2478  | 0.1295          | 1017368           |
| 0.11          | 4.0   | 3304  | 0.1386          | 1356744           |
| 0.0622        | 5.0   | 4130  | 0.1903          | 1694912           |
| 0.0012        | 6.0   | 4956  | 0.2808          | 2033992           |
| 0.0001        | 7.0   | 5782  | 0.3320          | 2372464           |
| 0.0563        | 8.0   | 6608  | 0.3093          | 2710624           |
| 0.0001        | 9.0   | 7434  | 0.3586          | 3049392           |
| 0.0           | 10.0  | 8260  | 0.3543          | 3388032           |
| 0.0004        | 11.0  | 9086  | 0.5125          | 3727312           |
| 0.0           | 12.0  | 9912  | 0.3661          | 4065504           |
| 0.0002        | 13.0  | 10738 | 0.4454          | 4404624           |
| 0.1018        | 14.0  | 11564 | 0.3821          | 4743080           |
| 0.0           | 15.0  | 12390 | 0.4532          | 5082240           |
| 0.0           | 16.0  | 13216 | 0.4530          | 5420840           |
| 0.0           | 17.0  | 14042 | 0.4589          | 5759384           |
| 0.0           | 18.0  | 14868 | 0.4664          | 6097784           |
| 0.0           | 19.0  | 15694 | 0.4699          | 6435544           |
| 0.0           | 20.0  | 16520 | 0.4702          | 6774288           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
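
Since this is a PEFT adapter rather than a full model, it must be loaded on top of the base model. Below is a minimal loading sketch, assuming the adapter is hosted on the Hub as rbelanec/train_mrpc_123_1760637678 and that you have access to the gated meta-llama base weights; the prompt format used for MRPC during training is not documented in this card.

```python
# Minimal sketch: load this adapter on top of the base model with PEFT.
# Assumes access to the gated meta-llama repository and that the adapter
# lives at rbelanec/train_mrpc_123_1760637678.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_mrpc_123_1760637678"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()  # ready for generation; the MRPC prompt template is undocumented
```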