train_mrpc_1752763925

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the MRPC dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2269
  • Num Input Tokens Seen: 3558824
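
For reference, the adapter can be loaded on top of the base model with the peft library. This is a minimal sketch, assuming the adapter is published as rbelanec/train_mrpc_1752763925 (access to the gated base model is required); the prompt format is illustrative, since the template used during training is not documented.

```python
# Minimal inference sketch. Assumptions: the adapter repo id is
# rbelanec/train_mrpc_1752763925, and the prompt format below is illustrative.
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/train_mrpc_1752763925"

# AutoPeftModelForCausalLM reads the adapter config, loads the base model
# (meta-llama/Meta-Llama-3-8B-Instruct), and attaches the adapter weights.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

prompt = (
    "Are the following two sentences paraphrases? Answer yes or no.\n"
    "Sentence 1: The company said profits rose sharply last quarter.\n"
    "Sentence 2: The firm reported a sharp rise in quarterly profits.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5)
# Decode only the generated continuation, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```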

Model description

This is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the MRPC paraphrase-detection task. Beyond the training setup described below, no further details are provided.

Intended uses & limitations

The adapter is intended for paraphrase detection on MRPC-style sentence pairs. Its limitations have not been documented.

Training and evaluation data

The model was trained and evaluated on MRPC (the Microsoft Research Paraphrase Corpus), a GLUE task of sentence pairs labeled as paraphrases or non-paraphrases. The exact splits, preprocessing, and prompt format used for this run are not documented.
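
For reference, MRPC can be loaded with the datasets library. A minimal sketch; whether this matches the preprocessing used for this run is an assumption.

```python
# Sketch: load MRPC (the GLUE paraphrase task) with the datasets library.
from datasets import load_dataset

mrpc = load_dataset("glue", "mrpc")
# Each example is a labeled sentence pair:
# {'sentence1': ..., 'sentence2': ..., 'label': 1 if paraphrase else 0, 'idx': ...}
print(mrpc["train"][0])
```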

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
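
These settings map onto transformers.TrainingArguments roughly as follows. This is a sketch, not the actual training script (which is not included in this card); output_dir is hypothetical, and the evaluation cadence of 207 steps is inferred from the results table below.

```python
# Sketch of TrainingArguments mirroring the hyperparameters listed above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_mrpc_1752763925",  # hypothetical output directory
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
    eval_strategy="steps",  # validation loss was logged every 207 steps
    eval_steps=207,
)
```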

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|---------------|--------|------|-----------------|-------------------|
| 8.1917        | 0.5012 | 207  | 8.0704          | 180032            |
| 4.6243        | 1.0024 | 414  | 4.4040          | 357432            |
| 2.9633        | 1.5036 | 621  | 2.4093          | 536632            |
| 1.2963        | 2.0048 | 828  | 1.4657          | 714864            |
| 1.1611        | 2.5061 | 1035 | 0.9808          | 892784            |
| 0.5194        | 3.0073 | 1242 | 0.6443          | 1072016           |
| 0.466         | 3.5085 | 1449 | 0.4323          | 1251408           |
| 0.2929        | 4.0097 | 1656 | 0.3389          | 1429312           |
| 0.2554        | 4.5109 | 1863 | 0.2955          | 1607488           |
| 0.2344        | 5.0121 | 2070 | 0.2714          | 1785760           |
| 0.2631        | 5.5133 | 2277 | 0.2556          | 1963360           |
| 0.2409        | 6.0145 | 2484 | 0.2481          | 2142632           |
| 0.2395        | 6.5157 | 2691 | 0.2385          | 2320552           |
| 0.2207        | 7.0169 | 2898 | 0.2330          | 2498696           |
| 0.2638        | 7.5182 | 3105 | 0.2312          | 2676552           |
| 0.2183        | 8.0194 | 3312 | 0.2285          | 2854776           |
| 0.2364        | 8.5206 | 3519 | 0.2278          | 3032440           |
| 0.238         | 9.0218 | 3726 | 0.2284          | 3211280           |
| 0.1911        | 9.5230 | 3933 | 0.2269          | 3389008           |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.7.1+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1