# train_mrpc_456_1760637794
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the MRPC (Microsoft Research Paraphrase Corpus) dataset. It achieves the following results on the evaluation set:

- Loss: 0.2397
- Num input tokens seen: 6773216
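Since this is a PEFT adapter rather than a full model, it must be loaded on top of the base model. Below is a minimal inference sketch, assuming the adapter is hosted as `rbelanec/train_mrpc_456_1760637794` (the repo named above); the prompt template is an assumption, since the format used during training is not documented in this card.

```python
# Minimal loading/inference sketch. Requires access to the gated
# meta-llama/Meta-Llama-3-8B-Instruct base model and the `accelerate`
# package for device_map="auto".
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_mrpc_456_1760637794"

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, ADAPTER_ID)

# MRPC is a paraphrase-detection task; this yes/no prompt format is an
# assumption, not the documented training template.
prompt = (
    "Sentence 1: The company posted strong quarterly earnings.\n"
    "Sentence 2: The firm reported robust profits for the quarter.\n"
    "Are these two sentences paraphrases? Answer yes or no:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```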
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 456
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
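These settings map directly onto Transformers `TrainingArguments`; a hedged sketch follows (the actual training script is not part of this card, so `output_dir` and anything not in the list above are assumptions).

```python
# Sketch of the listed hyperparameters as Transformers TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_mrpc_456_1760637794",  # hypothetical output path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```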
### Training results

The reported evaluation loss (0.2397) corresponds to the best validation loss in the table below, reached at epoch 16.
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.2922 | 1.0 | 826 | 0.2575 | 338864 |
| 0.3211 | 2.0 | 1652 | 0.2506 | 676984 |
| 0.2783 | 3.0 | 2478 | 0.2468 | 1016176 |
| 0.3246 | 4.0 | 3304 | 0.2432 | 1354632 |
| 0.2403 | 5.0 | 4130 | 0.2436 | 1692816 |
| 0.2129 | 6.0 | 4956 | 0.2436 | 2031320 |
| 0.2544 | 7.0 | 5782 | 0.2449 | 2369768 |
| 0.2679 | 8.0 | 6608 | 0.2423 | 2708688 |
| 0.2565 | 9.0 | 7434 | 0.2444 | 3047376 |
| 0.3842 | 10.0 | 8260 | 0.2452 | 3386408 |
| 0.1608 | 11.0 | 9086 | 0.2441 | 3724296 |
| 0.3191 | 12.0 | 9912 | 0.2411 | 4063352 |
| 0.3663 | 13.0 | 10738 | 0.2461 | 4402032 |
| 0.187 | 14.0 | 11564 | 0.2440 | 4740464 |
| 0.2954 | 15.0 | 12390 | 0.2439 | 5079384 |
| 0.1936 | 16.0 | 13216 | 0.2397 | 5418192 |
| 0.1516 | 17.0 | 14042 | 0.2417 | 5757208 |
| 0.1917 | 18.0 | 14868 | 0.2424 | 6095648 |
| 0.3371 | 19.0 | 15694 | 0.2415 | 6434448 |
| 0.2294 | 20.0 | 16520 | 0.2415 | 6773216 |
### Framework versions

- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
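A quick way to check that a local environment matches these pins, assuming each package is importable under its usual module name:

```python
# Sanity check of installed versions against those listed above; each of
# these modules exposes a standard __version__ attribute.
expected = {
    "peft": "0.17.1",
    "transformers": "4.51.3",
    "torch": "2.9.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}
for name, want in expected.items():
    got = __import__(name).__version__
    status = "OK" if got == want else f"mismatch (found {got})"
    print(f"{name}: {status}")
```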