train_mrpc_42_1760637562

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the MRPC (Microsoft Research Paraphrase Corpus) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4168
  • Num Input Tokens Seen: 6769320
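This repository holds a PEFT adapter rather than full model weights. Below is a minimal loading sketch, assuming the adapter is hosted under the repo id rbelanec/train_mrpc_42_1760637562 and that you have access to the gated base model; the MRPC-style prompt is illustrative, since the prompt template used in training is not documented here.

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Repo id taken from this card; the base model
# (meta-llama/Meta-Llama-3-8B-Instruct) is gated and fetched automatically.
adapter_id = "rbelanec/train_mrpc_42_1760637562"

# AutoPeftModelForCausalLM reads the adapter config, downloads the base
# model it was trained from, and attaches the adapter weights.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, device_map="auto")

# If the adapter repo does not ship a tokenizer, load it from the base model id instead.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Illustrative MRPC-style prompt; the exact template used in training is not documented.
prompt = (
    "Sentence 1: The storm left thousands without power.\n"
    "Sentence 2: Thousands lost electricity because of the storm.\n"
    "Are these sentences paraphrases?"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```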

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a minimal equivalent setup is sketched after the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
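The training script itself is not included in this card. The following is a minimal sketch of an equivalent Hugging Face TrainingArguments configuration built only from the values above; the output_dir is illustrative, and the PEFT/LoRA config and dataset preprocessing are not documented here, so they are omitted.

```python
from transformers import TrainingArguments

# Mirrors only the hyperparameters listed above; everything else
# (LoRA config, data preprocessing) is not documented in this card.
args = TrainingArguments(
    output_dir="train_mrpc_42_1760637562",  # illustrative
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
    eval_strategy="epoch",  # the results table reports one evaluation per epoch
)
```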

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.1409        | 1.0   | 826   | 0.1507          | 337344            |
| 0.1145        | 2.0   | 1652  | 0.1209          | 675368            |
| 0.1094        | 3.0   | 2478  | 0.1225          | 1014008           |
| 0.1364        | 4.0   | 3304  | 0.1022          | 1353224           |
| 0.1092        | 5.0   | 4130  | 0.0886          | 1692320           |
| 0.0769        | 6.0   | 4956  | 0.0901          | 2030488           |
| 0.0402        | 7.0   | 5782  | 0.1248          | 2368032           |
| 0.1207        | 8.0   | 6608  | 0.0971          | 2706552           |
| 0.0485        | 9.0   | 7434  | 0.0971          | 3044984           |
| 0.0887        | 10.0  | 8260  | 0.1101          | 3384288           |
| 0.0091        | 11.0  | 9086  | 0.1175          | 3722552           |
| 0.0319        | 12.0  | 9912  | 0.1249          | 4061216           |
| 0.0066        | 13.0  | 10738 | 0.1735          | 4399064           |
| 0.0147        | 14.0  | 11564 | 0.2274          | 4737648           |
| 0.0008        | 15.0  | 12390 | 0.2575          | 5075144           |
| 0.0002        | 16.0  | 13216 | 0.2804          | 5414032           |
| 0.0002        | 17.0  | 14042 | 0.3063          | 5752528           |
| 0.0002        | 18.0  | 14868 | 0.3093          | 6091472           |
| 0.0002        | 19.0  | 15694 | 0.3139          | 6430144           |
| 0.0002        | 20.0  | 16520 | 0.3151          | 6769320           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
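To reproduce results it helps to match these versions. A small check, assuming the packages are importable under their usual names:

```python
import importlib

# Versions reported in this card; warn if the local environment differs.
expected = {
    "peft": "0.17.1",
    "transformers": "4.51.3",
    "torch": "2.9.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}

for name, want in expected.items():
    have = importlib.import_module(name).__version__
    status = "ok" if have == want else f"MISMATCH (card reports {want})"
    print(f"{name:>12} {have:<14} {status}")
```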