# train_qnli_42_1760637635

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the qnli dataset. It achieves the following results on the evaluation set:
- Loss: 0.0410
- Num Input Tokens Seen: 207226464
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
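
As a reproducibility aid, here is a minimal sketch of how these hyperparameters map onto a `transformers.TrainingArguments` object. Only the hyperparameter values come from this card; the output directory, and the assumption that the `Trainer` stack was used at all, are placeholders.

```python
# A hedged sketch: maps the listed hyperparameters onto TrainingArguments.
# output_dir is a placeholder; the actual training script is not known.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_qnli_42_1760637635",  # placeholder output path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```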
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.1162 | 1.0 | 23567 | 0.0688 | 10362048 |
| 0.0838 | 2.0 | 47134 | 0.0556 | 20723232 |
| 0.0663 | 3.0 | 70701 | 0.0491 | 31087712 |
| 0.0264 | 4.0 | 94268 | 0.0471 | 41448512 |
| 0.0811 | 5.0 | 117835 | 0.0440 | 51808576 |
| 0.0803 | 6.0 | 141402 | 0.0429 | 62164384 |
| 0.0482 | 7.0 | 164969 | 0.0425 | 72528320 |
| 0.0487 | 8.0 | 188536 | 0.0415 | 82892896 |
| 0.0224 | 9.0 | 212103 | 0.0413 | 93260448 |
| 0.0080 | 10.0 | 235670 | 0.0414 | 103622208 |
| 0.0275 | 11.0 | 259237 | 0.0410 | 113983328 |
| 0.0068 | 12.0 | 282804 | 0.0411 | 124345088 |
| 0.0452 | 13.0 | 306371 | 0.0411 | 134702688 |
| 0.0603 | 14.0 | 329938 | 0.0419 | 145065792 |
| 0.0085 | 15.0 | 353505 | 0.0415 | 155429344 |
| 0.1323 | 16.0 | 377072 | 0.0418 | 165793952 |
| 0.0085 | 17.0 | 400639 | 0.0424 | 176154208 |
| 0.0279 | 18.0 | 424206 | 0.0419 | 186506176 |
| 0.0564 | 19.0 | 447773 | 0.0422 | 196865248 |
| 0.0627 | 20.0 | 471340 | 0.0421 | 207226464 |
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- Pytorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
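
Since the checkpoint ships as a PEFT adapter (note PEFT 0.17.1 above), it is loaded on top of the base model rather than as standalone weights. A minimal loading sketch, assuming a LoRA-style adapter plus bfloat16 weights and automatic device placement (all assumptions, not stated on this card):

```python
# A minimal sketch, assuming the repo holds a PEFT (e.g. LoRA) adapter
# for the base model; dtype and device placement are assumptions.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_qnli_42_1760637635"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```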