# train_qnli_42_1760637632
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:
- Loss: 0.5953
- Num Input Tokens Seen: 207226464
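The framework versions below list PEFT, so this repository presumably hosts a PEFT adapter rather than merged full weights. A minimal loading sketch, assuming the adapter is published as rbelanec/train_qnli_42_1760637632 and that you have access to the gated Llama 3 base weights:

```python
# Minimal sketch: apply this PEFT adapter on top of the base model.
# Assumption: the repo hosts an adapter, not merged full weights.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_qnli_42_1760637632"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption; training precision is not documented
    device_map="auto",           # requires accelerate
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```

The prompt template used to cast QNLI question/sentence pairs into causal-LM form is not documented in this card, so inference-time formatting has to be inferred from the training script.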
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):
- learning_rate: 0.001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
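As referenced above, here is a hedged sketch of how these values map onto transformers TrainingArguments. The actual training script, gradient accumulation, precision settings, and the PEFT/LoRA configuration are not documented in this card, so anything beyond the listed values is an assumption:

```python
# Sketch only: the listed hyperparameters expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_qnli_42_1760637632",  # assumed output directory
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```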
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.1185 | 1.0 | 23567 | 0.0459 | 10362048 |
| 0.0601 | 2.0 | 47134 | 0.0417 | 20723232 |
| 0.0585 | 3.0 | 70701 | 0.0388 | 31087712 |
| 0.0254 | 4.0 | 94268 | 0.0391 | 41448512 |
| 0.0693 | 5.0 | 117835 | 0.0396 | 51808576 |
| 0.086 | 6.0 | 141402 | 0.0377 | 62164384 |
| 0.0488 | 7.0 | 164969 | 0.0382 | 72528320 |
| 0.0234 | 8.0 | 188536 | 0.0382 | 82892896 |
| 0.0076 | 9.0 | 212103 | 0.0406 | 93260448 |
| 0.0057 | 10.0 | 235670 | 0.0401 | 103622208 |
| 0.0103 | 11.0 | 259237 | 0.0426 | 113983328 |
| 0.0284 | 12.0 | 282804 | 0.0462 | 124345088 |
| 0.0095 | 13.0 | 306371 | 0.0479 | 134702688 |
| 0.0028 | 14.0 | 329938 | 0.0514 | 145065792 |
| 0.0081 | 15.0 | 353505 | 0.0545 | 155429344 |
| 0.0849 | 16.0 | 377072 | 0.0630 | 165793952 |
| 0.0121 | 17.0 | 400639 | 0.0662 | 176154208 |
| 0.0033 | 18.0 | 424206 | 0.0705 | 186506176 |
| 0.0017 | 19.0 | 447773 | 0.0726 | 196865248 |
| 0.0012 | 20.0 | 471340 | 0.0735 | 207226464 |
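Validation loss bottoms out at epoch 6 (0.0377) and rises steadily through epoch 20 (0.0735) while training loss keeps shrinking, a typical overfitting curve. For a rerun, transformers' EarlyStoppingCallback can halt training once validation loss stops improving; a sketch assuming per-epoch evaluation and saving (not part of the original run):

```python
# Sketch: keep the best checkpoint by validation loss and stop early.
# These settings were not used in the run documented above.
from transformers import EarlyStoppingCallback, TrainingArguments

training_args = TrainingArguments(
    output_dir="train_qnli_42_1760637632",
    eval_strategy="epoch",             # evaluate at every epoch boundary
    save_strategy="epoch",             # must match for load_best_model_at_end
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,           # lower eval_loss is better
)

# Pass to Trainer(..., callbacks=[early_stop]) alongside the args above.
early_stop = EarlyStoppingCallback(early_stopping_patience=3)
```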
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- Pytorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4