train_qnli_456_1760637864

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

Loss: 0.1841
Num Input Tokens Seen: 207225024

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 456
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
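
As a rough illustration of how a cosine schedule with a 0.1 warmup ratio shapes the learning rate, the sketch below recomputes it in plain Python. It assumes linear warmup followed by a single half-cosine decay to zero, and it takes the total step count (471340) from the final row of the training results table. The function name `cosine_lr` and the constants are hypothetical helpers for this example, not part of the original training code.

```python
import math

BASE_LR = 5e-05        # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the hyperparameter list
TOTAL_STEPS = 471340   # final step in the training results table


def cosine_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`, assuming linear warmup then half-cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the assumed learning rate at a few points across training.
    for step in (0, 23567, 47134, 235670, 471340):
        print(step, f"{cosine_lr(step):.3e}")
```

Under these assumptions the learning rate peaks at 5e-05 around the end of epoch 2 and decays towards zero by the last step, which may help explain why the validation loss barely changes in the final epochs.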

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.0837	1.0	23567	0.1946	10354304
0.2725	2.0	47134	0.1870	20707072
0.1215	3.0	70701	0.1845	31068416
0.1873	4.0	94268	0.1853	41429120
0.3118	5.0	117835	0.1860	51792992
0.3401	6.0	141402	0.1851	62154656
0.2494	7.0	164969	0.1841	72517024
0.0454	8.0	188536	0.1853	82880000
0.3943	9.0	212103	0.1861	93239936
0.0949	10.0	235670	0.1853	103606752
0.4075	11.0	259237	0.1851	113970336
0.1302	12.0	282804	0.1859	124330144
0.1467	13.0	306371	0.1849	134690080
0.4289	14.0	329938	0.1857	145051648
0.0465	15.0	353505	0.1857	155411232
0.0843	16.0	377072	0.1857	165771456
0.0639	17.0	400639	0.1857	176136224
0.0955	18.0	424206	0.1857	186502496
0.2112	19.0	447773	0.1857	196862944
0.1812	20.0	471340	0.1857	207225024
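
To make the table easier to read, the sketch below picks out the epoch with the lowest validation loss, using the values copied from the Validation Loss column. The names `VALIDATION_LOSS` and `best_epoch` are illustrative only.

```python
# Validation loss per epoch, copied from the table above (epochs 1 to 20).
VALIDATION_LOSS = [
    0.1946, 0.1870, 0.1845, 0.1853, 0.1860,
    0.1851, 0.1841, 0.1853, 0.1861, 0.1853,
    0.1851, 0.1859, 0.1849, 0.1857, 0.1857,
    0.1857, 0.1857, 0.1857, 0.1857, 0.1857,
]


def best_epoch(losses):
    """Return (epoch, loss) for the lowest validation loss; epochs are 1-indexed."""
    index = min(range(len(losses)), key=losses.__getitem__)
    return index + 1, losses[index]


if __name__ == "__main__":
    epoch, loss = best_epoch(VALIDATION_LOSS)
    print(f"best epoch: {epoch}, validation loss: {loss:.4f}")
    print(f"final epoch loss: {VALIDATION_LOSS[-1]:.4f}")
```

The minimum falls at epoch 7 (0.1841), which matches the evaluation loss reported at the top of the card, while the final epoch sits at 0.1857. This suggests, without confirming, that the reported figure corresponds to the best checkpoint and not the last one.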

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
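
Since the card lists PEFT among the framework versions, the model is presumably distributed as an adapter on top of the base model. A minimal loading sketch follows, assuming the adapter lives at `rbelanec/train_qnli_456_1760637864`, that it targets a causal language model, and that you have access to the gated base model on the Hub. The function name `load_adapter` is hypothetical, and the prompt format used for QNLI during training is not documented in this card, so none is shown.

```python
def load_adapter(
    adapter_id="rbelanec/train_qnli_456_1760637864",   # assumed adapter repository
    base_id="meta-llama/Meta-Llama-3-8B-Instruct",     # base model named in this card
):
    """Load the base model and attach the PEFT adapter (downloads weights)."""
    # Imported lazily so that defining this function does not require the libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_adapter()` downloads the full 8B-parameter base model, so it is best run on a machine with enough memory and disk space.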

Model tree for rbelanec/train_qnli_456_1760637864

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model