# train_qnli_123_1760637750
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:
- Loss: 0.1815
- Num Input Tokens Seen: 207208704
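Since PEFT is listed under the framework versions below, the checkpoint is presumably a parameter-efficient adapter on top of the base model rather than a full set of weights. A minimal loading sketch, assuming the adapter is published under the repository id `rbelanec/train_qnli_123_1760637750` (the name of this card):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_qnli_123_1760637750"  # assumed repo id, taken from this card's name

# Load the frozen base model, then attach the fine-tuned adapter weights.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```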
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 123
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
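For reference, a hedged sketch of how these values map onto `transformers.TrainingArguments`; the exact training script is not part of this card, so treat this as an illustration rather than the command that produced the results (`output_dir` is a placeholder):

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="train_qnli_123_1760637750",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```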
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.2404 | 1.0 | 23567 | 0.1929 | 10365216 |
| 0.3377 | 2.0 | 47134 | 0.1857 | 20725024 |
| 0.0609 | 3.0 | 70701 | 0.1839 | 31080960 |
| 0.0233 | 4.0 | 94268 | 0.1826 | 41439424 |
| 0.2244 | 5.0 | 117835 | 0.1818 | 51801184 |
| 0.3223 | 6.0 | 141402 | 0.1823 | 62164704 |
| 0.0422 | 7.0 | 164969 | 0.1815 | 72529184 |
| 0.4260 | 8.0 | 188536 | 0.1835 | 82884480 |
| 0.2636 | 9.0 | 212103 | 0.1837 | 93243840 |
| 0.2069 | 10.0 | 235670 | 0.1830 | 103607072 |
| 0.2185 | 11.0 | 259237 | 0.1831 | 113965760 |
| 0.0189 | 12.0 | 282804 | 0.1828 | 124331968 |
| 0.2307 | 13.0 | 306371 | 0.1837 | 134696864 |
| 0.2197 | 14.0 | 329938 | 0.1828 | 145056992 |
| 0.0385 | 15.0 | 353505 | 0.1831 | 155415232 |
| 0.2833 | 16.0 | 377072 | 0.1831 | 165776960 |
| 0.2189 | 17.0 | 400639 | 0.1831 | 176136864 |
| 0.0914 | 18.0 | 424206 | 0.1831 | 186488608 |
| 0.0988 | 19.0 | 447773 | 0.1831 | 196849632 |
| 0.1111 | 20.0 | 471340 | 0.1831 | 207208704 |
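The table shows 471340 optimizer steps over 20 epochs; with `lr_scheduler_warmup_ratio: 0.1`, the cosine schedule warms up over roughly the first 47134 steps and then decays. A minimal sketch of the equivalent schedule using the standard `get_cosine_schedule_with_warmup` helper (an illustration, since the training script itself is not included here):

```python
import torch
from transformers import get_cosine_schedule_with_warmup

# Dummy parameter/optimizer, only to instantiate the schedule for illustration.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.AdamW([param], lr=5e-05, betas=(0.9, 0.999), eps=1e-08)

total_steps = 471340                   # final "Step" value in the table above
warmup_steps = int(0.1 * total_steps)  # warmup_ratio 0.1 -> 47134 steps

scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=warmup_steps,
    num_training_steps=total_steps,
)
```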
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4