train_qnli_123_1760637747

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0373
  • Num Input Tokens Seen: 207208704

Model description

More information needed

Intended uses & limitations

More information needed
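
The card does not document an inference recipe. Below is a minimal loading sketch, assuming the adapter weights are published as rbelanec/train_qnli_123_1760637747 and can be attached to the base model with PEFT; the dtype and device placement are illustrative choices, not documented settings.

```python
# Minimal sketch: attach this PEFT adapter to its base model.
# torch_dtype and device_map are assumptions, not documented in the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_qnli_123_1760637747"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```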

Training and evaluation data

More information needed
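
Preprocessing details are not documented. If qnli refers to the GLUE QNLI subset on the Hugging Face Hub (an assumption), the splits can be loaded as follows:

```python
# Assumption: "qnli" is the QNLI subset of GLUE on the Hugging Face Hub.
from datasets import load_dataset

qnli = load_dataset("glue", "qnli")
print(qnli)                    # train / validation / test splits
print(qnli["validation"][0])   # fields: question, sentence, label, idx
```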

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
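
As a hedged reconstruction, the settings above map onto transformers TrainingArguments roughly as follows; output_dir is a placeholder, and anything not listed in the card (logging, saving, precision) is left at its default.

```python
# Sketch only: the hyperparameters above expressed as TrainingArguments.
# output_dir is a placeholder; undocumented options are left at defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_qnli_123_1760637747",  # placeholder run name
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```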

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|---------------|-------|--------|-----------------|-------------------|
| 0.0291        | 1.0   | 23567  | 0.0420          | 10365216          |
| 0.1374        | 2.0   | 47134  | 0.0393          | 20725024          |
| 0.0229        | 3.0   | 70701  | 0.0442          | 31080960          |
| 0.0314        | 4.0   | 94268  | 0.0377          | 41439424          |
| 0.0242        | 5.0   | 117835 | 0.0379          | 51801184          |
| 0.0929        | 6.0   | 141402 | 0.0405          | 62164704          |
| 0.0242        | 7.0   | 164969 | 0.0374          | 72529184          |
| 0.0381        | 8.0   | 188536 | 0.0373          | 82884480          |
| 0.0195        | 9.0   | 212103 | 0.0374          | 93243840          |
| 0.0154        | 10.0  | 235670 | 0.0376          | 103607072         |
| 0.0237        | 11.0  | 259237 | 0.0379          | 113965760         |
| 0.021         | 12.0  | 282804 | 0.0394          | 124331968         |
| 0.041         | 13.0  | 306371 | 0.0392          | 134696864         |
| 0.0374        | 14.0  | 329938 | 0.0397          | 145056992         |
| 0.0052        | 15.0  | 353505 | 0.0396          | 155415232         |
| 0.0184        | 16.0  | 377072 | 0.0398          | 165776960         |
| 0.0206        | 17.0  | 400639 | 0.0396          | 176136864         |
| 0.0135        | 18.0  | 424206 | 0.0396          | 186488608         |
| 0.0109        | 19.0  | 447773 | 0.0396          | 196849632         |
| 0.0416        | 20.0  | 471340 | 0.0396          | 207208704         |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
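
A quick way to confirm a local environment matches these pins:

```python
# Print installed versions to compare against the list above.
import datasets, peft, tokenizers, torch, transformers

for mod in (peft, transformers, torch, datasets, tokenizers):
    print(mod.__name__, mod.__version__)
```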

Model tree for rbelanec/train_qnli_123_1760637747

This repository is a PEFT adapter trained on top of meta-llama/Meta-Llama-3-8B-Instruct.