train_qnli_123_1760637752

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

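Loss: 0.0416 (the lowest validation loss in the training results below, reached at epoch 10)
Num Input Tokens Seen: 207208704 (cumulative total after all 20 epochs)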

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 123
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
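
The cosine schedule with a 0.1 warmup ratio can be sketched in plain Python to see what learning rate applies at a given step. This is a minimal sketch, not the training code itself: it assumes linear warmup followed by a single half-cosine decay to zero, and it assumes the warmup length is the warmup ratio times the total step count (471340, the step count at epoch 20 in the training results). The names `lr_at`, `WARMUP_STEPS`, and `TOTAL_STEPS` are illustrative.

```python
import math

# Values taken from the hyperparameter list and the training results table.
BASE_LR = 5e-05
WARMUP_RATIO = 0.1
TOTAL_STEPS = 471340      # step count at epoch 20
STEPS_PER_EPOCH = 23567   # step count at epoch 1

# Assumption: warmup length = warmup ratio * total number of steps.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a step: linear warmup, then half-cosine decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for epoch in (1, 2, 10, 20):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:>2}  step {step:>6}  lr {lr_at(step):.3e}")
```

Under these assumptions the warmup covers 47134 steps, which is exactly the first two epochs, so the peak learning rate of 5e-05 is reached at the end of epoch 2 and decays from there.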

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.0432	1.0	23567	0.0683	10365216
0.1686	2.0	47134	0.0558	20725024
0.0273	3.0	70701	0.0521	31080960
0.0322	4.0	94268	0.0458	41439424
0.0126	5.0	117835	0.0449	51801184
0.0518	6.0	141402	0.0436	62164704
0.0171	7.0	164969	0.0437	72529184
0.0427	8.0	188536	0.0420	82884480
0.0159	9.0	212103	0.0425	93243840
0.0222	10.0	235670	0.0416	103607072
0.0145	11.0	259237	0.0419	113965760
0.0359	12.0	282804	0.0427	124331968
0.0288	13.0	306371	0.0434	134696864
0.0281	14.0	329938	0.0434	145056992
0.0078	15.0	353505	0.0431	155415232
0.0767	16.0	377072	0.0441	165776960
0.0242	17.0	400639	0.0432	176136864
0.0159	18.0	424206	0.0439	186488608
0.0114	19.0	447773	0.0436	196849632
0.049	20.0	471340	0.0436	207208704
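
The table can be summarized with a few lines of Python to locate the best epoch and to get a rough sense of tokens per step. This is a sketch for reading the table, with the validation losses copied from the rows above. The tokens-per-example figure assumes a single device and no gradient accumulation, neither of which is stated in the hyperparameters, so treat it as an estimate.

```python
# (epoch, validation loss) pairs copied from the training results table.
VALIDATION_LOSS = [
    (1, 0.0683), (2, 0.0558), (3, 0.0521), (4, 0.0458), (5, 0.0449),
    (6, 0.0436), (7, 0.0437), (8, 0.0420), (9, 0.0425), (10, 0.0416),
    (11, 0.0419), (12, 0.0427), (13, 0.0434), (14, 0.0434), (15, 0.0431),
    (16, 0.0441), (17, 0.0432), (18, 0.0439), (19, 0.0436), (20, 0.0436),
]

FINAL_STEP = 471340            # step count at epoch 20
FINAL_TOKENS_SEEN = 207208704  # input tokens seen at epoch 20
TRAIN_BATCH_SIZE = 4           # from the hyperparameter list

# Epoch with the lowest validation loss, and the loss at the last epoch.
best_epoch, best_loss = min(VALIDATION_LOSS, key=lambda row: row[1])
final_epoch, final_loss = VALIDATION_LOSS[-1]

# Average input tokens per optimizer step over the whole run.
tokens_per_step = FINAL_TOKENS_SEEN / FINAL_STEP

# Assumption: one device and no gradient accumulation, so each step sees
# exactly TRAIN_BATCH_SIZE examples.
tokens_per_example = tokens_per_step / TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print(f"best epoch: {best_epoch} (validation loss {best_loss})")
    print(f"final epoch: {final_epoch} (validation loss {final_loss})")
    print(f"tokens per step: {tokens_per_step:.1f}")
    print(f"tokens per example (estimate): {tokens_per_example:.1f}")
```

Validation loss bottoms out at epoch 10 and then drifts slightly upward while the training loss stays noisy, which suggests that the second half of the run adds little on this evaluation set.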

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
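
The PEFT entry above suggests the weights are published as an adapter on top of the base model rather than as a full checkpoint. A minimal loading sketch under that assumption follows. The repository ID is the one listed in the model tree, the helper name `load_adapter_model` is hypothetical, and the card does not document a prompt format, so no inference call is shown. Access to the base model may also require accepting its license on the Hub.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_qnli_123_1760637752"


def load_adapter_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the adapter weights.

    Hypothetical helper: assumes the repository holds a PEFT adapter.
    Calling it downloads the 8B base model, which is several gigabytes.
    """
    # Imported lazily so the heavy dependencies are only needed when called.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")

    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_adapter_model()` returns the tokenizer of the base model together with the adapter-wrapped model in evaluation mode.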

Model tree for rbelanec/train_qnli_123_1760637752

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model