train_qnli_456_1760637865

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

Loss: 0.0426
Num Input Tokens Seen: 207225024

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 456
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
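
The scheduler settings above imply a learning rate that rises linearly during the first 10% of training and then decays along a half cosine. The sketch below reproduces that shape in plain Python. It assumes the commonly used warmup-plus-cosine formula and takes the total step count (471340) from the last row of the training results; the function name `lr_at` is illustrative and not part of any library.

```python
import math

BASE_LR = 5e-5          # learning_rate from the list above
WARMUP_RATIO = 0.1      # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 471_340   # final step in the training results table


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup_steps = round(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp from 0 up to BASE_LR over the warmup phase.
        return BASE_LR * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    # Start, mid-warmup, end of warmup, mid-training, and the final step.
    for step in (0, 23_567, 47_134, 259_237, 471_340):
        print(f"step {step:>7}: lr = {lr_at(step):.3e}")
```

Under these assumptions the warmup ends at step 47134, which coincides with the end of epoch 2, and the learning rate reaches zero at the final step.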

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.0576	1.0	23567	0.0697	10354304
0.0747	2.0	47134	0.0561	20707072
0.0424	3.0	70701	0.0504	31068416
0.0742	4.0	94268	0.0481	41429120
0.0215	5.0	117835	0.0453	51792992
0.0524	6.0	141402	0.0453	62154656
0.008	7.0	164969	0.0442	72517024
0.011	8.0	188536	0.0434	82880000
0.0332	9.0	212103	0.0426	93239936
0.0363	10.0	235670	0.0427	103606752
0.0564	11.0	259237	0.0428	113970336
0.0182	12.0	282804	0.0432	124330144
0.0146	13.0	306371	0.0426	134690080
0.0559	14.0	329938	0.0433	145051648
0.0092	15.0	353505	0.0435	155411232
0.0075	16.0	377072	0.0430	165771456
0.02	17.0	400639	0.0439	176136224
0.0253	18.0	424206	0.0435	186502496
0.0594	19.0	447773	0.0436	196862944
0.0157	20.0	471340	0.0438	207225024
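
To make the table easier to read at a glance, the following sketch locates the lowest validation loss and derives the average number of input tokens per step. The values are copied from the table above; the variable names are illustrative, and the per-step figure is only an average over the whole run.

```python
# (epoch, validation_loss, input_tokens_seen), copied from the table above.
RESULTS = [
    (1, 0.0697, 10354304), (2, 0.0561, 20707072), (3, 0.0504, 31068416),
    (4, 0.0481, 41429120), (5, 0.0453, 51792992), (6, 0.0453, 62154656),
    (7, 0.0442, 72517024), (8, 0.0434, 82880000), (9, 0.0426, 93239936),
    (10, 0.0427, 103606752), (11, 0.0428, 113970336), (12, 0.0432, 124330144),
    (13, 0.0426, 134690080), (14, 0.0433, 145051648), (15, 0.0435, 155411232),
    (16, 0.0430, 165771456), (17, 0.0439, 176136224), (18, 0.0435, 186502496),
    (19, 0.0436, 196862944), (20, 0.0438, 207225024),
]
FINAL_STEP = 471_340  # step count in the last row of the table

# min() returns the first row with the smallest validation loss.
best_epoch, best_loss, _ = min(RESULTS, key=lambda row: row[1])

# Every epoch that ties for the lowest validation loss.
tied_epochs = [epoch for epoch, loss, _ in RESULTS if loss == best_loss]

# Average number of input tokens consumed per training step.
tokens_per_step = RESULTS[-1][2] / FINAL_STEP

if __name__ == "__main__":
    print(f"lowest validation loss: {best_loss} (epochs {tied_epochs})")
    print(f"average tokens per step: {tokens_per_step:.1f}")
```

The lowest validation loss, 0.0426, matches the evaluation loss reported at the top of this card and is first reached at epoch 9; later epochs stay within about 0.0013 of it.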

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
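
Since PEFT appears in the framework list, this checkpoint is presumably an adapter that must be loaded on top of the base model. A minimal loading sketch follows. It assumes the adapter is published under the repository id shown in the model tree heading (rbelanec/train_qnli_456_1760637865), that you have been granted access to the gated base model, and that bfloat16 is an acceptable dtype; none of these are confirmed by this card. The helper name `load_model` is hypothetical.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_qnli_456_1760637865"  # assumed repository id


def load_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the fine-tuned adapter weights."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # bfloat16 is an assumption; the card does not state the training dtype.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    # PeftModel.from_pretrained reads the adapter config, so the adapter
    # type does not need to be specified here.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_model()
    print(type(model).__name__)
```

The prompt template used during fine-tuning is not documented in this card, so no inference example is given; inputs would need to follow whatever format the training run used.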

Model tree for rbelanec/train_qnli_456_1760637865

Base model

meta-llama/Meta-Llama-3-8B-Instruct
