train_qnli_789_1760637975

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

Loss: 0.4413
Num Input Tokens Seen: 207033472

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 4
eval_batch_size: 4
seed: 789
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08, no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
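
The learning-rate schedule implied by these settings can be sketched in plain Python. This is a minimal sketch, not the trainer's own code: it assumes the total of 471340 optimizer steps reported in the last row of the training results, a warmup length equal to the warmup ratio times that total, and the common linear-warmup-then-half-cosine shape. The helper name lr_at is hypothetical, and the trainer's exact rounding of the warmup length may differ by a step.

```python
import math

PEAK_LR = 0.001        # learning_rate from the card
TOTAL_STEPS = 471340   # last step in the training results table (20 epochs)
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the card

# Assumption: warmup length is the ratio applied to the total step count.
# With 23567 steps per epoch this works out to the first two epochs.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Half-cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the learning rate at the end of every fifth epoch.
    steps_per_epoch = TOTAL_STEPS // 20
    for epoch in (5, 10, 15, 20):
        print(epoch, lr_at(epoch * steps_per_epoch))
```

Under these assumptions the learning rate peaks at step 47134, the end of epoch 2, and decays to zero by the final step.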

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.0551	1.0	23567	0.0477	10356768
0.0111	2.0	47134	0.0422	20704704
0.04	3.0	70701	0.0401	31058016
0.013	4.0	94268	0.0380	41403776
0.0577	5.0	117835	0.0366	51759136
0.1092	6.0	141402	0.0370	62112992
0.0163	7.0	164969	0.0374	72463360
0.0036	8.0	188536	0.0378	82820352
0.0397	9.0	212103	0.0407	93170016
0.072	10.0	235670	0.0402	103520672
0.0201	11.0	259237	0.0460	113880864
0.0032	12.0	282804	0.0500	124232512
0.0032	13.0	306371	0.0522	134580160
0.0022	14.0	329938	0.0634	144930784
0.0006	15.0	353505	0.0658	155287840
0.0028	16.0	377072	0.0737	165642176
0.0037	17.0	400639	0.0839	175984352
0.0004	18.0	424206	0.0933	186335840
0.0005	19.0	447773	0.0964	196686912
0.0555	20.0	471340	0.0982	207033472
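
One way to read the table is to locate the epoch with the lowest validation loss and compare it with the final epoch. The sketch below copies the Epoch and Validation Loss columns by hand; the variable names are illustrative only. A validation loss that rises after its minimum may indicate overfitting, though the card does not say which checkpoint was kept.

```python
# (epoch, validation loss) pairs copied from the training results table.
VAL_LOSS = [
    (1, 0.0477), (2, 0.0422), (3, 0.0401), (4, 0.0380), (5, 0.0366),
    (6, 0.0370), (7, 0.0374), (8, 0.0378), (9, 0.0407), (10, 0.0402),
    (11, 0.0460), (12, 0.0500), (13, 0.0522), (14, 0.0634), (15, 0.0658),
    (16, 0.0737), (17, 0.0839), (18, 0.0933), (19, 0.0964), (20, 0.0982),
]

# Epoch with the lowest validation loss, and the last epoch for comparison.
best_epoch, best_loss = min(VAL_LOSS, key=lambda row: row[1])
final_epoch, final_loss = VAL_LOSS[-1]

print(f"best:  epoch {best_epoch}, validation loss {best_loss:.4f}")
print(f"final: epoch {final_epoch}, validation loss {final_loss:.4f} "
      f"({final_loss / best_loss:.2f}x the best)")
```

On these numbers the minimum is 0.0366 at epoch 5, and the final validation loss of 0.0982 is roughly 2.7 times higher.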

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
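
Since PEFT is among the listed frameworks, the weights are presumably loaded as an adapter on top of the base model. The following is a minimal sketch under that assumption, using the repository id rbelanec/train_qnli_789_1760637975 and the base model named in this card. The helper name is hypothetical, the card does not state the adapter type or the prompt format used for QNLI, and access to the base model may require accepting its license on the Hub.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_qnli_789_1760637975"


def load_model_with_adapter(base_id: str = BASE_MODEL_ID,
                            adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the PEFT adapter (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)

    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling load_model_with_adapter() downloads both the base model and the adapter, so it needs network access and enough memory for an 8B-parameter model.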

Model tree for rbelanec/train_qnli_789_1760637975

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model