train_qnli_1755619685

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

Loss: 0.1737
Num Input Tokens Seen: 94426336

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 123
distributed_type: multi-GPU
num_devices: 2
total_train_batch_size: 4
total_eval_batch_size: 4
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10.0
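
The derived entries in this list follow from the per-device settings: total_train_batch_size is train_batch_size times num_devices, and lr_scheduler_warmup_ratio fixes the warmup length once the total number of optimizer steps is known. The sketch below reproduces that arithmetic in plain Python, together with a linear-warmup cosine schedule. It rests on assumptions: the steps-per-epoch figure (23568) is read from the epoch 1.0 row of the training results, `lr_at` is a hypothetical helper, and the formula is the usual warmup-plus-cosine shape, which may differ in small details from what the trainer actually computed.

```python
import math

# Values copied from the hyperparameter list
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 2       # per device
NUM_DEVICES = 2
WARMUP_RATIO = 0.1
NUM_EPOCHS = 10

# Assumption: step count reported at epoch 1.0 in the training results
STEPS_PER_EPOCH = 23568

total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = round(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate
        return LEARNING_RATE * step / warmup_steps
    # Cosine decay from the peak down to 0 over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    print("warmup_steps:", warmup_steps)
    for step in (0, warmup_steps, total_steps // 2, total_steps):
        print(step, lr_at(step))
```

Under these assumptions the warmup covers roughly the first epoch, so the peak learning rate of 5e-05 is reached around step 23568.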

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.1443	0.5000	11784	0.1060	4726400
0.0196	1.0000	23568	0.0572	9444272
0.1075	1.5001	35352	0.0433	14172240
0.011	2.0001	47136	0.0372	18886368
0.0451	2.5001	58920	0.0397	23595456
0.0107	3.0001	70704	0.0555	28323552
0.0558	3.5001	82488	0.0385	33045520
0.0713	4.0002	94272	0.0360	37766912
0.0149	4.5002	106056	0.0393	42486256
0.0617	5.0002	117840	0.0394	47210640
0.0039	5.5002	129624	0.0463	51929904
0.0071	6.0003	141408	0.0468	56656208
0.0586	6.5003	153192	0.0671	61382064
0.0014	7.0003	164976	0.0653	66103104
0.0001	7.5003	176760	0.0901	70824800
0.0003	8.0003	188544	0.0864	75545952
0.0	8.5004	200328	0.1408	80268160
0.0	9.0004	212112	0.1439	84990112
0.0	9.5004	223896	0.1708	89707216
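
One way to read the table is to look for the evaluation point with the lowest validation loss. The short sketch below does that with the Epoch, Step, and Validation Loss columns copied from the rows above; the variable names are illustrative only. In these rows the validation loss bottoms out near epoch 4 and then rises while the training loss approaches zero, which is consistent with overfitting in the later epochs, although the card itself does not say which checkpoint was kept.

```python
# (epoch, step, validation_loss) copied from the training results table
ROWS = [
    (0.5000, 11784, 0.1060),
    (1.0000, 23568, 0.0572),
    (1.5001, 35352, 0.0433),
    (2.0001, 47136, 0.0372),
    (2.5001, 58920, 0.0397),
    (3.0001, 70704, 0.0555),
    (3.5001, 82488, 0.0385),
    (4.0002, 94272, 0.0360),
    (4.5002, 106056, 0.0393),
    (5.0002, 117840, 0.0394),
    (5.5002, 129624, 0.0463),
    (6.0003, 141408, 0.0468),
    (6.5003, 153192, 0.0671),
    (7.0003, 164976, 0.0653),
    (7.5003, 176760, 0.0901),
    (8.0003, 188544, 0.0864),
    (8.5004, 200328, 0.1408),
    (9.0004, 212112, 0.1439),
    (9.5004, 223896, 0.1708),
]

# Row with the smallest validation loss
best_epoch, best_step, best_val = min(ROWS, key=lambda row: row[2])

# Last logged evaluation, for comparison
final_epoch, final_step, final_val = ROWS[-1]

if __name__ == "__main__":
    print(f"lowest validation loss {best_val:.4f} at step {best_step} (epoch {best_epoch:.1f})")
    print(f"last logged validation loss {final_val:.4f} at step {final_step} (epoch {final_epoch:.1f})")
```

The evaluation loss reported at the top of the card (0.1737) is close to the last logged row (0.1708) rather than to the minimum (0.0360).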

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1
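
Since PEFT is listed among the framework versions, the weights are presumably distributed as an adapter to be applied on top of the base model. The following is a minimal loading sketch under several assumptions: the adapter repository id is taken from this page's model tree, the base model is loaded as a causal language model, and access to the gated base model has already been granted. `load_adapter_model` is a hypothetical helper name. The card does not document the prompt format used for QNLI, so no inference prompt is shown.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
# Assumption: adapter repo id as shown in this page's model tree
ADAPTER_ID = "rbelanec/train_qnli_1755619685"


def load_adapter_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the PEFT adapter (hypothetical helper)."""
    # Imported here so the heavy dependencies are only needed when loading
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the fine-tuned adapter weights
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model


if __name__ == "__main__":
    # Downloads several GB of weights; requires access to the base model
    tokenizer, model = load_adapter_model()
    print(type(model).__name__)
```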

Model tree for rbelanec/train_qnli_1755619685

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model