train_boolq_42_1760776552

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the boolq dataset. It achieves the following results on the evaluation set:

Loss: 0.1341
Num Input Tokens Seen: 42773120

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
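
The cosine schedule with a 0.1 warmup ratio can be sketched in plain Python to show roughly what learning rate was in effect at a given step. This is a minimal sketch, not the training code: it assumes linear warmup from zero followed by cosine decay to zero, takes the total step count (42420) from the last row of the training results, and assumes the warmup length is the ratio times that total. The names `lr_at`, `PEAK_LR`, `WARMUP_STEPS` and `TOTAL_STEPS` are illustrative.

```python
import math

PEAK_LR = 5e-05       # learning_rate from the list above
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 42420   # final step reported in the training results

# Assumption: warmup length is the warmup ratio times the total step count.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 2121, WARMUP_STEPS, 21210, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {lr_at(step):.3e}")
```

Under these assumptions the warmup ends at step 4242, which coincides with the end of epoch 2 in the training results.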

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.2755	1.0	2121	0.1668	2135488
0.1282	2.0	4242	0.1450	4271424
0.0642	3.0	6363	0.1341	6407520
0.4773	4.0	8484	0.1390	8553728
0.0329	5.0	10605	0.1406	10692704
0.0232	6.0	12726	0.1471	12829472
0.0106	7.0	14847	0.1628	14967104
0.1007	8.0	16968	0.1669	17105760
0.0025	9.0	19089	0.2028	19246048
0.0328	10.0	21210	0.2104	21382880
0.0022	11.0	23331	0.2429	23522528
0.0044	12.0	25452	0.2581	25662176
0.0016	13.0	27573	0.2896	27797760
0.0016	14.0	29694	0.3251	29933184
0.0006	15.0	31815	0.3419	32075552
0.0006	16.0	33936	0.3597	34216384
0.0009	17.0	36057	0.3611	36358080
0.0005	18.0	38178	0.3675	38498720
0.0946	19.0	40299	0.3681	40635680
0.0015	20.0	42420	0.3714	42773120
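
The validation losses above can be scanned for the best epoch with a few lines of Python. This is a small sketch over values copied from the table; the names `VAL_LOSS` and `best_epoch` are illustrative and not part of the training code.

```python
# (epoch, validation_loss) pairs copied from the training results table.
VAL_LOSS = [
    (1, 0.1668), (2, 0.1450), (3, 0.1341), (4, 0.1390), (5, 0.1406),
    (6, 0.1471), (7, 0.1628), (8, 0.1669), (9, 0.2028), (10, 0.2104),
    (11, 0.2429), (12, 0.2581), (13, 0.2896), (14, 0.3251), (15, 0.3419),
    (16, 0.3597), (17, 0.3611), (18, 0.3675), (19, 0.3681), (20, 0.3714),
]


def best_epoch(history):
    """Return the (epoch, loss) pair with the lowest validation loss."""
    return min(history, key=lambda row: row[1])


best = best_epoch(VAL_LOSS)
final = VAL_LOSS[-1]

if __name__ == "__main__":
    print(f"best:  epoch {best[0]}, validation loss {best[1]:.4f}")
    print(f"final: epoch {final[0]}, validation loss {final[1]:.4f}")
```

The minimum falls at epoch 3 (step 6363) with a validation loss of 0.1341, which matches the loss reported at the top of this card. Validation loss then rises at every later epoch, reaching 0.3714 at epoch 20 while the training loss approaches zero, a pattern consistent with overfitting.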

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1
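
Since the card lists PEFT and the model tree marks this repository as an adapter, loading it on top of the base model might look like the sketch below. It is a hedged example rather than a documented recipe. It assumes the repository id shown in the model tree, assumes the adapter is in a format that `PeftModel.from_pretrained` can read, and leaves out prompt formatting because the card does not describe the prompt used during training. The helper name `load_adapter` is hypothetical. Access to the base model on the Hub may need to be granted before the download works.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
# Assumption: repository id as shown in the model tree of this card.
ADAPTER_ID = "rbelanec/train_boolq_42_1760776552"


def load_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the adapter weights on top of it."""
    # Imported lazily so the heavy dependencies are only needed when called.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

Calling `model, tokenizer = load_adapter()` downloads both the base weights and the adapter, so it needs enough memory for an 8B-parameter model.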

Model tree for rbelanec/train_boolq_42_1760776552

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model