train_boolq_42_1760765194

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the boolq dataset. It achieves the following results on the evaluation set:

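Loss: 0.1217 (matches the epoch 1 validation loss, the lowest value in the training results table)
Num Input Tokens Seen: 42773120 (cumulative over all 20 epochs)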
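The card does not say how the checkpoint is meant to be loaded. The following is a minimal sketch, assuming the repository rbelanec/train_boolq_42_1760765194 holds a PEFT adapter for the base model named above (PEFT appears in the framework versions, but the adapter type is not stated) and that you have access to the gated base weights. The function name load_adapter is hypothetical, and no prompt format is shown because the card does not document one.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_boolq_42_1760765194"  # assumed to be a PEFT adapter repo


def load_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Return (tokenizer, model) with the adapter attached to the base model."""
    # Heavy dependencies are imported lazily so this module can be imported
    # and inspected on a machine that does not have them installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # PeftModel.from_pretrained reads the adapter config from the repo,
    # so the adapter type does not need to be known in advance.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling load_adapter() downloads both the 8B base model and the adapter, so it is best run on a machine with enough memory for the base weights.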

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
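To make the scheduler settings concrete, here is a rough sketch of a cosine schedule with linear warmup using the values listed above. It assumes the usual formulation (linear ramp from zero, then a half-cosine decay to zero), that the total step count is the final step in the training results table (42420), and that warmup length is the rounded product of total steps and warmup ratio. The helper lr_at is hypothetical and is not taken from the training code.

```python
import math

BASE_LR = 5e-05       # learning_rate from the card
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 42420   # assumption: final step in the training results table


def lr_at(step: int,
          base_lr: float = BASE_LR,
          total_steps: int = TOTAL_STEPS,
          warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given step under linear warmup plus cosine decay."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 2121, 4242, 21210, 42420):
        print(step, lr_at(step))
```

Under these assumptions the warmup covers the first 4242 steps, which is the first two epochs in the results table, so the learning rate peaks only after the lowest validation loss has already been recorded.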

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.2257	1.0	2121	0.1217	2135488
0.1311	2.0	4242	0.1258	4271424
0.0597	3.0	6363	0.1464	6407520
0.1744	4.0	8484	0.1997	8553728
0.0012	5.0	10605	0.1826	10692704
0.0001	6.0	12726	0.2867	12829472
0.0006	7.0	14847	0.3035	14967104
0.0	8.0	16968	0.3620	17105760
0.0	9.0	19089	0.4334	19246048
0.0	10.0	21210	0.4320	21382880
0.0	11.0	23331	0.4605	23522528
0.0	12.0	25452	0.3978	25662176
0.0	13.0	27573	0.3807	27797760
0.0	14.0	29694	0.4015	29933184
0.0	15.0	31815	0.4407	32075552
0.0	16.0	33936	0.5046	34216384
0.0	17.0	36057	0.5390	36358080
0.0	18.0	38178	0.5702	38498720
0.0	19.0	40299	0.5860	40635680
0.0	20.0	42420	0.5856	42773120
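The table shows training loss falling to 0.0 while validation loss rises, which is the usual signature of overfitting. As a small illustration, the sketch below copies the validation losses from the table and picks the epoch with the lowest value. The helper best_epoch is hypothetical, and selecting by validation loss is only one possible checkpoint criterion since the card reports no accuracy.

```python
# Validation loss per epoch, copied from the training results table.
VAL_LOSS = [
    0.1217, 0.1258, 0.1464, 0.1997, 0.1826,
    0.2867, 0.3035, 0.3620, 0.4334, 0.4320,
    0.4605, 0.3978, 0.3807, 0.4015, 0.4407,
    0.5046, 0.5390, 0.5702, 0.5860, 0.5856,
]


def best_epoch(val_losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return idx + 1, val_losses[idx]


if __name__ == "__main__":
    epoch, loss = best_epoch(VAL_LOSS)
    print(f"best epoch: {epoch}, validation loss: {loss}")
    print(f"final epoch validation loss: {VAL_LOSS[-1]}")
```

On these numbers the first epoch is selected, and the final-epoch validation loss is several times higher, which suggests that most of the 20 epochs did not help on held-out data.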

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1
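When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The following sketch uses only the standard library. The helper version_report is hypothetical, the PyPI distribution name for Pytorch is taken to be torch, and the local build suffix (+cu128) may legitimately differ on other CUDA or CPU builds.

```python
from importlib import metadata

# Versions as listed on the card, keyed by PyPI distribution name.
PINNED = {
    "peft": "0.15.2",
    "transformers": "4.51.3",
    "torch": "2.8.0+cu128",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}


def version_report(pinned=PINNED):
    """Map each package to (wanted, installed or None, exact match?)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in version_report().items():
        status = "ok" if ok else "differs"
        print(f"{name}: wanted {wanted}, installed {installed} ({status})")
```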

Model tree for rbelanec/train_boolq_42_1760765194

Base model: meta-llama/Meta-Llama-3-8B-Instruct (this model is an adapter of it)