# train_winogrande_456_1760637842
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the winogrande dataset. It achieves the following results on the evaluation set:
- Loss: 0.2314
- Num Input Tokens Seen: 38395408
## Model description
More information needed
## Intended uses & limitations
More information needed
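Pending fuller documentation, here is a minimal usage sketch. It assumes this repo hosts a PEFT adapter on top of the base model (consistent with the PEFT version listed under framework versions) and uses the Hub id `rbelanec/train_winogrande_456_1760637842`; the prompt format used during fine-tuning is not documented, so the example prompt is illustrative only.

```python
# Minimal usage sketch -- assumes this repo is a PEFT adapter for
# meta-llama/Meta-Llama-3-8B-Instruct; the prompt is illustrative only.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "rbelanec/train_winogrande_456_1760637842"
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id, torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

prompt = "The trophy doesn't fit in the suitcase because it is too small. What is too small?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=16)
# Decode only the newly generated tokens
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```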
## Training and evaluation data
More information needed
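In the absence of that information, a hedged sketch of loading WinoGrande from the Hub follows; the exact configuration and preprocessing used for this run are not stated in this card, so `winogrande_xl` is an assumption.

```python
# Hedged sketch: load WinoGrande from the Hub. The config used for this
# run is undocumented; "winogrande_xl" is an assumption.
from datasets import load_dataset

ds = load_dataset("allenai/winogrande", "winogrande_xl")
print(ds)               # train / validation / test splits
print(ds["train"][0])   # fields: sentence, option1, option2, answer
```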
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 0.03
- train_batch_size: 4
- eval_batch_size: 4
- seed: 456
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
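For reference, these values map onto transformers' `TrainingArguments` roughly as shown below; `output_dir` and any PEFT-specific setup are assumptions, since the original training script is not included in this card. The unusually high learning rate of 0.03 is common for prompt-tuning-style PEFT methods rather than full fine-tuning.

```python
# Reproduction sketch of the listed hyperparameters. output_dir and any
# PEFT-specific configuration are assumptions, not taken from this card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_winogrande_456_1760637842",  # assumed
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```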
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.2324 | 1.0 | 9090 | 0.2314 | 1919808 |
| 0.2335 | 2.0 | 18180 | 0.2314 | 3839104 |
| 0.2989 | 3.0 | 27270 | 0.2438 | 5758016 |
| 0.2355 | 4.0 | 36360 | 0.2437 | 7678560 |
| 0.2304 | 5.0 | 45450 | 0.2337 | 9598912 |
| 0.2293 | 6.0 | 54540 | 0.2326 | 11518656 |
| 0.2203 | 7.0 | 63630 | 0.2439 | 13438320 |
| 0.2370 | 8.0 | 72720 | 0.2334 | 15358064 |
| 0.2302 | 9.0 | 81810 | 0.2354 | 17278064 |
| 0.2365 | 10.0 | 90900 | 0.2326 | 19196144 |
| 0.2323 | 11.0 | 99990 | 0.2324 | 21117200 |
| 0.2366 | 12.0 | 109080 | 0.2323 | 23037584 |
| 0.2343 | 13.0 | 118170 | 0.2322 | 24956720 |
| 0.2331 | 14.0 | 127260 | 0.2322 | 26875344 |
| 0.2353 | 15.0 | 136350 | 0.2321 | 28793344 |
| 0.2280 | 16.0 | 145440 | 0.2324 | 30713568 |
| 0.2312 | 17.0 | 154530 | 0.2321 | 32635088 |
| 0.2301 | 18.0 | 163620 | 0.2321 | 34555376 |
| 0.2299 | 19.0 | 172710 | 0.2323 | 36474544 |
| 0.2322 | 20.0 | 181800 | 0.2322 | 38395408 |
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- Pytorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4