train_copa_1757340279

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the COPA (Choice of Plausible Alternatives) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0109
  • Num Input Tokens Seen: 281312
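
Because this checkpoint is a PEFT adapter (see Framework versions below), it must be loaded on top of the base model. A minimal usage sketch, assuming the adapter is published as rbelanec/train_copa_1757340279:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_copa_1757340279"

tokenizer = AutoTokenizer.from_pretrained(base_id)
# device_map="auto" requires the accelerate package
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```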

Model description

This checkpoint is a parameter-efficient (PEFT) adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on COPA, a commonsense causal-reasoning benchmark in which the model must pick the more plausible cause or effect of a given premise.

Intended uses & limitations

The adapter is intended for COPA-style plausibility selection: given a premise and two candidate alternatives, choose the one that is the more plausible cause or effect. It has only been evaluated on the COPA evaluation split reported below, so behavior on other tasks or domains is untested.
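
One hedged way to apply a causal LM to COPA is to compare the log-likelihoods of the two candidate continuations. A sketch using the model and tokenizer loaded above (the exact prompt format used during training is not documented in this card, so the formatting here is illustrative):

```python
import torch

def choice_logprob(model, tokenizer, premise, choice):
    """Sum of token log-probabilities of `choice` given `premise`."""
    prompt_ids = tokenizer(premise, return_tensors="pt").input_ids
    full_ids = tokenizer(premise + " " + choice, return_tensors="pt").input_ids
    full_ids = full_ids.to(model.device)
    with torch.no_grad():
        logits = model(full_ids).logits
    # logits at position i predict token i+1
    logprobs = logits[:, :-1, :].log_softmax(dim=-1)
    targets = full_ids[:, 1:]
    token_lp = logprobs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    # keep only the choice tokens (assumes the premise tokenization is a
    # prefix of the full tokenization, which holds approximately for BPE)
    return token_lp[0, prompt_ids.shape[1] - 1 :].sum().item()

premise = "The man broke his toe. What was the cause?"  # illustrative item
better = max(
    ["He dropped a hammer on his foot.", "He got a hole in his sock."],
    key=lambda c: choice_logprob(model, tokenizer, premise, c),
)
```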

Training and evaluation data

Training and evaluation used the COPA dataset; the card does not specify the exact source or splits beyond the evaluation results below.
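
If the data is the standard SuperGLUE COPA task (an assumption; the card only names "copa"), it can be inspected with the datasets library:

```python
from datasets import load_dataset

# Assumption: the card's "copa" refers to SuperGLUE's COPA config;
# the exact hub path used for training may differ.
copa = load_dataset("super_glue", "copa")
example = copa["validation"][0]
# Each item has a premise, two candidate alternatives, a question type
# ("cause" or "effect"), and a label selecting choice1 (0) or choice2 (1).
print(example["premise"], example["choice1"], example["choice2"],
      example["question"], example["label"])
```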

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
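
For reference, a hedged sketch of how these settings map onto transformers.TrainingArguments, assuming the standard Hugging Face Trainer was used (the actual training script is not included in this card, and the output directory name is illustrative):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_copa_1757340279",
    learning_rate=5e-5,
    per_device_train_batch_size=4,   # card reports train_batch_size: 4
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```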

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|--------------:|------:|-----:|----------------:|------------------:|
| 0.1090        | 0.5   | 45   | 0.0421          | 14144             |
| 0.2402        | 1.0   | 90   | 0.0356          | 28192             |
| 0.0435        | 1.5   | 135  | 0.0161          | 42208             |
| 0.0165        | 2.0   | 180  | 0.0145          | 56256             |
| 0.0101        | 2.5   | 225  | 0.0109          | 70368             |
| 0.0           | 3.0   | 270  | 0.0160          | 84320             |
| 0.0           | 3.5   | 315  | 0.0169          | 98400             |
| 0.0           | 4.0   | 360  | 0.0245          | 112416            |
| 0.0           | 4.5   | 405  | 0.0245          | 126496            |
| 0.0           | 5.0   | 450  | 0.0245          | 140544            |
| 0.0           | 5.5   | 495  | 0.0245          | 154592            |
| 0.0           | 6.0   | 540  | 0.0255          | 168768            |
| 0.0           | 6.5   | 585  | 0.0255          | 182848            |
| 0.0           | 7.0   | 630  | 0.0255          | 196896            |
| 0.0           | 7.5   | 675  | 0.0255          | 210912            |
| 0.0           | 8.0   | 720  | 0.0245          | 225024            |
| 0.0           | 8.5   | 765  | 0.0255          | 239200            |
| 0.0           | 9.0   | 810  | 0.0255          | 253152            |
| 0.0           | 9.5   | 855  | 0.0265          | 267040            |
| 0.0           | 10.0  | 900  | 0.0245          | 281312            |

The headline evaluation loss (0.0109) is the best validation loss, reached at step 225 (epoch 2.5), not the final one, suggesting the reported result corresponds to the best checkpoint. From epoch 3 onward the training loss sits at 0.0 while validation loss drifts slightly upward, consistent with the adapter memorizing the training split.

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1