# train_copa_101112_1760637987
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the copa dataset. It achieves the following results on the evaluation set:

- Loss: 0.2299
- Num input tokens seen: 562848
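The framework versions below list PEFT, so the published weights are presumably an adapter rather than a full model. Below is a minimal loading sketch, assuming the adapter lives in this repo (`rbelanec/train_copa_101112_1760637987`) and is compatible with `PeftModel.from_pretrained`; the prompt format used during fine-tuning is not documented in this card, so the COPA-style prompt is illustrative only.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_copa_101112_1760637987"  # this repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")

# Attach the fine-tuned PEFT adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Illustrative COPA-style prompt; the actual training prompt format is unknown.
messages = [{
    "role": "user",
    "content": (
        "Premise: The man broke his toe. What was the cause? "
        "Choice 1: He got a hole in his sock. "
        "Choice 2: He dropped a hammer on his foot. "
        "Answer with the number of the more plausible choice."
    ),
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```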
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.03
- train_batch_size: 4
- eval_batch_size: 4
- seed: 101112
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
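As a hedged sketch only, these settings map onto `transformers.TrainingArguments` roughly as below; the actual training script, PEFT configuration, and data preprocessing are not part of this card, so anything not listed above is omitted.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameter list above; the PEFT config, data collator,
# and dataset preprocessing are unknown and therefore not shown.
training_args = TrainingArguments(
    output_dir="train_copa_101112_1760637987",
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```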
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.212 | 1.0 | 90 | 0.2939 | 28192 |
| 0.2497 | 2.0 | 180 | 0.2331 | 56256 |
| 0.233 | 3.0 | 270 | 0.2313 | 84320 |
| 0.2341 | 4.0 | 360 | 0.2359 | 112416 |
| 0.2286 | 5.0 | 450 | 0.2314 | 140544 |
| 0.2325 | 6.0 | 540 | 0.2320 | 168768 |
| 0.2499 | 7.0 | 630 | 0.2315 | 196896 |
| 0.2314 | 8.0 | 720 | 0.2319 | 225024 |
| 0.2294 | 9.0 | 810 | 0.2326 | 253152 |
| 0.2266 | 10.0 | 900 | 0.2324 | 281312 |
| 0.2313 | 11.0 | 990 | 0.2336 | 309280 |
| 0.2324 | 12.0 | 1080 | 0.2319 | 337536 |
| 0.2284 | 13.0 | 1170 | 0.2315 | 365632 |
| 0.2274 | 14.0 | 1260 | 0.2299 | 393632 |
| 0.2273 | 15.0 | 1350 | 0.2310 | 421696 |
| 0.2305 | 16.0 | 1440 | 0.2310 | 449984 |
| 0.2294 | 17.0 | 1530 | 0.2300 | 478016 |
| 0.2264 | 18.0 | 1620 | 0.2315 | 506272 |
| 0.2306 | 19.0 | 1710 | 0.2320 | 534432 |
| 0.2307 | 20.0 | 1800 | 0.2326 | 562848 |
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4