# train_conala_456_1760637783
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the conala dataset. It achieves the following results on the evaluation set:
- Loss: 0.7393
- Num Input Tokens Seen: 3043720
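
Since PEFT appears in the framework versions below, the checkpoint is presumably a parameter-efficient adapter on top of the base model rather than a full set of fine-tuned weights. A minimal loading sketch, assuming the adapter is published under the Hub id `rbelanec/train_conala_456_1760637783` (this card's repo name) and that you have access to the gated Llama 3 base weights:

```python
# Minimal inference sketch; assumes a PEFT adapter on the gated Llama 3 base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_456_1760637783"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned adapter

# CoNaLa pairs natural-language intents with short Python snippets, so prompt accordingly.
messages = [{"role": "user", "content": "check if all elements in list `mylist` are identical"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```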
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
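
The card names the dataset only as "conala". CoNaLa (the Code/Natural Language Challenge corpus) pairs natural-language intents with Python snippets; a hedged sketch for inspecting it, assuming the commonly used `neulab/conala` Hub dataset and its `curated` config were the source:

```python
# Hedged sketch: the exact split and preprocessing used for this run are not documented.
from datasets import load_dataset

ds = load_dataset("neulab/conala", "curated")  # assumed Hub id and config
print(ds)

example = ds["train"][0]
print(example["intent"])   # natural-language intent
print(example["snippet"])  # corresponding Python snippet
```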
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch mapping them onto `TrainingArguments` follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 456
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
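
The training script itself is not part of this card. As a rough guide, the hyperparameters above map onto `transformers.TrainingArguments` as below; the field names are the standard ones, but the actual run may have used a different harness (e.g. a PEFT/TRL wrapper):

```python
# Hedged sketch: mapping the listed hyperparameters onto TrainingArguments.
# The output_dir is illustrative; the real training harness is undocumented.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_conala_456_1760637783",
    learning_rate=5e-05,
    per_device_train_batch_size=4,   # train_batch_size
    per_device_eval_batch_size=4,    # eval_batch_size
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```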
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 2.5981 | 1.0 | 536 | 2.2582 | 152088 |
| 0.9095 | 2.0 | 1072 | 1.1626 | 303784 |
| 0.8483 | 3.0 | 1608 | 0.9730 | 456552 |
| 0.8116 | 4.0 | 2144 | 0.8966 | 608704 |
| 0.7813 | 5.0 | 2680 | 0.8562 | 761184 |
| 1.0255 | 6.0 | 3216 | 0.8292 | 912912 |
| 0.7137 | 7.0 | 3752 | 0.8086 | 1065128 |
| 0.7207 | 8.0 | 4288 | 0.7931 | 1216496 |
| 0.9265 | 9.0 | 4824 | 0.7808 | 1368880 |
| 0.5225 | 10.0 | 5360 | 0.7696 | 1522016 |
| 0.7305 | 11.0 | 5896 | 0.7616 | 1674136 |
| 1.066 | 12.0 | 6432 | 0.7548 | 1826160 |
| 0.7107 | 13.0 | 6968 | 0.7501 | 1978984 |
| 0.5841 | 14.0 | 7504 | 0.7460 | 2130656 |
| 0.4122 | 15.0 | 8040 | 0.7440 | 2282720 |
| 0.8114 | 16.0 | 8576 | 0.7409 | 2434896 |
| 0.922 | 17.0 | 9112 | 0.7403 | 2586968 |
| 0.7926 | 18.0 | 9648 | 0.7401 | 2738448 |
| 0.4927 | 19.0 | 10184 | 0.7393 | 2891056 |
| 0.3725 | 20.0 | 10720 | 0.7396 | 3043720 |
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4