# train_conala_789_1760637897
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the conala dataset. It achieves the following results on the evaluation set:
- Loss: 0.7425
- Num Input Tokens Seen: 3037136
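
Since the framework versions below include PEFT, this checkpoint is most likely a parameter-efficient adapter rather than full model weights. A minimal loading sketch under that assumption, using the repo ids from this card (access to the gated base model is required; the prompt is illustrative only):

```python
# A minimal loading sketch, assuming this repo hosts a PEFT adapter
# for meta-llama/Meta-Llama-3-8B-Instruct.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_789_1760637897"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# CoNaLa pairs natural-language intents with short Python snippets,
# so a brief code-generation request is the natural prompt shape.
messages = [{"role": "user", "content": "How do I reverse a list in Python?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```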
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 789
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
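
For reference, these settings map directly onto `transformers` `TrainingArguments`; a minimal sketch, with `output_dir` as a placeholder not taken from this card:

```python
# A minimal sketch of the hyperparameters above as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./train_conala_789",  # placeholder path, not from this card
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=789,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```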
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 2.3584 | 1.0 | 536 | 2.2465 | 152296 |
| 1.6701 | 2.0 | 1072 | 1.1651 | 304440 |
| 1.0594 | 3.0 | 1608 | 0.9866 | 455928 |
| 0.9483 | 4.0 | 2144 | 0.9026 | 608072 |
| 0.79 | 5.0 | 2680 | 0.8593 | 759296 |
| 0.6601 | 6.0 | 3216 | 0.8306 | 910984 |
| 0.5273 | 7.0 | 3752 | 0.8102 | 1062816 |
| 0.8013 | 8.0 | 4288 | 0.7940 | 1214520 |
| 0.9372 | 9.0 | 4824 | 0.7806 | 1366480 |
| 0.69 | 10.0 | 5360 | 0.7705 | 1518976 |
| 0.586 | 11.0 | 5896 | 0.7630 | 1670320 |
| 0.5217 | 12.0 | 6432 | 0.7569 | 1822624 |
| 0.593 | 13.0 | 6968 | 0.7520 | 1974336 |
| 0.5876 | 14.0 | 7504 | 0.7481 | 2126488 |
| 0.7186 | 15.0 | 8040 | 0.7454 | 2278280 |
| 0.7165 | 16.0 | 8576 | 0.7435 | 2430272 |
| 0.6312 | 17.0 | 9112 | 0.7436 | 2581848 |
| 0.6236 | 18.0 | 9648 | 0.7429 | 2733712 |
| 0.7818 | 19.0 | 10184 | 0.7425 | 2885208 |
| 0.6213 | 20.0 | 10720 | 0.7425 | 3037136 |
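
Validation loss decreases steadily and essentially plateaus from epoch 16 onward, reaching its minimum of 0.7425 at epochs 19-20, which matches the headline evaluation loss above.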
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4