train_conala_789_1760637893

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

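Loss: 0.6211 (the lowest validation loss in the training results table, reached at epoch 5)
Num Input Tokens Seen: 3037136 (cumulative over all 20 epochs)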

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.03
train_batch_size: 4
eval_batch_size: 4
seed: 789
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
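
To make the scheduler settings above concrete, here is a minimal sketch of the learning rate over the run. It assumes the common formulation of linear warmup followed by a half-cosine decay to zero, and it takes the total step count (10720) from the training results table. The function name `lr_at` is hypothetical and is not part of the training code.

```python
import math

BASE_LR = 0.03        # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio
TOTAL_STEPS = 10720   # final step in the training results table (20 epochs x 536 steps)
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)  # assumed rounding: ceiling


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    # Print the rate at the end of a few epochs (536 steps per epoch).
    for epoch in (1, 2, 5, 10, 20):
        step = epoch * 536
        print(f"epoch {epoch:2d} (step {step:5d}): lr = {lr_at(step):.6f}")
```

Under these assumptions the rate climbs through the first two epochs, peaks at 0.03 at step 1072, and decays to zero by the final step.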

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.5746	1.0	536	0.6938	152296
0.9156	2.0	1072	0.6474	304440
0.6473	3.0	1608	0.6398	455928
0.6308	4.0	2144	0.6331	608072
0.5751	5.0	2680	0.6211	759296
0.3883	6.0	3216	0.6417	910984
0.3301	7.0	3752	0.6460	1062816
0.4009	8.0	4288	0.6645	1214520
0.4998	9.0	4824	0.7020	1366480
0.2889	10.0	5360	0.7202	1518976
0.2201	11.0	5896	0.7463	1670320
0.2217	12.0	6432	0.7905	1822624
0.1369	13.0	6968	0.9357	1974336
0.1115	14.0	7504	0.8982	2126488
0.0777	15.0	8040	1.0092	2278280
0.0308	16.0	8576	1.0900	2430272
0.0320	17.0	9112	1.1261	2581848
0.0249	18.0	9648	1.1317	2733712
0.0549	19.0	10184	1.1327	2885208
0.0501	20.0	10720	1.1333	3037136
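
In the table above, validation loss reaches its minimum early while training loss keeps falling, which usually indicates overfitting in the later epochs. The short sketch below locates the best epoch from the validation-loss column. The values are copied from the table, and the variable names are illustrative only.

```python
# Validation loss per epoch, copied from the results table (epochs 1..20).
val_loss = [
    0.6938, 0.6474, 0.6398, 0.6331, 0.6211, 0.6417, 0.6460, 0.6645, 0.7020, 0.7202,
    0.7463, 0.7905, 0.9357, 0.8982, 1.0092, 1.0900, 1.1261, 1.1317, 1.1327, 1.1333,
]
STEPS_PER_EPOCH = 536  # step count of the first row in the table

# Index of the smallest validation loss; epochs are 1-based.
best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
best_epoch = best_index + 1
best_step = best_epoch * STEPS_PER_EPOCH

if __name__ == "__main__":
    print(f"best epoch: {best_epoch}, step: {best_step}, val loss: {val_loss[best_index]}")
    print(f"final epoch val loss: {val_loss[-1]}")
```

The best epoch is 5 (step 2680), which matches the loss reported at the top of this card. An early-stopping rule or a lower epoch count might have reached a similar checkpoint at a fraction of the cost.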

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
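
Since the card lists PEFT among the framework versions, the weights are presumably published as an adapter to be attached to the base model. The following is a hedged loading sketch, not the author's documented usage. It assumes the adapter is hosted under the repository ID shown in the model tree, that the base model's tokenizer is the right one to use, and that you have access to the base model. The helper name `load_model_with_adapter` is hypothetical.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_conala_789_1760637893"  # assumed adapter repository ID


def load_model_with_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, then attach the adapter weights on top of it."""
    # Imported lazily so that defining this function does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # PeftModel.from_pretrained reads the adapter config and wraps the base model.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode
    return tokenizer, model
```

Calling `load_model_with_adapter()` downloads an 8B-parameter base model, so it needs substantial memory and disk space.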

Model tree for rbelanec/train_conala_789_1760637893

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model