train_conala_42_1760637552

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

Loss: 0.7380
Num Input Tokens Seen: 3049984

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
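
To make the schedule implied by these settings concrete, here is a minimal sketch of a linear-warmup, cosine-decay learning-rate curve. It assumes the schedule spans the 10720 optimizer steps reported in the training results, that the warmup length is the warmup ratio times the total step count, and that the decay is a half cosine down to zero. The helper name `lr_at` is hypothetical, and the trainer's actual schedule may differ in detail.

```python
import math

BASE_LR = 5e-05        # learning_rate from the list above
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 10720    # final step in the training results (assumed schedule length)

# Assumption: warmup length is the ratio times the total number of steps.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the base learning rate.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from the base learning rate down to 0.
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, 536, 1072, 5360, 10720):
        print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the warmup covers the first 1072 steps, which corresponds to the end of epoch 2 in the training results.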

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
2.7043	1.0	536	2.3140	153352
1.0339	2.0	1072	1.1814	305496
0.6569	3.0	1608	0.9902	458160
0.8537	4.0	2144	0.9015	610584
0.782	5.0	2680	0.8551	763216
0.6579	6.0	3216	0.8257	915528
0.8493	7.0	3752	0.8033	1067904
0.6719	8.0	4288	0.7876	1221016
0.7793	9.0	4824	0.7757	1373032
1.5815	10.0	5360	0.7658	1525104
0.8994	11.0	5896	0.7571	1677680
0.7936	12.0	6432	0.7522	1830200
0.6393	13.0	6968	0.7468	1982664
0.4894	14.0	7504	0.7442	2135168
0.7575	15.0	8040	0.7411	2287232
0.5909	16.0	8576	0.7401	2438992
0.5283	17.0	9112	0.7387	2591432
0.8665	18.0	9648	0.7380	2744944
0.7856	19.0	10184	0.7385	2897552
0.597	20.0	10720	0.7382	3049984
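
As a quick way to read the table, the sketch below copies the epoch, step, validation loss, and token columns into a list and picks the row with the lowest validation loss. The names `ROWS` and `best_row` are hypothetical; the numbers are taken directly from the table above.

```python
# Each row: (epoch, step, validation_loss, input_tokens_seen), copied from the table.
ROWS = [
    (1, 536, 2.3140, 153352),
    (2, 1072, 1.1814, 305496),
    (3, 1608, 0.9902, 458160),
    (4, 2144, 0.9015, 610584),
    (5, 2680, 0.8551, 763216),
    (6, 3216, 0.8257, 915528),
    (7, 3752, 0.8033, 1067904),
    (8, 4288, 0.7876, 1221016),
    (9, 4824, 0.7757, 1373032),
    (10, 5360, 0.7658, 1525104),
    (11, 5896, 0.7571, 1677680),
    (12, 6432, 0.7522, 1830200),
    (13, 6968, 0.7468, 1982664),
    (14, 7504, 0.7442, 2135168),
    (15, 8040, 0.7411, 2287232),
    (16, 8576, 0.7401, 2438992),
    (17, 9112, 0.7387, 2591432),
    (18, 9648, 0.7380, 2744944),
    (19, 10184, 0.7385, 2897552),
    (20, 10720, 0.7382, 3049984),
]


def best_row(rows):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda row: row[2])


if __name__ == "__main__":
    epoch, step, val_loss, tokens = best_row(ROWS)
    print(f"best epoch={epoch} step={step} val_loss={val_loss:.4f} tokens={tokens}")
```

The lowest validation loss in the table is 0.7380 at epoch 18, which matches the evaluation loss reported at the top of the card; the last two epochs are marginally higher.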

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
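
Since the card lists PEFT among the framework versions, the weights are presumably published as a PEFT adapter on top of the base model. The sketch below shows one way such an adapter might be loaded with `transformers` and `peft`. It assumes the adapter repository id is `rbelanec/train_conala_42_1760637552`, that you have access to the base model weights, and that default loading options are acceptable; the helper name `load_model_with_adapter` is hypothetical, and the card does not state the adapter type or a recommended loading procedure.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"   # base model named in this card
ADAPTER_ID = "rbelanec/train_conala_42_1760637552"      # assumed adapter repository id


def load_model_with_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the adapter (hypothetical helper)."""
    # Imported lazily so this file can be read or imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # PeftModel.from_pretrained reads the adapter config and wraps the base model.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_model_with_adapter()` downloads both the base model and the adapter, so it needs enough memory for an 8B-parameter model.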

Model tree for rbelanec/train_conala_42_1760637552

Base model

meta-llama/Meta-Llama-3-8B-Instruct
