train_conala_101112_1760638006

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5924 (the best validation loss, reached at epoch 6 of 20; see the training results below)
  • Num Input Tokens Seen: 3060208
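Since PEFT appears in the framework versions below, the published weights are an adapter on top of the base model rather than a full checkpoint. The following is a minimal loading-and-generation sketch, assuming the adapter is hosted at rbelanec/train_conala_101112_1760638006 (the repository this card belongs to) and that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights:

```python
# Minimal sketch: load the base model, attach the PEFT adapter, and generate.
# The adapter repo id below is taken from this card; the prompt is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_101112_1760638006"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# CoNaLa pairs natural-language intents with Python snippets, so a short
# "how do I ..." instruction is a representative prompt.
messages = [{"role": "user", "content": "How do I reverse a list in Python?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```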

Model description

This is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the conala dataset; no further architectural details were provided.

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned and evaluated on the conala dataset (CoNaLa: natural-language programming intents from Stack Overflow paired with short Python snippets). Split details were not provided.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
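These values map onto transformers TrainingArguments roughly as sketched below. This is a reconstruction from the list above, not the author's actual training script, and the output_dir is an assumption:

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_conala_101112_1760638006",  # assumed
    learning_rate=0.03,             # far above typical weight fine-tuning rates;
                                    # more characteristic of prompt-style PEFT methods
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```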

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.6042        | 1.0   | 536   | 0.6518          | 153344            |
| 1.0162        | 2.0   | 1072  | 0.6180          | 306640            |
| 0.7428        | 3.0   | 1608  | 0.6011          | 459376            |
| 0.5338        | 4.0   | 2144  | 0.5957          | 612008            |
| 0.4754        | 5.0   | 2680  | 0.6104          | 764936            |
| 0.4787        | 6.0   | 3216  | 0.5924          | 917624            |
| 0.3919        | 7.0   | 3752  | 0.6076          | 1070488           |
| 0.3262        | 8.0   | 4288  | 0.6097          | 1223384           |
| 0.3219        | 9.0   | 4824  | 0.6546          | 1376240           |
| 0.1384        | 10.0  | 5360  | 0.6881          | 1529640           |
| 0.2843        | 11.0  | 5896  | 0.7035          | 1682336           |
| 0.1502        | 12.0  | 6432  | 0.7622          | 1835928           |
| 0.1751        | 13.0  | 6968  | 0.8449          | 1989136           |
| 0.1135        | 14.0  | 7504  | 0.8852          | 2142632           |
| 0.0843        | 15.0  | 8040  | 0.9656          | 2295280           |
| 0.0230        | 16.0  | 8576  | 1.0137          | 2447904           |
| 0.0671        | 17.0  | 9112  | 1.0530          | 2600776           |
| 0.0236        | 18.0  | 9648  | 1.0637          | 2753536           |
| 0.0458        | 19.0  | 10184 | 1.0654          | 2906984           |
| 0.0345        | 20.0  | 10720 | 1.0637          | 3060208           |
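Validation loss bottoms out at 0.5924 in epoch 6 and rises over the remaining 14 epochs, a classic overfitting curve; the headline loss at the top of this card matches that epoch-6 value. The card does not say how that checkpoint was selected, but a standard setup for keeping it would look like the following (an assumption, not confirmed by the card):

```python
# Hypothetical best-checkpoint selection; the card does not document this step.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_conala_101112_1760638006",  # assumed
    eval_strategy="epoch",           # evaluate once per epoch, as in the table above
    save_strategy="epoch",           # must align with eval_strategy for best-model tracking
    load_best_model_at_end=True,     # reload the lowest-eval-loss (epoch 6) weights at the end
    metric_for_best_model="eval_loss",
    greater_is_better=False,         # lower validation loss is better
)
```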

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4