# train_conala_42_1767887005
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the conala dataset. It achieves the following results on the evaluation set:
- Loss: 0.6550
- Num Input Tokens Seen: 1383952
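
Since PEFT appears in the framework versions below, this checkpoint is presumably a PEFT (e.g. LoRA) adapter on top of the base model rather than a full set of weights. A minimal loading sketch, assuming the adapter is published under `rbelanec/train_conala_42_1767887005` (the repo id of this card) and that you have access to the gated Llama 3 base weights:

```python
# Hedged sketch: load the base model and apply this PEFT adapter on top.
# The adapter repo id and generation settings are assumptions, not taken
# from the original training script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_42_1767887005"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# CoNaLa-style prompt: natural-language intent -> Python snippet.
messages = [{"role": "user", "content": "Write Python code to reverse a list."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```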
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
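
For reference, a hedged sketch of how the list above maps onto a Hugging Face `TrainingArguments` object. The `output_dir` is a placeholder, and any setting not listed above falls back to the library default rather than the original run's value:

```python
# Sketch only: mirrors the hyperparameters listed above; not the
# original training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_conala_42_1767887005",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10,
)
```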
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.7415 | 0.5005 | 536 | 1.0867 | 69344 |
| 0.9714 | 1.0009 | 1072 | 0.7939 | 138680 |
| 0.7071 | 1.5014 | 1608 | 0.7218 | 207768 |
| 0.728 | 2.0019 | 2144 | 0.7000 | 277336 |
| 0.6037 | 2.5023 | 2680 | 0.6865 | 346344 |
| 0.6856 | 3.0028 | 3216 | 0.6751 | 415584 |
| 0.5057 | 3.5033 | 3752 | 0.6698 | 484640 |
| 0.4266 | 4.0037 | 4288 | 0.6619 | 554056 |
| 0.5238 | 4.5042 | 4824 | 0.6609 | 624024 |
| 0.4155 | 5.0047 | 5360 | 0.6602 | 692512 |
| 0.59 | 5.5051 | 5896 | 0.6639 | 762208 |
| 0.6139 | 6.0056 | 6432 | 0.6560 | 831208 |
| 0.4343 | 6.5061 | 6968 | 0.6605 | 900888 |
| 0.5686 | 7.0065 | 7504 | 0.6550 | 969456 |
| 0.6345 | 7.5070 | 8040 | 0.6588 | 1038512 |
| 0.5132 | 8.0075 | 8576 | 0.6592 | 1108504 |
| 0.562 | 8.5079 | 9112 | 0.6587 | 1177832 |
| 1.1331 | 9.0084 | 9648 | 0.6595 | 1247064 |
| 0.3262 | 9.5089 | 10184 | 0.6590 | 1316312 |
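
Validation loss bottoms out at 0.6550 at step 7504 (epoch ~7.0), matching the evaluation loss reported above, and is essentially flat afterwards. A small optional sketch to plot the curve, with values copied verbatim from the table:

```python
# Plot the validation-loss curve from the training-results table above.
import matplotlib.pyplot as plt

steps = [536, 1072, 1608, 2144, 2680, 3216, 3752, 4288, 4824,
         5360, 5896, 6432, 6968, 7504, 8040, 8576, 9112, 9648, 10184]
val_loss = [1.0867, 0.7939, 0.7218, 0.7000, 0.6865, 0.6751, 0.6698,
            0.6619, 0.6609, 0.6602, 0.6639, 0.6560, 0.6605, 0.6550,
            0.6588, 0.6592, 0.6587, 0.6595, 0.6590]

plt.plot(steps, val_loss, marker="o")
plt.xlabel("Step")
plt.ylabel("Validation loss")
plt.title("train_conala_42_1767887005: validation loss")
plt.show()
```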
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- Pytorch 2.9.1+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4