train_conala_1755694511

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

Loss: 1.2638
Num Input Tokens Seen: 1382584

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 123
optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10.0
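
As a rough illustration of what a cosine scheduler with a 0.1 warmup ratio implies for the learning rate, the sketch below reimplements the usual linear-warmup-then-cosine-decay rule in plain Python. The total step count (10710) is an assumption, estimated from the results table (536 steps per roughly half an epoch, over 10 epochs), and the function name `lr_at` is hypothetical rather than part of any library.

```python
import math

BASE_LR = 5e-05        # learning_rate from the card
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 10710    # assumption: ~1071 optimizer steps per epoch x 10 epochs
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (linear warmup, then cosine decay)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR over the warmup steps.
        return BASE_LR * step / WARMUP_STEPS
    # Fraction of the post-warmup schedule completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from BASE_LR down to 0.
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the learning rate peaks at 5e-05 once warmup ends and decays toward zero by the final step.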

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.9354	0.5005	536	0.8337	68880
0.9609	1.0009	1072	0.7219	138320
0.5536	1.5014	1608	0.6741	207744
0.3862	2.0019	2144	0.6362	276856
0.6441	2.5023	2680	0.6552	346040
0.582	3.0028	3216	0.6596	415184
0.3643	3.5033	3752	0.6909	484576
0.2223	4.0037	4288	0.7160	553632
0.1992	4.5042	4824	0.7488	623280
0.1908	5.0047	5360	0.7194	691912
0.223	5.5051	5896	0.8461	762008
0.1581	6.0056	6432	0.8329	830744
0.037	6.5061	6968	0.9954	900568
0.0216	7.0065	7504	0.9716	969200
0.095	7.5070	8040	1.0835	1037856
0.0669	8.0075	8576	1.0836	1107480
0.1067	8.5079	9112	1.2072	1176200
0.0466	9.0084	9648	1.2154	1245744
0.0126	9.5089	10184	1.2640	1314112
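
The validation loss in the table bottoms out early and then rises while the training loss keeps falling, which usually points to overfitting. The sketch below shows one way to pick the evaluation step with the lowest validation loss; the `ROWS` string simply copies the Step and Validation Loss columns from the table, and the variable names are illustrative.

```python
# (step, validation loss) pairs copied from the results table
ROWS = """
536 0.8337
1072 0.7219
1608 0.6741
2144 0.6362
2680 0.6552
3216 0.6596
3752 0.6909
4288 0.7160
4824 0.7488
5360 0.7194
5896 0.8461
6432 0.8329
6968 0.9954
7504 0.9716
8040 1.0835
8576 1.0836
9112 1.2072
9648 1.2154
10184 1.2640
"""

# Parse each non-empty line into an (int step, float loss) tuple.
evals = [
    (int(step), float(loss))
    for step, loss in (line.split() for line in ROWS.strip().splitlines())
]

# Select the evaluation point with the smallest validation loss.
best_step, best_loss = min(evals, key=lambda pair: pair[1])
print(f"lowest validation loss {best_loss} at step {best_step}")
# prints: lowest validation loss 0.6362 at step 2144
```

Step 2144 corresponds to roughly epoch 2 in the table, well before the final reported evaluation loss of 1.2638.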

Framework versions

PEFT 0.15.2
Transformers 4.51.3
PyTorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1

Model tree for rbelanec/train_conala_1755694511

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model