train_conala_789_1760637896

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

Loss: 2.7424
Num Input Tokens Seen: 3037136
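
For a rough sense of scale, the evaluation loss can be converted to a perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language models but is not stated in this card. Under that assumption the perplexity is roughly 15.5.

```python
import math

# Evaluation loss reported above.
eval_loss = 2.7424

# Assumption: the loss is mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss = {eval_loss:.4f}, perplexity = {perplexity:.2f}")
```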

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 789
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
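
To illustrate what a cosine schedule with a 0.1 warmup ratio implies for these settings, here is a minimal pure-Python sketch. It assumes the common formulation (linear warmup from zero, then half-cosine decay to zero) and takes the total of 10720 steps from the training results table. The function name lr_at is hypothetical, and the exact schedule used by the training run may differ in detail.

```python
import math

LEARNING_RATE = 5e-05   # learning_rate from the list above
WARMUP_RATIO = 0.1      # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 10720     # 20 epochs x 536 steps, per the training results table

# Assumption: warmup length is the ratio applied to the total step count.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps (linear warmup, cosine decay)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from the peak learning rate down to 0.
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, WARMUP_STEPS // 2, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(f"step {step:>5}: lr = {lr_at(step):.2e}")
```

Under these assumptions the warmup covers the first 1072 steps, which is the first two epochs of the run.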

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
2.9899	1.0	536	2.7686	152296
3.1267	2.0	1072	2.7586	304440
3.6797	3.0	1608	2.7509	455928
2.7123	4.0	2144	2.7465	608072
2.7728	5.0	2680	2.7443	759296
2.5596	6.0	3216	2.7442	910984
3.1077	7.0	3752	2.7441	1062816
3.0818	8.0	4288	2.7452	1214520
2.7164	9.0	4824	2.7428	1366480
2.721	10.0	5360	2.7432	1518976
2.6167	11.0	5896	2.7453	1670320
2.4633	12.0	6432	2.7437	1822624
2.4093	13.0	6968	2.7427	1974336
2.4403	14.0	7504	2.7441	2126488
3.2799	15.0	8040	2.7443	2278280
3.0543	16.0	8576	2.7424	2430272
2.6215	17.0	9112	2.7448	2581848
2.8104	18.0	9648	2.7450	2733712
3.0447	19.0	10184	2.7437	2885208
2.625	20.0	10720	2.7437	3037136
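
The headline loss of 2.7424 matches the epoch 16 row rather than the final epoch (2.7437). This suggests, though the card does not say so, that the best checkpoint is the one reported. The short script below copies the validation losses from the table and checks this. It also measures how little the validation loss moves from epoch 5 onward.

```python
# (epoch, validation loss) pairs copied from the table above.
val_loss = [
    (1, 2.7686), (2, 2.7586), (3, 2.7509), (4, 2.7465), (5, 2.7443),
    (6, 2.7442), (7, 2.7441), (8, 2.7452), (9, 2.7428), (10, 2.7432),
    (11, 2.7453), (12, 2.7437), (13, 2.7427), (14, 2.7441), (15, 2.7443),
    (16, 2.7424), (17, 2.7448), (18, 2.7450), (19, 2.7437), (20, 2.7437),
]

# Epoch with the lowest validation loss.
best_epoch, best_loss = min(val_loss, key=lambda row: row[1])

# Loss at the end of training, for comparison.
final_epoch, final_loss = val_loss[-1]

# Spread of the validation loss from epoch 5 onward.
plateau = [loss for epoch, loss in val_loss if epoch >= 5]
spread = max(plateau) - min(plateau)

print(f"best:  epoch {best_epoch}, loss {best_loss:.4f}")
print(f"final: epoch {final_epoch}, loss {final_loss:.4f}")
print(f"spread from epoch 5 onward: {spread:.4f}")
```

The validation loss varies by less than 0.003 after epoch 5, so the later epochs add little measurable improvement on this evaluation set.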

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
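
Since the model is published as a PEFT adapter on top of the base model, it can presumably be loaded along the following lines. This is a sketch, not a tested recipe. It assumes the adapter lives in the hub repository rbelanec/train_conala_789_1760637896, that access to the gated base model has been granted, and that enough memory is available for an 8B-parameter model. The helper name load_adapter is hypothetical. Imports are deferred into the function so that the snippet can be read or imported without the heavy dependencies installed.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
# Assumption: the adapter is hosted under this hub repository id.
ADAPTER_ID = "rbelanec/train_conala_789_1760637896"


def load_adapter(base_model_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, attach the adapter, and return (tokenizer, model)."""
    # Deferred imports: both libraries are large and only needed at load time.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # torch_dtype="auto" keeps the dtype stored in the checkpoint.
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype="auto")
    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling load_adapter() downloads both the base weights and the adapter. The prompt format used during fine-tuning is not documented in this card, so any inference prompt is a guess.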


Model tree for rbelanec/train_conala_789_1760637896

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model