# train_conala_101112_1760638009
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset (CoNaLa, a corpus pairing natural-language intents with short Python snippets). It achieves the following results on the evaluation set:
- Loss: 2.8390
- Num Input Tokens Seen: 3060208
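PEFT appears in the framework versions below, so the released checkpoint is presumably a PEFT adapter on top of the base model rather than a full set of weights. A minimal loading and generation sketch, assuming the adapter is hosted at `rbelanec/train_conala_101112_1760638009` (the repo id shown in the model tree at the end of this card):

```python
# Minimal sketch: load the PEFT adapter on top of
# meta-llama/Meta-Llama-3-8B-Instruct and generate from it.
# Assumes the adapter lives at "rbelanec/train_conala_101112_1760638009";
# point this at a local path instead if you have the adapter on disk.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/train_conala_101112_1760638009"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# CoNaLa pairs natural-language intents with short Python snippets, so a
# code-generation prompt is the natural use case. Depending on how the
# training examples were formatted, tokenizer.apply_chat_template may be a
# better fit than a raw prompt.
prompt = "Write Python code to reverse a list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```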
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 101112
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
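For orientation, the list above maps directly onto `transformers.TrainingArguments`. The sketch below is reconstructed from the card, not taken from the actual training script, and omits anything the card does not state (gradient accumulation, evaluation strategy, and so on):

```python
# Sketch: the hyperparameters above expressed as TrainingArguments.
# Reconstructed from the card; the real training script may differ.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_conala_101112_1760638009",  # assumed output dir
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",         # betas=(0.9, 0.999) and eps=1e-08 are the
                                 # PyTorch AdamW defaults, so no extra args
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```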
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 2.8717 | 1.0 | 536 | 2.8652 | 153344 |
| 3.1674 | 2.0 | 1072 | 2.8579 | 306640 |
| 2.7841 | 3.0 | 1608 | 2.8461 | 459376 |
| 2.5266 | 4.0 | 2144 | 2.8429 | 612008 |
| 2.68 | 5.0 | 2680 | 2.8417 | 764936 |
| 2.8456 | 6.0 | 3216 | 2.8408 | 917624 |
| 2.7411 | 7.0 | 3752 | 2.8411 | 1070488 |
| 2.4935 | 8.0 | 4288 | 2.8417 | 1223384 |
| 2.472 | 9.0 | 4824 | 2.8419 | 1376240 |
| 2.2903 | 10.0 | 5360 | 2.8403 | 1529640 |
| 3.0498 | 11.0 | 5896 | 2.8401 | 1682336 |
| 2.6413 | 12.0 | 6432 | 2.8404 | 1835928 |
| 3.2015 | 13.0 | 6968 | 2.8417 | 1989136 |
| 2.4593 | 14.0 | 7504 | 2.8394 | 2142632 |
| 2.5468 | 15.0 | 8040 | 2.8412 | 2295280 |
| 2.8674 | 16.0 | 8576 | 2.8407 | 2447904 |
| 2.3855 | 17.0 | 9112 | 2.8428 | 2600776 |
| 2.9077 | 18.0 | 9648 | 2.8402 | 2753536 |
| 3.3871 | 19.0 | 10184 | 2.8390 | 2906984 |
| 2.2761 | 20.0 | 10720 | 2.8403 | 3060208 |
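Note that the evaluation loss reported at the top of this card (2.8390) corresponds to the epoch-19 checkpoint, the lowest validation loss in the table, rather than the final epoch-20 value of 2.8403; validation loss plateaus around 2.84 from roughly epoch 4 onward.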
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
## Model tree for rbelanec/train_conala_101112_1760638009

- Base model: meta-llama/Meta-Llama-3-8B-Instruct