train_conala_456_1760637779

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

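Loss: 0.6224 (the lowest validation loss in the training results, reached at epoch 4, step 2144)
Num Input Tokens Seen: 3043720 (total after all 20 epochs)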

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.03
train_batch_size: 4
eval_batch_size: 4
seed: 456
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
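
To make the scheduler settings concrete, here is a minimal sketch of how a cosine schedule with linear warmup would evolve under the values listed above. The total step count (10720) is taken from the last row of the training results table. The warmup length (ratio times total steps, rounded up) and the exact decay formula are assumptions about the schedule, not something this card states, and the helper name `lr_at` is hypothetical.

```python
import math

PEAK_LR = 0.03        # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio
TOTAL_STEPS = 10720   # final step in the training results (20 epochs x 536 steps)

# Assumption: warmup length is the ratio applied to the total step count, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 536, WARMUP_STEPS, 5896, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {lr_at(step):.5f}")
```

Under these assumptions the learning rate peaks at the end of epoch 2 (step 1072) and falls to half its peak by the end of epoch 11 (step 5896).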

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.7796	1.0	536	0.6564	152088
0.5925	2.0	1072	4.2643	303784
0.4671	3.0	1608	0.6282	456552
0.5547	4.0	2144	0.6224	608704
0.539	5.0	2680	0.6358	761184
0.6204	6.0	3216	0.6471	912912
0.4443	7.0	3752	0.6642	1065128
0.432	8.0	4288	0.6942	1216496
0.4643	9.0	4824	0.6956	1368880
0.2251	10.0	5360	0.7665	1522016
0.2757	11.0	5896	0.7932	1674136
0.3017	12.0	6432	0.8299	1826160
0.1997	13.0	6968	0.9332	1978984
0.086	14.0	7504	1.0294	2130656
0.0632	15.0	8040	1.1322	2282720
0.0933	16.0	8576	1.1970	2434896
0.0596	17.0	9112	1.2180	2586968
0.0668	18.0	9648	1.2268	2738448
0.0231	19.0	10184	1.2259	2891056
0.0237	20.0	10720	1.2293	3043720
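
The table shows training loss falling steadily while validation loss bottoms out early and then climbs, which is the usual signature of overfitting. The sketch below copies the per-epoch losses from the table and picks the epoch with the lowest validation loss. The names `RESULTS` and `best_epoch` are illustrative only.

```python
# (epoch, training_loss, validation_loss) rows copied from the results table.
RESULTS = [
    (1, 0.7796, 0.6564), (2, 0.5925, 4.2643), (3, 0.4671, 0.6282),
    (4, 0.5547, 0.6224), (5, 0.5390, 0.6358), (6, 0.6204, 0.6471),
    (7, 0.4443, 0.6642), (8, 0.4320, 0.6942), (9, 0.4643, 0.6956),
    (10, 0.2251, 0.7665), (11, 0.2757, 0.7932), (12, 0.3017, 0.8299),
    (13, 0.1997, 0.9332), (14, 0.0860, 1.0294), (15, 0.0632, 1.1322),
    (16, 0.0933, 1.1970), (17, 0.0596, 1.2180), (18, 0.0668, 1.2268),
    (19, 0.0231, 1.2259), (20, 0.0237, 1.2293),
]


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


best = best_epoch(RESULTS)
final = RESULTS[-1]

if __name__ == "__main__":
    print(f"best epoch:  {best[0]} (validation loss {best[2]})")
    print(f"final epoch: {final[0]} (validation loss {final[2]})")
```

The minimum falls at epoch 4 (0.6224), which matches the loss reported at the top of this card. By epoch 20 the validation loss has roughly doubled to 1.2293.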

Framework versions

PEFT 0.17.1
Transformers 4.51.3
PyTorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4

Model tree for rbelanec/train_conala_456_1760637779

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model
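
Because this repository is listed as a PEFT adapter on top of the base model, loading it would presumably follow the usual base-plus-adapter pattern. The sketch below is an assumption about usage, not something this card documents. The function name `load_adapter` is hypothetical, the base model is gated and requires access approval on the Hub, and the imports sit inside the function so the module can be imported without the heavy dependencies installed.

```python
BASE_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_conala_456_1760637779"


def load_adapter(base_id: str = BASE_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, then attach the fine-tuned adapter weights (sketch)."""
    # Imported lazily: these packages are large and only needed when loading.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
    # Wrap the base model with the adapter stored in this repository.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_adapter()
    print(type(model).__name__)
```

Loading the base model first and attaching the adapter afterwards keeps the 8B base weights shareable across adapters, since only the small adapter file differs between fine-tunes.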