train_conala_456_1760637780

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

Loss: 2.0248
Num Input Tokens Seen: 3043720

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 4
eval_batch_size: 4
seed: 456
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
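
As a rough illustration of how the cosine schedule with a 0.1 warmup ratio shapes the learning rate, the sketch below computes the rate at a given step. It assumes the usual linear-warmup-then-cosine-decay form, that the warmup length is the total step count times the warmup ratio (rounded up), and that the total of 10720 steps equals the final step in the training results table. The function name lr_at is hypothetical and not part of any library.

```python
import math

BASE_LR = 1e-3        # learning_rate from the list above
TOTAL_STEPS = 10720   # assumption: final step in the training results table
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the list above


def lr_at(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step` under linear warmup followed by cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the rate at the end of a few epochs (536 steps per epoch in the table).
    for epoch in (1, 2, 10, 20):
        print(epoch, lr_at(epoch * 536))
```

Under these assumptions the rate peaks at 0.001 around the end of epoch 2 and decays towards zero by the last step.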

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.7785	1.0	536	0.6903	152088
0.5883	2.0	1072	0.6959	303784
0.5464	3.0	1608	0.6616	456552
0.568	4.0	2144	0.6424	608704
0.5957	5.0	2680	0.6443	761184
0.723	6.0	3216	0.6424	912912
0.5193	7.0	3752	0.6437	1065128
0.5701	8.0	4288	0.6507	1216496
0.6251	9.0	4824	0.6445	1368880
0.3634	10.0	5360	0.6633	1522016
0.481	11.0	5896	0.6657	1674136
0.6605	12.0	6432	0.6991	1826160
0.4597	13.0	6968	0.7306	1978984
0.3195	14.0	7504	0.7241	2130656
0.1868	15.0	8040	0.7772	2282720
0.4085	16.0	8576	0.7817	2434896
0.4356	17.0	9112	0.8402	2586968
0.225	18.0	9648	0.8539	2738448
0.1774	19.0	10184	0.8778	2891056
0.1177	20.0	10720	0.8808	3043720
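
The validation loss in the table appears to bottom out around epochs 4 to 6 and then climbs while the training loss keeps falling, a pattern that usually points to overfitting. The short sketch below, using only the validation losses copied from the table, finds the epochs with the lowest validation loss and the gap to the final epoch. The variable names are illustrative.

```python
# Validation loss per epoch, copied from the table above (epochs 1 to 20).
val_loss = [
    0.6903, 0.6959, 0.6616, 0.6424, 0.6443,
    0.6424, 0.6437, 0.6507, 0.6445, 0.6633,
    0.6657, 0.6991, 0.7306, 0.7241, 0.7772,
    0.7817, 0.8402, 0.8539, 0.8778, 0.8808,
]

best = min(val_loss)
# Epochs are 1-indexed in the table; ties are all reported.
best_epochs = [i + 1 for i, v in enumerate(val_loss) if v == best]
# How much worse the final checkpoint is than the best one.
gap = val_loss[-1] - best

print(f"lowest validation loss {best} at epochs {best_epochs}")
print(f"final epoch is worse by {gap:.4f}")
```

If intermediate checkpoints were saved, one from around epoch 4 or 6 may generalize better than the final one; whether such checkpoints exist is not stated in this card.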

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
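
Since the framework versions list PEFT, the weights are presumably published as a PEFT adapter on top of the base model. The sketch below shows one way such an adapter might be loaded. It assumes the repository id shown in this card's model tree, that the adapter is loadable with PeftModel.from_pretrained, and that bfloat16 is an acceptable dtype. The helper name load_adapter is hypothetical.

```python
BASE_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_conala_456_1760637780"  # assumption: id from the model tree


def load_adapter(base_id=BASE_ID, adapter_id=ADAPTER_ID):
    """Return (tokenizer, model) with the adapter attached to the base model."""
    # Imported lazily so that defining this helper does not require the libraries.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # Assumption: bfloat16 is used here only to reduce memory; adjust as needed.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    # Attach the fine-tuned adapter weights to the frozen base model.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling load_adapter() downloads the weights, and the base model is gated on the Hugging Face Hub, so access must be granted first. The prompt format used during fine-tuning is not documented in this card, so any generation prompt is a guess.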

Model tree for rbelanec/train_conala_456_1760637780

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model