# train_codealpacapy_101112_1770438262
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:
- Loss: 2.6960
- Num Input Tokens Seen: 24913312
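A minimal usage sketch follows. It assumes the saved artifact is a PEFT adapter for the base model (the Framework versions section lists PEFT 0.15.2) and that the adapter lives in this repository; the prompt is illustrative only:

```python
# Hedged sketch: load the base model and apply this PEFT adapter on top of it.
# Assumes the adapter is hosted at rbelanec/train_codealpacapy_101112_1770438262.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_101112_1770438262"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

# Example prompt (illustrative, not from the training set).
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```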
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch reproducing them follows the list):
- learning_rate: 0.001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 101112
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
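For reference, the hyperparameters above map onto a `transformers` `TrainingArguments` object roughly as follows. This is a hedged sketch, not the actual training script: the `output_dir` is illustrative, and the PEFT/LoRA setup used for this run is not recorded in the card.

```python
# Hedged sketch of a Trainer configuration matching the hyperparameters above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_codealpacapy_101112_1770438262",  # illustrative path
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```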
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.6502 | 1.0 | 1908 | 0.5047 | 1245424 |
| 0.5121 | 2.0 | 3816 | 0.4964 | 2489840 |
| 0.4514 | 3.0 | 5724 | 0.4886 | 3736568 |
| 0.4348 | 4.0 | 7632 | 0.4900 | 4984176 |
| 0.4158 | 5.0 | 9540 | 0.4862 | 6229240 |
| 0.4036 | 6.0 | 11448 | 0.4906 | 7476480 |
| 0.425 | 7.0 | 13356 | 0.4846 | 8722408 |
| 0.4785 | 8.0 | 15264 | 0.4832 | 9967176 |
| 0.3771 | 9.0 | 17172 | 0.4847 | 11214472 |
| 0.3931 | 10.0 | 19080 | 0.4845 | 12463312 |
| 0.5228 | 11.0 | 20988 | 0.4846 | 13708664 |
| 0.4694 | 12.0 | 22896 | 0.4867 | 14951560 |
| 0.471 | 13.0 | 24804 | 0.4856 | 16193456 |
| 0.4008 | 14.0 | 26712 | 0.4889 | 17439088 |
| 0.4933 | 15.0 | 28620 | 0.4885 | 18687760 |
| 0.3565 | 16.0 | 30528 | 0.4923 | 19936088 |
| 0.3716 | 17.0 | 32436 | 0.4926 | 21184936 |
| 0.3937 | 18.0 | 34344 | 0.4931 | 22429992 |
| 0.3256 | 19.0 | 36252 | 0.4940 | 23671264 |
| 0.3567 | 20.0 | 38160 | 0.4942 | 24913312 |
### Framework versions
- PEFT 0.15.2
- Transformers 4.51.3
- PyTorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1