train_codealpacapy_789_1767696355

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

Loss: 0.4895
Num Input Tokens Seen: 24964664

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 789
optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.7402	1.0	1908	0.7119	1246616
0.558	2.0	3816	0.5744	2496088
0.5171	3.0	5724	0.5381	3746728
0.6385	4.0	7632	0.5215	4999712
0.4991	5.0	9540	0.5123	6245072
0.5417	6.0	11448	0.5062	7491776
0.4968	7.0	13356	0.5018	8735728
0.4574	8.0	15264	0.4986	9981168
0.609	9.0	17172	0.4963	11227560
0.4804	10.0	19080	0.4946	12474576
0.378	11.0	20988	0.4932	13719896
0.6324	12.0	22896	0.4920	14970024
0.434	13.0	24804	0.4912	16222408
0.5615	14.0	26712	0.4906	17474664
0.5465	15.0	28620	0.4903	18722440
0.439	16.0	30528	0.4898	19970248
0.5256	17.0	32436	0.4899	21217120
0.4535	18.0	34344	0.4896	22462808
0.4042	19.0	36252	0.4895	23715264
0.668	20.0	38160	0.4899	24964664

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1

Downloads last month: -

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for rbelanec/train_codealpacapy_789_1767696355

Base model

meta-llama/Meta-Llama-3-8B-Instruct

Adapter

(2400)

this model