train_codealpacapy_42_1760705166

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset (a loading sketch follows the results list). It achieves the following results on the evaluation set:

  • Loss: 0.4436
  • Num Input Tokens Seen: 24887720
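Because the checkpoint is a PEFT adapter rather than a full model, it is loaded on top of the base checkpoint. A minimal inference sketch, assuming the adapter is hosted at rbelanec/train_codealpacapy_42_1760705166 and that you have access to the gated meta-llama base model; the prompt is purely illustrative:

```python
# Load the gated Llama 3 base model, then apply this adapter with PEFT.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_42_1760705166"  # assumed hub location

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

# Illustrative prompt; the adapter was tuned on code-instruction data.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```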

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
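For reference, a minimal sketch of an equivalent Hugging Face TrainingArguments configuration. Argument names follow transformers 4.51; output_dir and anything not in the list above are assumptions, not details recovered from the run:

```python
# Hedged reconstruction of the training configuration listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_codealpacapy_42_1760705166",  # assumed, not from the run
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```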

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.4698        | 1.0   | 1908  | 0.4576          | 1243048           |
| 0.3335        | 2.0   | 3816  | 0.4475          | 2489456           |
| 0.3801        | 3.0   | 5724  | 0.4436          | 3733736           |
| 0.4428        | 4.0   | 7632  | 0.4494          | 4976128           |
| 0.4908        | 5.0   | 9540  | 0.4700          | 6219592           |
| 0.3863        | 6.0   | 11448 | 0.5028          | 7467968           |
| 0.2981        | 7.0   | 13356 | 0.5452          | 8709168           |
| 0.2559        | 8.0   | 15264 | 0.6192          | 9958360           |
| 0.2353        | 9.0   | 17172 | 0.6835          | 11204000          |
| 0.1702        | 10.0  | 19080 | 0.7745          | 12446408          |
| 0.0753        | 11.0  | 20988 | 0.9030          | 13691904          |
| 0.0887        | 12.0  | 22896 | 1.0084          | 14937216          |
| 0.0529        | 13.0  | 24804 | 1.1341          | 16179624          |
| 0.0328        | 14.0  | 26712 | 1.2237          | 17425368          |
| 0.0106        | 15.0  | 28620 | 1.3184          | 18668536          |
| 0.0104        | 16.0  | 30528 | 1.4355          | 19916008          |
| 0.0065        | 17.0  | 32436 | 1.5085          | 21158120          |
| 0.0074        | 18.0  | 34344 | 1.5513          | 22400368          |
| 0.0034        | 19.0  | 36252 | 1.5742          | 23645440          |
| 0.0014        | 20.0  | 38160 | 1.5826          | 24887720          |
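Validation loss reaches its minimum of 0.4436 at epoch 3 (the value reported above) and rises steadily afterwards, a typical overfitting pattern. A sketch of how a rerun could retain the best checkpoint and stop once validation stops improving; the patience value, output_dir, and the model/dataset objects are assumptions, not details of the original run:

```python
# Sketch: keep the lowest-eval-loss checkpoint and stop early on plateau.
# Assumes `model`, `train_dataset`, and `eval_dataset` are built as in the
# original run; they are placeholders here.
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="train_codealpacapy_42_1760705166",  # assumed
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,        # restore the lowest-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    num_train_epochs=20,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience is an assumption
)
trainer.train()
```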

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1