train_codealpacapy_42_1760719585

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

Loss: 0.4525
Num Input Tokens Seen: 24887720

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
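
To make the learning-rate settings above concrete, here is a minimal sketch of a cosine schedule with linear warmup, written in plain Python. It follows the commonly used half-cosine formula. The total step count (38160) is taken from the last row of the training results table, and the rounding used for the warmup length is an assumption, so the actual trainer may differ by a step.

```python
import math

BASE_LR = 5e-05       # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the hyperparameter list
TOTAL_STEPS = 38160   # final step reported in the training results table

# Assumption: warmup length is the rounded product of ratio and total steps.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (illustrative, not the trainer's code)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR over the warmup steps.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Half-cosine decay from BASE_LR down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, 1908, WARMUP_STEPS, 19080, TOTAL_STEPS):
        print(step, lr_at(step))
```

Under these assumptions the warmup covers roughly the first two epochs (3816 steps), after which the rate decays toward zero by the final step.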

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.5784	1.0	1908	0.5529	1243048
0.3809	2.0	3816	0.4983	2489456
0.4314	3.0	5724	0.4791	3733736
0.5146	4.0	7632	0.4709	4976128
0.5703	5.0	9540	0.4652	6219592
0.616	6.0	11448	0.4614	7467968
0.487	7.0	13356	0.4585	8709168
0.4804	8.0	15264	0.4569	9958360
0.4684	9.0	17172	0.4552	11204000
0.4914	10.0	19080	0.4542	12446408
0.3796	11.0	20988	0.4539	13691904
0.4404	12.0	22896	0.4535	14937216
0.6154	13.0	24804	0.4534	16179624
0.4073	14.0	26712	0.4528	17425368
0.3595	15.0	28620	0.4529	18668536
0.4605	16.0	30528	0.4528	19916008
0.2781	17.0	32436	0.4528	21158120
0.4609	18.0	34344	0.4528	22400368
0.4204	19.0	36252	0.4525	23645440
0.3394	20.0	38160	0.4527	24887720
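
The table can be summarized with a short script. The sketch below copies the epoch, step, and validation-loss columns from the table, finds the epoch with the lowest validation loss, and compares the improvement in the first and second halves of training. The examples-per-epoch figure assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
# (epoch, step, validation_loss) copied from the training results table
ROWS = [
    (1, 1908, 0.5529), (2, 3816, 0.4983), (3, 5724, 0.4791), (4, 7632, 0.4709),
    (5, 9540, 0.4652), (6, 11448, 0.4614), (7, 13356, 0.4585), (8, 15264, 0.4569),
    (9, 17172, 0.4552), (10, 19080, 0.4542), (11, 20988, 0.4539), (12, 22896, 0.4535),
    (13, 24804, 0.4534), (14, 26712, 0.4528), (15, 28620, 0.4529), (16, 30528, 0.4528),
    (17, 32436, 0.4528), (18, 34344, 0.4528), (19, 36252, 0.4525), (20, 38160, 0.4527),
]
TRAIN_BATCH_SIZE = 4  # train_batch_size from the hyperparameter list

# Row with the lowest validation loss.
best_epoch, best_step, best_loss = min(ROWS, key=lambda row: row[2])

# Optimizer steps per epoch, read off the first row.
steps_per_epoch = ROWS[0][1]

# Assumption: one device and no gradient accumulation, so each step sees one batch.
examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

# Validation-loss reduction over epochs 1-10 versus epochs 10-20.
early_gain = ROWS[0][2] - ROWS[9][2]
late_gain = ROWS[9][2] - ROWS[19][2]

if __name__ == "__main__":
    print("best epoch:", best_epoch, "loss:", best_loss)
    print("steps per epoch:", steps_per_epoch)
    print("examples per epoch (assumed):", examples_per_epoch)
    print("early vs late gain:", round(early_gain, 4), round(late_gain, 4))
```

The minimum falls at epoch 19 (0.4525), which matches the evaluation loss reported at the top of this card. Nearly all of the improvement occurs in the first ten epochs; validation loss is essentially flat afterwards.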

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1
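
Since this model was trained with PEFT on top of meta-llama/Meta-Llama-3-8B-Instruct, loading it would typically mean loading the base model and then attaching the adapter weights. The following is a minimal sketch, not a tested recipe. It assumes the adapter is published as rbelanec/train_codealpacapy_42_1760719585, that you have access to the base model, and that the tokenizer is taken from the base model. The helper name `load_adapter` is hypothetical.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_codealpacapy_42_1760719585"  # assumed repository id


def load_adapter(adapter_id: str = ADAPTER_ID, base_id: str = BASE_MODEL_ID):
    """Hypothetical helper: load the base model and attach the PEFT adapter."""
    # Imported lazily so that defining this helper does not require the libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the adapter reuses the base model's tokenizer unchanged.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)

    # Wrap the base model with the adapter weights downloaded from adapter_id.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return tokenizer, model
```

Calling `load_adapter()` downloads the full 8B base model, so it needs network access, sufficient memory, and acceptance of the base model's license terms.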

Model tree for rbelanec/train_codealpacapy_42_1760719585

Base model: meta-llama/Meta-Llama-3-8B-Instruct
This model: adapter