train_codealpacapy_42_1760729930

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

Loss: 0.4686
Num Input Tokens Seen: 24887720

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.7231	1.0	1908	0.6796	1243048
0.4163	2.0	3816	0.5497	2489456
0.4516	3.0	5724	0.5161	3733736
0.5496	4.0	7632	0.5008	4976128
0.6014	5.0	9540	0.4919	6219592
0.6577	6.0	11448	0.4860	7467968
0.515	7.0	13356	0.4815	8709168
0.5101	8.0	15264	0.4783	9958360
0.4895	9.0	17172	0.4756	11204000
0.5218	10.0	19080	0.4736	12446408
0.4234	11.0	20988	0.4722	13691904
0.4803	12.0	22896	0.4711	14937216
0.6942	13.0	24804	0.4704	16179624
0.4349	14.0	26712	0.4696	17425368
0.3951	15.0	28620	0.4692	18668536
0.5009	16.0	30528	0.4689	19916008
0.3044	17.0	32436	0.4687	21158120
0.5018	18.0	34344	0.4686	22400368
0.4636	19.0	36252	0.4686	23645440
0.3769	20.0	38160	0.4686	24887720

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1

Downloads last month: -

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for rbelanec/train_codealpacapy_42_1760729930

Base model

meta-llama/Meta-Llama-3-8B-Instruct

Adapter

(2402)

this model