train_record_42_1767887029

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set:

Loss: 0.3422
Num Input Tokens Seen: 437806496

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10
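
The warmup ratio and the cosine scheduler together determine the learning rate at every step. The following is a minimal sketch of that curve, not the training code itself. It assumes the schedule spans the 624840 steps reported in the training results, that warmup length is the ratio times the total step count, and that the shape is linear warmup followed by a half-cosine decay to zero (the usual behaviour of a cosine schedule with warmup). The name `lr_at` is hypothetical, and the exact rounding of warmup steps used in the run is not stated in this card.

```python
import math

PEAK_LR = 5e-05        # learning_rate from the card
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 624840   # last step in the results table (assumed schedule length)

# Assumption: warmup length is the ratio times the total number of steps.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the rate at the start, end of warmup, midpoint, and final step.
    for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, lr_at(step))
```

Under these assumptions warmup ends at step 62484, which is the same step the results table lists for epoch 1.0, so the peak rate of 5e-05 would be reached at roughly the end of the first epoch.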

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.5063	0.5	31242	0.4476	21898848
0.1726	1.0	62484	0.3965	43776656
0.2595	1.5	93726	0.3683	65652560
0.4387	2.0	124968	0.3669	87565488
0.2084	2.5	156210	0.3661	109452304
0.1666	3.0	187452	0.3422	131339088
0.4472	3.5	218694	0.3466	153229552
0.4956	4.0	249936	0.3591	175123504
0.3682	4.5	281178	0.3471	197013152
0.2085	5.0	312420	0.3514	218911872
0.1899	5.5	343662	0.3630	240801840
0.1348	6.0	374904	0.3488	262683472
0.3945	6.5	406146	0.3498	284576304
0.1822	7.0	437388	0.3578	306462144
0.3186	7.5	468630	0.3653	328347904
0.2993	8.0	499872	0.3467	350243504
0.6740	8.5	531114	0.3593	372129760
0.5828	9.0	562356	0.3585	394023312
0.5389	9.5	593598	0.3613	415924976
0.3445	10.0	624840	0.3607	437806496
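
The validation loss does not fall monotonically, so it can help to locate the best evaluation point programmatically. The sketch below copies the Epoch and Validation Loss columns from the table above into a list and picks the minimum. The names `EVAL_LOG` and `best_checkpoint` are hypothetical helpers for illustration.

```python
# (epoch, validation_loss) pairs copied from the training results table.
EVAL_LOG = [
    (0.5, 0.4476), (1.0, 0.3965), (1.5, 0.3683), (2.0, 0.3669),
    (2.5, 0.3661), (3.0, 0.3422), (3.5, 0.3466), (4.0, 0.3591),
    (4.5, 0.3471), (5.0, 0.3514), (5.5, 0.3630), (6.0, 0.3488),
    (6.5, 0.3498), (7.0, 0.3578), (7.5, 0.3653), (8.0, 0.3467),
    (8.5, 0.3593), (9.0, 0.3585), (9.5, 0.3613), (10.0, 0.3607),
]


def best_checkpoint(log):
    """Return the (epoch, validation_loss) pair with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


best_epoch, best_loss = best_checkpoint(EVAL_LOG)
final_epoch, final_loss = EVAL_LOG[-1]

if __name__ == "__main__":
    print(f"best:  epoch {best_epoch}, validation loss {best_loss}")
    print(f"final: epoch {final_epoch}, validation loss {final_loss}")
```

The minimum is 0.3422 at epoch 3.0, which matches the loss reported at the top of this card, while the final evaluation at epoch 10.0 is 0.3607. This suggests, though the card does not state it, that the reported figure comes from the best evaluation point and not the last one.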

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.1+cu128
Datasets 4.0.0
Tokenizers 0.21.4
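
To reproduce the environment, it may be useful to compare the installed packages against the versions listed above. The sketch below uses only the standard library. The distribution names (`peft`, `transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI names, and `version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; keys are the assumed PyPI distribution names.
EXPECTED = {
    "peft": "0.17.1",
    "transformers": "4.51.3",
    "torch": "2.9.1+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}


def version_mismatches(expected):
    """Return {package: (expected, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in version_mismatches(EXPECTED).items():
        print(f"{name}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the adapter will fail to load; the check only reports where the local environment differs from the one recorded here.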

Model tree for rbelanec/train_record_42_1767887029

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model
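
Because this repository is a PEFT adapter on top of the base model, loading it typically means loading the base model first and then attaching the adapter. The following is a minimal sketch, assuming the adapter repository contains a standard PEFT adapter configuration, that you have been granted access to the base model, and that enough memory is available for an 8B-parameter model. The helper name `load_adapter` is hypothetical, and the card does not specify dtype, device placement, or prompt format.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_record_42_1767887029"


def load_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, attach the adapter, and return (model, tokenizer)."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

Calling `load_adapter()` downloads both repositories on first use, so it needs network access and Hugging Face credentials with permission for the base model.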