SNAC-Denoiser-LLaMA-500M-snac_v3_test_1gpu

This model is a fine-tuned version of an unspecified base model on an unknown dataset. Evaluation-set results are not listed in this card; the last validation loss logged during training is 7.4929 at step 6200.

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
gradient_accumulation_steps: 16
total_train_batch_size: 16
optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 124
training_steps: 6248
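
To make the interaction between these values concrete, here is a minimal sketch of the effective batch size and the learning-rate curve they imply. It assumes a single device (suggested by the "1gpu" suffix in the model name) and the common half-cycle cosine decay with linear warmup; the card does not state the exact scheduler implementation, and the function names `effective_batch_size` and `lr_at_step` are hypothetical helpers introduced only for illustration.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 16
WARMUP_STEPS = 124
TRAINING_STEPS = 6248


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def lr_at_step(step: int,
               base_lr: float = LEARNING_RATE,
               warmup: int = WARMUP_STEPS,
               total: int = TRAINING_STEPS) -> float:
    """Linear warmup to base_lr, then half-cycle cosine decay to zero (assumed shape)."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for s in (0, 62, 124, 3186, 6248):
        print(f"step {s:>4}: lr = {lr_at_step(s):.3e}")
```

Under these assumptions the effective batch size matches the listed total_train_batch_size of 16, the learning rate peaks at 1e-05 when warmup ends at step 124, and it decays to zero at step 6248.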

Training Loss	Epoch	Step	Validation Loss
9.2855	0.0160	100	9.2853
9.1344	0.0320	200	9.1912
9.0779	0.0480	300	9.1123
8.8923	0.0640	400	8.8849
8.6169	0.0800	500	8.6110
8.4296	0.0960	600	8.4493
8.3297	0.1120	700	8.3578
8.2512	0.1280	800	8.2896
8.1568	0.1440	900	8.2190
8.0756	0.1600	1000	8.1627
8.0368	0.1760	1100	8.1142
7.9843	0.1920	1200	8.0759
7.9684	0.2080	1300	8.0442
7.919	0.2240	1400	8.0067
7.897	0.2400	1500	7.9785
7.8377	0.2560	1600	7.9468
7.8362	0.2720	1700	7.9263
7.7771	0.2881	1800	7.8941
7.7576	0.3041	1900	7.8713
7.7211	0.3201	2000	7.8481
7.7182	0.3361	2100	7.8241
7.7132	0.3521	2200	7.8073
7.675	0.3681	2300	7.7893
7.6257	0.3841	2400	7.7664
7.6289	0.4001	2500	7.7529
7.6152	0.4161	2600	7.7354
7.5542	0.4321	2700	7.7168
7.551	0.4481	2800	7.7024
7.5289	0.4641	2900	7.6855
7.5265	0.4801	3000	7.6714
7.4856	0.4961	3100	7.6539
7.4539	0.5121	3200	7.6411
7.462	0.5281	3300	7.6277
7.4749	0.5441	3400	7.6173
7.4562	0.5601	3500	7.6065
7.4682	0.5761	3600	7.5937
7.4372	0.5921	3700	7.5834
7.389	0.6081	3800	7.5721
7.3654	0.6241	3900	7.5634
7.3942	0.6401	4000	7.5573
7.4089	0.6561	4100	7.5477
7.3928	0.6721	4200	7.5431
7.3939	0.6881	4300	7.5341
7.3677	0.7041	4400	7.5271
7.3579	0.7201	4500	7.5234
7.3494	0.7361	4600	7.5187
7.3404	0.7521	4700	7.5138
7.3378	0.7681	4800	7.5102
7.3622	0.7841	4900	7.5077
7.3294	0.8001	5000	7.5056
7.3326	0.8161	5100	7.5024
7.3444	0.8321	5200	7.4992
7.3385	0.8482	5300	7.4995
7.3636	0.8642	5400	7.4961
7.3138	0.8802	5500	7.4957
7.3213	0.8962	5600	7.4956
7.3541	0.9122	5700	7.4941
7.2924	0.9282	5800	7.4938
7.3449	0.9442	5900	7.4931
7.347	0.9602	6000	7.4931
7.2718	0.9762	6100	7.4930
7.3641	0.9922	6200	7.4929
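
The Epoch and Step columns above can be cross-checked with a little arithmetic. The sketch below estimates the number of optimizer steps per epoch and the approximate number of training examples, and converts a validation loss to perplexity. It assumes the loss is a mean token-level cross-entropy in nats, which the card does not confirm; the helper names and the choice of sample rows are illustrative only.

```python
import math

# (step, epoch, validation_loss) rows copied from the table above.
ROWS = [
    (100, 0.0160, 9.2853),
    (3100, 0.4961, 7.6539),
    (6200, 0.9922, 7.4929),
]
TOTAL_TRAIN_BATCH_SIZE = 16  # from the hyperparameter list


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps in one full epoch from a single table row."""
    return step / epoch


def approx_train_examples(step: int, epoch: float,
                          batch_size: int = TOTAL_TRAIN_BATCH_SIZE) -> float:
    """Rough training-set size implied by the epoch fraction (rounding error applies)."""
    return steps_per_epoch(step, epoch) * batch_size


def perplexity(loss: float) -> float:
    """exp(loss); meaningful only if the loss is cross-entropy in nats (assumption)."""
    return math.exp(loss)


if __name__ == "__main__":
    for step, epoch, val_loss in ROWS:
        print(f"step {step:>4}: ~{steps_per_epoch(step, epoch):.0f} steps/epoch, "
              f"~{approx_train_examples(step, epoch):.0f} examples, "
              f"perplexity ~{perplexity(val_loss):.0f}")
```

The estimates from different rows agree to within the rounding of the Epoch column, and they are consistent with the 6248 training steps covering roughly one epoch.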

Safetensors: model size 0.5B params, tensor type F32