ssc-ady-mms-model-mix-adapt-max2

This model is a fine-tuned version of facebook/mms-1b-all on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5688
  • CER: 0.3115
  • WER: 1.0067
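The character error rate (CER) and word error rate (WER) above are edit-distance-based metrics; note that WER can exceed 1.0 (as it does here, 1.0067) when the hypothesis contains more errors than the reference has words. A minimal self-contained sketch of how these metrics are computed (not the exact implementation used for this card, which likely relied on a library such as `evaluate` or `jiwer`):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    m, n = len(ref), len(hyp)
    dp = list(range(n + 1))  # one-row dynamic-programming table
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,      # deletion
                        dp[j - 1] + 1,  # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))  # substitution
            prev = cur
    return dp[n]

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: char-level edit distance / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `wer("a b c", "a x c")` is 1/3 (one substitution over three reference words), while `wer("a", "b c d")` is 3.0 — a WER above 1.0, as in this model's final evaluation.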

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 12
  • eval_batch_size: 12
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 24
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
  • mixed_precision_training: Native AMP
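Assuming the standard 🤗 Trainer setup that this card's template implies, the hyperparameters above would correspond roughly to the following `TrainingArguments`. This is a reconstruction, not the author's actual training script; `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ssc-ady-mms-model-mix-adapt-max2",  # placeholder path
    learning_rate=1e-3,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 12 * 2 = 24
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=10,
    fp16=True,  # "Native AMP" mixed-precision training
)
```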

Training results

| Training Loss | Epoch  | Step | Validation Loss | CER    | WER    |
|:-------------:|:------:|:----:|:---------------:|:------:|:------:|
| 0.6939        | 0.4132 | 200  | 0.5240          | 0.1948 | 0.8809 |
| 0.5164        | 0.8264 | 400  | 0.4230          | 0.1689 | 0.8082 |
| 0.4719        | 1.2397 | 600  | 0.3680          | 0.1512 | 0.7403 |
| 0.4443        | 1.6529 | 800  | 0.3637          | 0.1481 | 0.7297 |
| 0.4258        | 2.0661 | 1000 | 0.3406          | 0.1422 | 0.6982 |
| 0.4051        | 2.4793 | 1200 | 0.3356          | 0.1420 | 0.7092 |
| 0.3987        | 2.8926 | 1400 | 0.3173          | 0.1385 | 0.6826 |
| 0.3807        | 3.3058 | 1600 | 0.3281          | 0.1395 | 0.6910 |
| 0.3725        | 3.7190 | 1800 | 0.3255          | 0.1399 | 0.6867 |
| 0.3896        | 4.1322 | 2000 | 0.3136          | 0.1370 | 0.6771 |
| 0.3729        | 4.5455 | 2200 | 0.3116          | 0.1338 | 0.6642 |
| 0.3729        | 4.9587 | 2400 | 0.3186          | 0.1360 | 0.6683 |
| 0.3557        | 5.3719 | 2600 | 0.3153          | 0.1379 | 0.6728 |
| 0.3763        | 5.7851 | 2800 | 0.3302          | 0.1393 | 0.6819 |
| 0.3988        | 6.1983 | 3000 | 0.3593          | 0.1407 | 0.7120 |
| 0.4476        | 6.6116 | 3200 | 0.3716          | 0.1467 | 0.7161 |
| 0.4570        | 7.0248 | 3400 | 0.3781          | 0.1428 | 0.7013 |
| 0.4784        | 7.4380 | 3600 | 0.4148          | 0.1444 | 0.7197 |
| 0.7014        | 7.8512 | 3800 | 0.5790          | 0.1576 | 0.7525 |
| 1.2074        | 8.2645 | 4000 | 1.2445          | 0.2668 | 0.9548 |
| 1.6790        | 8.6777 | 4200 | 1.5282          | 0.4731 | 1.0718 |
| 1.7237        | 9.0909 | 4400 | 1.4973          | 0.4396 | 1.0677 |
| 1.6836        | 9.5041 | 4600 | 1.4758          | 0.3751 | 1.0416 |
| 1.7748        | 9.9174 | 4800 | 1.5688          | 0.3115 | 1.0067 |
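The log above shows that validation loss bottoms out at step 2200 (0.3116) and the run degrades sharply after epoch ~7, so the final checkpoint (loss 1.5688) is far from the best one seen during training. A minimal sketch of selecting the best step from the logged validation losses (values copied from the table; a real run would read them from the Trainer's `log_history` instead):

```python
# (step, validation loss) pairs copied from the results table above.
history = [
    (200, 0.5240), (400, 0.4230), (600, 0.3680), (800, 0.3637),
    (1000, 0.3406), (1200, 0.3356), (1400, 0.3173), (1600, 0.3281),
    (1800, 0.3255), (2000, 0.3136), (2200, 0.3116), (2400, 0.3186),
    (2600, 0.3153), (2800, 0.3302), (3000, 0.3593), (3200, 0.3716),
    (3400, 0.3781), (3600, 0.4148), (3800, 0.5790), (4000, 1.2445),
    (4200, 1.5282), (4400, 1.4973), (4600, 1.4758), (4800, 1.5688),
]
best_step, best_loss = min(history, key=lambda pair: pair[1])
print(best_step, best_loss)  # 2200 0.3116 — well before the final step, 4800
```

With the 🤗 Trainer, passing `load_best_model_at_end=True` together with `metric_for_best_model` and periodic evaluation/saving would keep that step-2200 checkpoint instead of the final one.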

Framework versions

  • Transformers 4.52.1
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.4
Model size: 1.0B parameters (Safetensors, F32)

Model tree for ctaguchi/ssc-ady-mms-model-mix-adapt-max2

Finetuned from facebook/mms-1b-all