ssc-bas-mms-model-mix-adapt-max

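This model is a fine-tuned version of facebook/mms-1b-all. The training dataset is not specified. At the last logged evaluation (step 7000, epoch 29.5370) it achieves the following results on the evaluation set:

Loss: 0.1911
CER: 0.1387
WER: 0.4253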

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused AdamW) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 30
mixed_precision_training: Native AMP
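
To make the schedule concrete, here is a minimal sketch of how the learning rate evolves under a linear scheduler with 100 warmup steps, and how the total train batch size of 16 follows from the per-device batch size and gradient accumulation. The value of `total_steps` (7110) is an assumption inferred from roughly 237 optimizer steps per epoch over 30 epochs, and `num_devices = 1` is also an assumption; neither is stated in this card.

```python
def linear_lr(step, base_lr=1e-3, warmup_steps=100, total_steps=7110):
    """Learning rate at optimizer step `step`: linear warmup, then linear decay to 0.

    total_steps=7110 is an assumed value (about 237 steps/epoch * 30 epochs),
    not a figure reported in this card.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


train_batch_size = 8
gradient_accumulation_steps = 2
num_devices = 1  # assumption: single-device training

# Samples contributing to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    for s in (0, 50, 100, 3600, 7000):
        print(s, linear_lr(s))
```

Under these assumptions the rate peaks at 0.001 at step 100 and is already close to zero by the last logged step (7000), which fits the small changes in validation loss over the final epochs.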

Training Loss	Epoch	Step	Validation Loss	CER	WER
1.8024	0.8457	200	0.9549	0.3456	0.8364
0.6265	1.6892	400	0.3335	0.1724	0.5245
0.4726	2.5328	600	0.3012	0.1605	0.4955
0.4104	3.3763	800	0.2447	0.1556	0.4743
0.3665	4.2199	1000	0.2375	0.1521	0.4667
0.3718	5.0634	1200	0.2157	0.1533	0.4707
0.3332	5.9091	1400	0.2230	0.1497	0.4598
0.3031	6.7526	1600	0.2056	0.1482	0.4495
0.3009	7.5962	1800	0.2162	0.1467	0.4528
0.2726	8.4397	2000	0.2005	0.1468	0.4504
0.2568	9.2833	2200	0.2253	0.1439	0.4401
0.2519	10.1268	2400	0.2007	0.1482	0.4640
0.2307	10.9725	2600	0.1955	0.1435	0.4359
0.2312	11.8161	2800	0.2008	0.1433	0.4428
0.2150	12.6596	3000	0.1976	0.1465	0.4495
0.2181	13.5032	3200	0.1902	0.1442	0.4404
0.1896	14.3467	3400	0.1968	0.1414	0.4319
0.1955	15.1903	3600	0.2035	0.1424	0.4383
0.1939	16.0338	3800	0.1936	0.1429	0.4316
0.1782	16.8795	4000	0.2116	0.1422	0.4398
0.1800	17.7230	4200	0.1914	0.1421	0.4319
0.1682	18.5666	4400	0.2093	0.1433	0.4383
0.1640	19.4101	4600	0.1884	0.1427	0.4359
0.1609	20.2537	4800	0.2052	0.1420	0.4374
0.1469	21.0973	5000	0.1962	0.1400	0.4292
0.1404	21.9429	5200	0.1941	0.1407	0.4283
0.1434	22.7865	5400	0.1969	0.1412	0.4332
0.1465	23.6300	5600	0.1920	0.1397	0.4247
0.1295	24.4736	5800	0.1923	0.1401	0.4298
0.1320	25.3171	6000	0.1999	0.1395	0.4247
0.1210	26.1607	6200	0.1948	0.1405	0.4304
0.1387	27.0042	6400	0.1953	0.1385	0.4256
0.1260	27.8499	6600	0.1912	0.1383	0.4241
0.1143	28.6934	6800	0.1917	0.1396	0.4265
0.1147	29.5370	7000	0.1911	0.1387	0.4253
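
For readers unfamiliar with the CER and WER columns, the sketch below shows how character error rate and word error rate are commonly defined: the Levenshtein edit distance between reference and hypothesis, divided by the reference length in characters or words. This card does not say which library or text normalization produced the reported numbers, so the helper names (`edit_distance`, `wer`, `cer`) are illustrative and the code is not the evaluation script used here. It scores a single utterance and assumes a non-empty reference.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for one utterance (reference must be non-empty)."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for one utterance (reference must be non-empty)."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref, hyp = "the cat sat", "the cat sit"
    print("WER:", wer(ref, hyp))
    print("CER:", cer(ref, hyp))
```

A single wrong character counts as one full word error, which is why a WER near 0.42 can coexist with a CER near 0.14 in the table above.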

Model size: 1.0B params
Tensor type: F32
Format: Safetensors
