ssc-bas-mms-model-mix-adapt-max2

This model is a fine-tuned version of facebook/mms-1b-all on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1138
  • CER: 0.1351
  • WER: 0.4147

Model description

More information needed

Intended uses & limitations

More information needed
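Since the card leaves usage unspecified, here is a minimal, hedged inference sketch. It assumes the checkpoint is available on the Hub under the repo id shown in this card's title and follows the standard Transformers pattern for MMS-style CTC models (`AutoProcessor` + `Wav2Vec2ForCTC`); the `transcribe` helper name is illustrative, not part of the released model.

```python
import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

# Assumed Hub repo id, taken from this model card's title.
MODEL_ID = "ctaguchi/ssc-bas-mms-model-mix-adapt-max2"

def transcribe(waveform: torch.Tensor, sampling_rate: int = 16_000,
               model_id: str = MODEL_ID) -> str:
    """Transcribe a 1-D mono waveform tensor to text via greedy CTC decoding."""
    processor = AutoProcessor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    inputs = processor(waveform.numpy(), sampling_rate=sampling_rate,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    predicted_ids = torch.argmax(logits, dim=-1)[0]
    return processor.decode(predicted_ids)
```

Audio should be resampled to 16 kHz mono before calling the helper, as that is the sampling rate MMS models expect.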

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 12
  • eval_batch_size: 12
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 24
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30
  • mixed_precision_training: Native AMP
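The relationships among these settings can be sketched in a few lines: the effective batch size is the per-device batch times the gradient-accumulation steps, and the linear schedule ramps the learning rate up over the warmup steps and then decays it to zero. The total step count used here (~5430) is an assumption inferred from the step/epoch ratio in the results table, not a value stated by the training config.

```python
# Effective batch size: per-device batch × gradient accumulation steps.
train_batch_size = 12
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 24

# Linear schedule with warmup (the shape used by lr_scheduler_type="linear"):
# LR ramps 0 -> base_lr over `warmup` steps, then decays linearly to 0.
base_lr = 1e-3
warmup = 100
total_steps = 5430  # assumption: ~181 optimizer steps/epoch × 30 epochs

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup + decay."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup))
```

Note that `total_train_batch_size: 24` in the list above is exactly this product, not an independently chosen value.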

Training results

| Training Loss | Epoch   | Step | Validation Loss | CER    | WER    |
|---------------|---------|------|-----------------|--------|--------|
| 0.3798        | 1.1053  | 200  | 0.1401          | 0.1462 | 0.4580 |
| 0.3162        | 2.2105  | 400  | 0.1278          | 0.1414 | 0.4383 |
| 0.2731        | 3.3158  | 600  | 0.1230          | 0.1410 | 0.4404 |
| 0.2536        | 4.4211  | 800  | 0.1125          | 0.1384 | 0.4328 |
| 0.2607        | 5.5263  | 1000 | 0.1155          | 0.1384 | 0.4283 |
| 0.2389        | 6.6316  | 1200 | 0.1152          | 0.1382 | 0.4274 |
| 0.2290        | 7.7368  | 1400 | 0.1230          | 0.1426 | 0.4407 |
| 0.2169        | 8.8421  | 1600 | 0.1126          | 0.1382 | 0.4253 |
| 0.2112        | 9.9474  | 1800 | 0.1158          | 0.1394 | 0.4338 |
| 0.2043        | 11.0499 | 2000 | 0.1110          | 0.1376 | 0.4241 |
| 0.1891        | 12.1551 | 2200 | 0.1121          | 0.1371 | 0.4253 |
| 0.1878        | 13.2604 | 2400 | 0.1112          | 0.1361 | 0.4171 |
| 0.1742        | 14.3657 | 2600 | 0.1073          | 0.1368 | 0.4217 |
| 0.1819        | 15.4709 | 2800 | 0.1156          | 0.1386 | 0.4310 |
| 0.1641        | 16.5762 | 3000 | 0.1097          | 0.1345 | 0.4129 |
| 0.1565        | 17.6814 | 3200 | 0.1110          | 0.1363 | 0.4214 |
| 0.1587        | 18.7867 | 3400 | 0.1117          | 0.1363 | 0.4171 |
| 0.1581        | 19.8920 | 3600 | 0.1106          | 0.1355 | 0.4162 |
| 0.1650        | 20.9972 | 3800 | 0.1126          | 0.1356 | 0.4174 |
| 0.1365        | 22.0997 | 4000 | 0.1108          | 0.1346 | 0.4117 |
| 0.1338        | 23.2050 | 4200 | 0.1126          | 0.1353 | 0.4138 |
| 0.1307        | 24.3102 | 4400 | 0.1127          | 0.1363 | 0.4174 |
| 0.1374        | 25.4155 | 4600 | 0.1161          | 0.1362 | 0.4201 |
| 0.1251        | 26.5208 | 4800 | 0.1154          | 0.1352 | 0.4138 |
| 0.1305        | 27.6260 | 5000 | 0.1142          | 0.1352 | 0.4144 |
| 0.1297        | 28.7313 | 5200 | 0.1146          | 0.1357 | 0.4168 |
| 0.1230        | 29.8366 | 5400 | 0.1138          | 0.1351 | 0.4147 |
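For readers unfamiliar with the metrics in the table: WER and CER are Levenshtein edit distance (substitutions, insertions, deletions) over words and characters respectively, normalized by reference length, so a final WER of 0.4147 means roughly 41% of reference words needed an edit. A minimal self-contained sketch (the helper names are illustrative; evaluation libraries such as `jiwer` are normally used instead):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, via rolling-row DP."""
    n = len(hyp)
    prev = list(range(n + 1))
    for i in range(1, len(ref) + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution (or match)
        prev = cur
    return prev[n]

def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level edits / reference word count."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

def cer(ref: str, hyp: str) -> float:
    """Character error rate: character-level edits / reference length."""
    return edit_distance(list(ref), list(hyp)) / len(ref)
```

For example, `wer("a b c d", "a x c")` is 0.5: one substitution plus one deletion against four reference words.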

Framework versions

  • Transformers 4.52.1
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.4
Model size

  • 1.0B params (F32, Safetensors)

Model repository: ctaguchi/ssc-bas-mms-model-mix-adapt-max2, finetuned from facebook/mms-1b-all.