ssc-ttj-mms-model

This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2494
  • Cer: 0.0795
  • Wer: 0.4288
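CER and WER are both edit-distance metrics: the number of character-level (or word-level) insertions, deletions, and substitutions needed to turn the model's hypothesis into the reference, divided by the reference length. A minimal sketch of how they are computed (a hypothetical helper for illustration, not the evaluation code used for this model):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (one-row DP)."""
    n = len(hyp)
    dp = list(range(n + 1))
    for i in range(1, len(ref) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                              # deletion
                        dp[j - 1] + 1,                          # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))      # substitution
            prev = cur
    return dp[n]

def wer(ref, hyp):
    """Word error rate: word-level edits / reference word count."""
    r, h = ref.split(), hyp.split()
    return edit_distance(r, h) / len(r)

def cer(ref, hyp):
    """Character error rate: character-level edits / reference length."""
    return edit_distance(ref, hyp) / len(ref)
```

A WER of 0.4288 therefore means roughly 43 word-level edits per 100 reference words; the much lower CER (0.0795) indicates that most word errors differ from the reference by only a character or two.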

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 12
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
  • mixed_precision_training: Native AMP
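The linear scheduler with warmup ramps the learning rate from 0 to 3e-4 over the first 100 optimizer steps, then decays it linearly back to 0 by the end of training. A sketch of that schedule (`total_steps` here is a placeholder, not a value taken from this run):

```python
def linear_warmup_lr(step, base_lr=3e-4, warmup_steps=100, total_steps=8000):
    """Linear warmup to base_lr, then linear decay to zero.

    Mirrors the shape of a standard linear-with-warmup schedule;
    total_steps is an assumed placeholder for illustration.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # Linear decay from base_lr (at warmup_steps) down to 0 (at total_steps).
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

Note also that with train_batch_size=8 and gradient_accumulation_steps=2, gradients are accumulated over two forward passes before each optimizer step, giving the effective total_train_batch_size of 16 listed above.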

Training results

| Training Loss | Epoch  | Step | Validation Loss | Cer    | Wer    |
|---------------|--------|------|-----------------|--------|--------|
| 1.0495        | 0.2463 | 200  | 0.3003          | 0.0858 | 0.4553 |
| 1.0160        | 0.4926 | 400  | 0.2813          | 0.0833 | 0.4464 |
| 1.0172        | 0.7389 | 600  | 0.2781          | 0.0829 | 0.4442 |
| 1.0097        | 0.9852 | 800  | 0.2720          | 0.0822 | 0.4393 |
| 0.9674        | 1.2315 | 1000 | 0.2654          | 0.0817 | 0.4393 |
| 0.9503        | 1.4778 | 1200 | 0.2619          | 0.0817 | 0.4368 |
| 0.9384        | 1.7241 | 1400 | 0.2598          | 0.0809 | 0.4347 |
| 0.9441        | 1.9704 | 1600 | 0.2577          | 0.0808 | 0.4355 |
| 1.0321        | 2.2167 | 1800 | 0.2553          | 0.0805 | 0.4320 |
| 0.9490        | 2.4631 | 2000 | 0.2550          | 0.0803 | 0.4326 |
| 0.9281        | 2.7094 | 2200 | 0.2542          | 0.0807 | 0.4319 |
| 0.9555        | 2.9557 | 2400 | 0.2541          | 0.0804 | 0.4311 |
| 0.9329        | 3.2020 | 2600 | 0.2535          | 0.0804 | 0.4319 |
| 0.9786        | 3.4483 | 2800 | 0.2548          | 0.0802 | 0.4314 |
| 0.9575        | 3.6946 | 3000 | 0.2531          | 0.0802 | 0.4317 |
| 0.9381        | 3.9409 | 3200 | 0.2519          | 0.0801 | 0.4300 |
| 0.9561        | 4.1872 | 3400 | 0.2526          | 0.0798 | 0.4292 |
| 0.9450        | 4.4335 | 3600 | 0.2530          | 0.0804 | 0.4324 |
| 0.9401        | 4.6798 | 3800 | 0.2518          | 0.0794 | 0.4265 |
| 0.9260        | 4.9261 | 4000 | 0.2544          | 0.0806 | 0.4339 |
| 0.9320        | 5.1724 | 4200 | 0.2509          | 0.0797 | 0.4293 |
| 0.9490        | 5.4187 | 4400 | 0.2500          | 0.0797 | 0.4314 |
| 0.9510        | 5.6650 | 4600 | 0.2505          | 0.0800 | 0.4332 |
| 0.9221        | 5.9113 | 4800 | 0.2488          | 0.0795 | 0.4295 |
| 0.9168        | 6.1576 | 5000 | 0.2491          | 0.0796 | 0.4293 |
| 0.9188        | 6.4039 | 5200 | 0.2503          | 0.0797 | 0.4314 |
| 0.8848        | 6.6502 | 5400 | 0.2489          | 0.0793 | 0.4294 |
| 0.9561        | 6.8966 | 5600 | 0.2480          | 0.0794 | 0.4284 |
| 0.9438        | 7.1429 | 5800 | 0.2506          | 0.0798 | 0.4305 |
| 0.9402        | 7.3892 | 6000 | 0.2508          | 0.0797 | 0.4292 |
| 0.9369        | 7.6355 | 6200 | 0.2507          | 0.0794 | 0.4278 |
| 0.9201        | 7.8818 | 6400 | 0.2504          | 0.0792 | 0.4281 |
| 0.9243        | 8.1281 | 6600 | 0.2500          | 0.0794 | 0.4284 |
| 0.9347        | 8.3744 | 6800 | 0.2484          | 0.0792 | 0.4265 |
| 0.9072        | 8.6207 | 7000 | 0.2494          | 0.0792 | 0.4275 |
| 0.9002        | 8.8670 | 7200 | 0.2493          | 0.0794 | 0.4282 |
| 0.8880        | 9.1133 | 7400 | 0.2489          | 0.0793 | 0.4272 |
| 0.9198        | 9.3596 | 7600 | 0.2495          | 0.0797 | 0.4292 |
| 0.9160        | 9.6059 | 7800 | 0.2495          | 0.0794 | 0.4281 |
| 0.9148        | 9.8522 | 8000 | 0.2494          | 0.0795 | 0.4288 |

Framework versions

  • Transformers 4.57.2
  • Pytorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.0