ssc-sco-mms-model

This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7277
  • CER (character error rate): 0.2065
  • WER (word error rate): 0.5486
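Both CER and WER are edit-distance-based metrics: the number of character- or word-level insertions, deletions, and substitutions needed to turn the hypothesis into the reference, divided by the reference length. As a minimal sketch of how such scores are computed (not the exact evaluation code used for this model):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (single-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            # insertion, deletion, substitution/match
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (r != h))
    return dp[-1]

def cer(ref, hyp):
    """Character error rate: edit distance over characters."""
    return edit_distance(list(ref), list(hyp)) / len(ref)

def wer(ref, hyp):
    """Word error rate: edit distance over whitespace-split tokens."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())
```

A WER of 0.5486 therefore means that, on average, roughly 55 word-level edits are needed per 100 reference words.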

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 12
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
  • mixed_precision_training: Native AMP
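Note that with gradient_accumulation_steps=2, the effective batch size is 8 × 2 = 16, matching total_train_batch_size. The linear scheduler with 100 warmup steps ramps the learning rate from 0 up to 3e-4, then decays it linearly back to 0 over the remaining steps. A sketch of that schedule, assuming 8,000 total optimizer steps as shown in the results table (this mirrors the behavior of a linear-with-warmup scheduler, not the training code itself):

```python
PEAK_LR = 3e-4   # learning_rate above
WARMUP = 100     # lr_scheduler_warmup_steps above
TOTAL = 8000     # total optimizer steps, taken from the results table

def lr_at(step):
    """Linear warmup to PEAK_LR over WARMUP steps, then linear decay to 0."""
    if step < WARMUP:
        return PEAK_LR * step / WARMUP
    return PEAK_LR * max(0, TOTAL - step) / (TOTAL - WARMUP)
```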

Training results

| Training Loss | Epoch  | Step | Validation Loss | CER    | WER    |
|:-------------:|:------:|:----:|:---------------:|:------:|:------:|
| 2.0529        | 0.2491 | 200  | 0.9213          | 0.2438 | 0.6057 |
| 1.9092        | 0.4981 | 400  | 0.8217          | 0.2253 | 0.5961 |
| 1.85          | 0.7472 | 600  | 0.7991          | 0.2205 | 0.5779 |
| 1.7687        | 0.9963 | 800  | 0.7940          | 0.2199 | 0.5727 |
| 1.8151        | 1.2453 | 1000 | 0.7970          | 0.2182 | 0.5831 |
| 1.7891        | 1.4944 | 1200 | 0.7940          | 0.2177 | 0.5835 |
| 1.7512        | 1.7435 | 1400 | 0.7717          | 0.2149 | 0.5776 |
| 1.7742        | 1.9925 | 1600 | 0.7779          | 0.2149 | 0.5780 |
| 1.7784        | 2.2416 | 1800 | 0.7787          | 0.2140 | 0.5807 |
| 1.7212        | 2.4907 | 2000 | 0.7709          | 0.2133 | 0.5745 |
| 1.7672        | 2.7397 | 2200 | 0.7663          | 0.2138 | 0.5739 |
| 1.7632        | 2.9888 | 2400 | 0.7594          | 0.2104 | 0.5667 |
| 1.7459        | 3.2379 | 2600 | 0.7738          | 0.2125 | 0.5743 |
| 1.7162        | 3.4869 | 2800 | 0.7740          | 0.2114 | 0.5780 |
| 1.7718        | 3.7360 | 3000 | 0.7873          | 0.2145 | 0.5867 |
| 1.7002        | 3.9851 | 3200 | 0.7640          | 0.2104 | 0.5758 |
| 1.7167        | 4.2341 | 3400 | 0.7606          | 0.2121 | 0.5738 |
| 1.7429        | 4.4832 | 3600 | 0.7545          | 0.2100 | 0.5653 |
| 1.7338        | 4.7323 | 3800 | 0.7525          | 0.2092 | 0.5639 |
| 1.7302        | 4.9813 | 4000 | 0.7529          | 0.2088 | 0.5615 |
| 1.6996        | 5.2304 | 4200 | 0.7480          | 0.2090 | 0.5634 |
| 1.7655        | 5.4795 | 4400 | 0.7445          | 0.2083 | 0.5649 |
| 1.7414        | 5.7285 | 4600 | 0.7449          | 0.2091 | 0.5617 |
| 1.6881        | 5.9776 | 4800 | 0.7520          | 0.2107 | 0.5675 |
| 1.7261        | 6.2267 | 5000 | 0.7571          | 0.2097 | 0.5730 |
| 1.7112        | 6.4757 | 5200 | 0.7438          | 0.2072 | 0.5601 |
| 1.7348        | 6.7248 | 5400 | 0.7438          | 0.2068 | 0.5573 |
| 1.6992        | 6.9738 | 5600 | 0.7372          | 0.2068 | 0.5573 |
| 1.7178        | 7.2229 | 5800 | 0.7350          | 0.2066 | 0.5545 |
| 1.7061        | 7.4720 | 6000 | 0.7332          | 0.2072 | 0.5554 |
| 1.6895        | 7.7210 | 6200 | 0.7353          | 0.2079 | 0.5571 |
| 1.7274        | 7.9701 | 6400 | 0.7317          | 0.2072 | 0.5534 |
| 1.7295        | 8.2192 | 6600 | 0.7314          | 0.2068 | 0.5517 |
| 1.6925        | 8.4682 | 6800 | 0.7308          | 0.2068 | 0.5525 |
| 1.7261        | 8.7173 | 7000 | 0.7280          | 0.2064 | 0.5485 |
| 1.762         | 8.9664 | 7200 | 0.7271          | 0.2067 | 0.5466 |
| 1.6809        | 9.2154 | 7400 | 0.7276          | 0.2068 | 0.5477 |
| 1.7149        | 9.4645 | 7600 | 0.7271          | 0.2066 | 0.5485 |
| 1.6764        | 9.7136 | 7800 | 0.7276          | 0.2067 | 0.5488 |
| 1.6877        | 9.9626 | 8000 | 0.7277          | 0.2065 | 0.5486 |
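Evaluation WER is still inching down through epoch 10, and the final checkpoint is not quite the best one: step 7200 has the lowest eval WER in the log. Scanning such a log for the best checkpoint is a one-liner; a sketch over the last few rows of the table above (the tuples are copied from the table, and which metric to minimize is a judgment call):

```python
# (step, val_loss, cer, wer) for the final evaluation rows above
rows = [
    (7000, 0.7280, 0.2064, 0.5485),
    (7200, 0.7271, 0.2067, 0.5466),
    (7400, 0.7276, 0.2068, 0.5477),
    (7600, 0.7271, 0.2066, 0.5485),
    (7800, 0.7276, 0.2067, 0.5488),
    (8000, 0.7277, 0.2065, 0.5486),
]

# Pick the checkpoint with the lowest WER in this window.
best_by_wer = min(rows, key=lambda r: r[3])
print(best_by_wer)  # (7200, 0.7271, 0.2067, 0.5466)
```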

Framework versions

  • Transformers 4.57.2
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.0

Model information

  • Model ID: ctaguchi/ssc-sco-mms-model
  • Model size: 1.0B params
  • Tensor type: F32 (Safetensors)
  • Fine-tuned from: facebook/mms-1b-all