mms_e5_arm_a_real

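This model is a fine-tuned version of facebook/mms-1b-all on an unspecified dataset (the card records the dataset as None). It achieves the following results on the evaluation set, taken from the final logged step (4000) in the results table below:

Loss: 0.6606
Wer: 0.3136
Cer: 0.1732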

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 200
training_steps: 4000
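
The hyperparameters above imply two pieces of arithmetic that are easy to check by hand: the effective batch size and the learning rate at a given step. The following is a minimal sketch in plain Python, not the training code itself. It assumes a single training device (consistent with 8 x 4 = 32) and the usual linear-warmup, linear-decay shape for the `linear` scheduler. The function names `effective_batch_size` and `linear_lr` are hypothetical helpers introduced for illustration.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return per_device * grad_accum * num_devices


def linear_lr(step: int, base_lr: float = 1e-3,
              warmup_steps: int = 200, total_steps: int = 4000) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero.

    Defaults mirror the card: learning_rate, lr_scheduler_warmup_steps, training_steps.
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(8, 4))
    for s in (0, 100, 200, 2100, 4000):
        print(f"step {s:>4}: lr = {linear_lr(s):.6f}")
```

Under these assumptions the learning rate peaks at 0.001 at step 200 and reaches zero at step 4000, so the later rows of the results table were trained with progressively smaller updates.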

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
0.6895	5.0038	250	1.8122	0.5119	0.3289
6.7394	10.0075	500	1.3479	0.4807	0.2864
2.2666	16.0003	750	1.2666	0.4223	0.2601
4.6361	21.0045	1000	1.0766	0.4128	0.2380
2.7945	26.0085	1250	1.0817	0.4087	0.2394
3.8126	32.0012	1500	1.0507	0.4087	0.2365
2.3847	37.0052	1750	0.9586	0.3958	0.2287
2.0932	42.0093	2000	0.9065	0.3768	0.2184
2.3490	48.002	2250	0.8514	0.3659	0.2069
1.3556	53.0063	2500	0.7799	0.3469	0.1977
4.0664	58.0108	2750	0.7398	0.3517	0.1928
0.9701	64.0032	3000	0.7640	0.3347	0.1891
3.1716	69.0075	3250	0.6768	0.3313	0.1801
2.3314	74.0113	3500	0.7003	0.3252	0.1812
2.0345	80.0037	3750	0.6640	0.3191	0.1719
4.1949	85.0075	4000	0.6606	0.3136	0.1732
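
For readers unfamiliar with the Wer and Cer columns: word error rate and character error rate are conventionally defined as the edit distance between the reference transcript and the model output, divided by the reference length in words or characters. The sketch below illustrates that standard definition in plain Python. The card does not say which evaluation code or text normalization produced the numbers above, so this is an illustration of the metric only, and the helper names `edit_distance`, `wer`, and `cer` are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on mat"
    print("WER:", wer(ref, hyp))
    print("CER:", cer(ref, hyp))
```

Read this way, the final Wer of 0.3136 means roughly 31 word-level errors per 100 reference words, and the final Cer of 0.1732 roughly 17 character-level errors per 100 reference characters.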

Model size: 1.0B params (Safetensors, tensor type F32)