120min_mms-1b_FT

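This model is a fine-tuned version of facebook/mms-1b-all; the fine-tuning dataset is not specified. On the evaluation set it achieves the following results, which match the last logged checkpoint (step 2300, epoch 4.39) in the training results: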

  • Loss: 1.0318
  • Wer: 0.7998
  • Cer: 0.2752
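
For readers unfamiliar with the two metrics, the sketch below shows how word error rate (Wer) and character error rate (Cer) are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length, counted in words or in characters. This card does not say which implementation or text normalization produced the numbers above, so the helper names and the plain whitespace tokenization are assumptions for illustration only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference item
                curr[j - 1] + 1,     # insert a hypothesis item
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by the reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by the reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Made-up sentences, not taken from the evaluation set.
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER = {wer(ref, hyp):.4f}, CER = {cer(ref, hyp):.4f}")
```

Under this definition, a Wer of 0.7998 means the number of word-level edits is about 80% of the number of reference words, while the much lower Cer of 0.2752 suggests many of the wrong words are still close to the reference at the character level.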

Model description

More information needed

Intended uses & limitations

More information needed
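
Until usage is documented here, the following is a minimal inference sketch using the Transformers automatic-speech-recognition pipeline. It rests on several assumptions: the repository id khier12/120min_mms-1b_FT is taken from the hosting page for this card, the checkpoint is assumed to load directly through the pipeline, and the target language and expected audio format are not stated anywhere in this card. The function name `transcribe` is hypothetical.

```python
def transcribe(audio_path, model_id="khier12/120min_mms-1b_FT"):
    """Return the transcription of one audio file.

    Assumption: the checkpoint loads through the standard ASR pipeline.
    The card does not confirm this, nor the language the model targets.
    """
    # Imported lazily so that defining the function does not require
    # transformers (or a model download) until it is actually called.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # For a file path, the pipeline decodes and resamples the audio itself
    # (this requires ffmpeg to be installed).
    return asr(audio_path)["text"]


if __name__ == "__main__":
    # "sample.wav" is a placeholder path, not a file shipped with the model.
    print(transcribe("sample.wav"))
```

Given the reported Wer of 0.7998, transcripts from this checkpoint should be treated as rough drafts rather than usable output.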

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 2
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 15
  • mixed_precision_training: Native AMP
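
Two of these values are derived rather than set independently, and the sketch below makes the arithmetic explicit: total_train_batch_size is train_batch_size times gradient_accumulation_steps (times the device count, assumed here to be 1), and a linear scheduler with warmup ramps the learning rate from 0 to its peak over the warmup steps and then decays it linearly to 0. The total number of optimizer steps is not given in this card; the value 7860 used below is an estimate, assuming all 15 epochs run at roughly 524 steps per epoch as the Epoch and Step columns of the training log suggest. Function names are illustrative, not part of any library.

```python
LEARNING_RATE = 1e-3      # learning_rate
TRAIN_BATCH_SIZE = 1      # train_batch_size
GRAD_ACCUM_STEPS = 2      # gradient_accumulation_steps
WARMUP_STEPS = 100        # lr_scheduler_warmup_steps
ASSUMED_TOTAL_STEPS = 7860  # assumption: 15 epochs x ~524 steps, not stated in the card


def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Number of samples contributing to each optimizer step."""
    return per_device_batch * accumulation_steps * num_devices


def linear_lr(step, total_steps, peak_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step for linear warmup, then linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from the peak to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 50, 100, 2300, ASSUMED_TOTAL_STEPS):
        print(step, linear_lr(step, ASSUMED_TOTAL_STEPS))
```

With an effective batch of only 2 samples and a peak learning rate of 0.001, each update is based on very little data, which may contribute to the noisy training loss in the results below.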

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
|---|---|---|---|---|---|
| 2.2829 | 0.1908 | 100 | 1.4130 | 0.9190 | 0.3455 |
| 1.522 | 0.3817 | 200 | 1.2386 | 0.8383 | 0.2985 |
| 1.6084 | 0.5725 | 300 | 1.1569 | 0.8213 | 0.2939 |
| 1.5849 | 0.7634 | 400 | 1.1120 | 0.8298 | 0.2895 |
| 1.4867 | 0.9542 | 500 | 1.1229 | 0.8103 | 0.2879 |
| 1.5095 | 1.1450 | 600 | 1.1402 | 0.8075 | 0.2848 |
| 1.4628 | 1.3359 | 700 | 1.0958 | 0.8113 | 0.2870 |
| 1.5158 | 1.5267 | 800 | 1.0816 | 0.8187 | 0.2846 |
| 1.5189 | 1.7176 | 900 | 1.1021 | 0.8041 | 0.2808 |
| 1.5805 | 1.9084 | 1000 | 1.0749 | 0.8083 | 0.2852 |
| 1.5497 | 2.0992 | 1100 | 1.0989 | 0.8109 | 0.2856 |
| 1.4625 | 2.2901 | 1200 | 1.0915 | 0.8100 | 0.2835 |
| 1.461 | 2.4809 | 1300 | 1.0944 | 0.8034 | 0.2794 |
| 1.464 | 2.6718 | 1400 | 1.0930 | 0.8221 | 0.2811 |
| 1.5009 | 2.8626 | 1500 | 1.0743 | 0.8295 | 0.2921 |
| 1.3417 | 3.0534 | 1600 | 1.0585 | 0.7992 | 0.2794 |
| 1.454 | 3.2443 | 1700 | 1.0635 | 0.8135 | 0.2855 |
| 1.5585 | 3.4351 | 1800 | 1.0317 | 0.8189 | 0.2846 |
| 1.3946 | 3.6260 | 1900 | 1.0401 | 0.7963 | 0.2796 |
| 1.4687 | 3.8168 | 2000 | 1.0432 | 0.8097 | 0.2832 |
| 1.5297 | 4.0076 | 2100 | 1.0506 | 0.8030 | 0.2824 |
| 1.4802 | 4.1985 | 2200 | 1.0595 | 0.8096 | 0.2825 |
| 1.3328 | 4.3893 | 2300 | 1.0318 | 0.7998 | 0.2752 |
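
The log above can be read programmatically to see which checkpoint is best on each metric and how many optimizer steps make up one epoch. The sketch below copies the rows from the training results and uses only the standard library; the column names are chosen here for illustration. Note that the log ends at epoch 4.39 even though num_epochs is 15; the card does not say why.

```python
# Columns: training loss, epoch, step, validation loss, Wer, Cer.
LOG = """\
2.2829 0.1908 100 1.4130 0.9190 0.3455
1.522 0.3817 200 1.2386 0.8383 0.2985
1.6084 0.5725 300 1.1569 0.8213 0.2939
1.5849 0.7634 400 1.1120 0.8298 0.2895
1.4867 0.9542 500 1.1229 0.8103 0.2879
1.5095 1.1450 600 1.1402 0.8075 0.2848
1.4628 1.3359 700 1.0958 0.8113 0.2870
1.5158 1.5267 800 1.0816 0.8187 0.2846
1.5189 1.7176 900 1.1021 0.8041 0.2808
1.5805 1.9084 1000 1.0749 0.8083 0.2852
1.5497 2.0992 1100 1.0989 0.8109 0.2856
1.4625 2.2901 1200 1.0915 0.8100 0.2835
1.461 2.4809 1300 1.0944 0.8034 0.2794
1.464 2.6718 1400 1.0930 0.8221 0.2811
1.5009 2.8626 1500 1.0743 0.8295 0.2921
1.3417 3.0534 1600 1.0585 0.7992 0.2794
1.454 3.2443 1700 1.0635 0.8135 0.2855
1.5585 3.4351 1800 1.0317 0.8189 0.2846
1.3946 3.6260 1900 1.0401 0.7963 0.2796
1.4687 3.8168 2000 1.0432 0.8097 0.2832
1.5297 4.0076 2100 1.0506 0.8030 0.2824
1.4802 4.1985 2200 1.0595 0.8096 0.2825
1.3328 4.3893 2300 1.0318 0.7998 0.2752
"""
COLUMNS = ("train_loss", "epoch", "step", "val_loss", "wer", "cer")

rows = []
for line in LOG.strip().splitlines():
    row = dict(zip(COLUMNS, (float(v) for v in line.split())))
    row["step"] = int(row["step"])
    rows.append(row)

# Best logged checkpoint per metric (lower is better for all three).
best_wer = min(rows, key=lambda r: r["wer"])
best_cer = min(rows, key=lambda r: r["cer"])
best_loss = min(rows, key=lambda r: r["val_loss"])

# Optimizer steps per epoch, estimated from the last logged row.
steps_per_epoch = rows[-1]["step"] / rows[-1]["epoch"]

if __name__ == "__main__":
    print("lowest Wer:", best_wer["wer"], "at step", best_wer["step"])
    print("lowest Cer:", best_cer["cer"], "at step", best_cer["step"])
    print("lowest validation loss:", best_loss["val_loss"], "at step", best_loss["step"])
    print("steps per epoch (estimate):", round(steps_per_epoch))
```

The three metrics do not agree on a single best step, and after the first few hundred steps Wer fluctuates between roughly 0.80 and 0.83 without a clear downward trend, so the final checkpoint is not obviously better than several earlier ones.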

Framework versions

  • Transformers 4.41.1
  • Pytorch 2.9.0+cu126
  • Datasets 2.21.0
  • Tokenizers 0.19.1