Whisper Small ar

This model is a fine-tuned version of openai/whisper-small on the Common Voice 17.0 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3340
  • WER: 27.1515
  • CER: 7.9929
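WER and CER are word- and character-level error rates: the Levenshtein edit distance between the reference and the hypothesis, normalized by the reference length, reported here as percentages. A minimal sketch of the standard computation (illustrative only; the evaluation presumably used a library implementation such as `evaluate`/`jiwer`):

```python
def error_rate(ref, hyp):
    """Levenshtein edit distance between token sequences,
    normalized by reference length -- the standard WER/CER formula."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[-1][-1] / len(ref)

def wer(reference, hypothesis):
    # Word error rate, as a percentage over whitespace-split tokens.
    return 100 * error_rate(reference.split(), hypothesis.split())

def cer(reference, hypothesis):
    # Character error rate, as a percentage over individual characters.
    return 100 * error_rate(list(reference), list(hypothesis))
```

For example, `wer("the cat sat", "the cat sit")` is one substitution over three reference words, i.e. 33.33.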

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.04
  • training_steps: 18000
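A warmup ratio of 0.04 over 18000 training steps corresponds to 720 warmup steps. A minimal sketch of the resulting linear warmup-then-decay schedule (illustrative; not the Trainer's internal scheduler code):

```python
def lr_at_step(step, base_lr=1e-05, warmup_ratio=0.04, total_steps=18000):
    """Linear warmup to base_lr, then linear decay to zero,
    matching the hyperparameters listed above."""
    warmup_steps = int(warmup_ratio * total_steps)  # 0.04 * 18000 = 720
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr at step 720 down to 0 at step 18000.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

So the learning rate peaks at 1e-05 at step 720 and reaches zero at step 18000.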

Training results

| Training Loss | Epoch  | Step  | Validation Loss | WER     | CER     |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 0.5511        | 0.0556 | 1000  | 0.4263          | 38.7837 | 12.1682 |
| 0.2273        | 0.1111 | 2000  | 0.3858          | 34.5513 | 10.7969 |
| 0.1023        | 0.1667 | 3000  | 0.3663          | 33.5690 | 10.3863 |
| 0.0545        | 0.2222 | 4000  | 0.3567          | 31.5786 | 9.2661  |
| 0.043         | 0.2778 | 5000  | 0.3421          | 31.7236 | 9.3731  |
| 0.0254        | 0.3333 | 6000  | 0.3316          | 30.0600 | 9.0426  |
| 0.0219        | 0.3889 | 7000  | 0.3269          | 29.6451 | 8.7922  |
| 0.0177        | 0.4444 | 8000  | 0.3258          | 29.2705 | 8.7774  |
| 0.0209        | 0.5    | 9000  | 0.3157          | 28.5177 | 8.5056  |
| 0.0212        | 0.5556 | 10000 | 0.3105          | 28.9345 | 8.4034  |
| 0.0093        | 0.6111 | 11000 | 0.3111          | 27.8052 | 8.1165  |
| 0.012         | 0.6667 | 12000 | 0.3158          | 27.7042 | 8.2345  |
| 0.0124        | 0.7222 | 13000 | 0.3119          | 27.0304 | 7.9191  |
| 0.005         | 1.0393 | 14000 | 0.3392          | 27.5739 | 8.0862  |
| 0.008         | 1.0949 | 15000 | 0.3334          | 27.3590 | 7.9829  |
| 0.0049        | 1.1504 | 16000 | 0.3451          | 27.2911 | 8.0076  |
| 0.0032        | 1.206  | 17000 | 0.3468          | 27.0873 | 7.9693  |
| 0.0031        | 1.2616 | 18000 | 0.3340          | 27.1515 | 7.9929  |

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu121
  • Datasets 3.6.0
  • Tokenizers 0.21.0
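With those framework versions installed, the checkpoint can be loaded for Arabic transcription through the Transformers `pipeline` API. A hedged sketch (the `language`/`task` generation arguments are assumptions about sensible defaults, and the checkpoint is downloaded on first use):

```python
from transformers import pipeline

MODEL_ID = "deepdml/whisper-small-ar-mix-norm"

def transcribe(audio_path: str) -> str:
    """Transcribe an audio file with the fine-tuned checkpoint,
    forcing Arabic transcription rather than language detection."""
    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        generate_kwargs={"language": "arabic", "task": "transcribe"},
    )
    return asr(audio_path)["text"]
```

Usage would then be `transcribe("sample.wav")`, returning the decoded Arabic text.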

Citation

Please cite the model using the following BibTeX entry:

@misc{deepdml/whisper-small-ar-mix-norm,
  title={Fine-tuned Whisper small ASR model for speech recognition in Arabic},
  author={Jimenez, David},
  howpublished={\url{https://huggingface.co/deepdml/whisper-small-ar-mix-norm}},
  year={2026}
}