Whisper Medium af

This model is a fine-tuned version of openai/whisper-medium for Afrikaans automatic speech recognition, trained on multiple datasets. It achieves the following results on the evaluation set:

  • Loss: 0.6508
  • WER: 21.7911
  • CER: 7.5309
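
WER (word error rate) and CER (character error rate) are edit-distance-based metrics, reported here as percentages. As a reference for how they are defined (this is a minimal self-contained sketch, not the evaluation pipeline actually used for this card, which is unspecified):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences via dynamic programming."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            cur = d[j]
            d[j] = min(d[j] + 1,          # deletion
                       d[j - 1] + 1,      # insertion
                       prev + (r != h))   # substitution (free if tokens match)
            prev = cur
    return d[-1]

def wer(ref, hyp):
    """Word error rate in percent: word-level edit distance / reference length."""
    return 100 * edit_distance(ref.split(), hyp.split()) / len(ref.split())

def cer(ref, hyp):
    """Character error rate in percent: character-level edit distance / reference length."""
    return 100 * edit_distance(list(ref), list(hyp)) / len(ref)

# One substituted word out of three -> WER of about 33.33%.
print(wer("die kat sit", "die hond sit"))
```

Evaluation frameworks differ in text normalization (casing, punctuation), which can shift WER/CER noticeably; the "norm" in this model's name suggests normalized transcripts, but the exact normalization is not documented here.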

Model description

More information needed

Intended uses & limitations

More information needed
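
The card does not document usage. A minimal inference sketch using the standard transformers ASR pipeline, which works for Whisper checkpoints (the audio file path is a placeholder; the first call downloads the weights):

```python
from transformers import pipeline

# Load the checkpoint as an automatic-speech-recognition pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model="deepdml/whisper-medium-af-mix-norm",
)

# "sample.wav" is a placeholder for a 16 kHz Afrikaans audio file.
# chunk_length_s enables chunked long-form transcription.
result = asr("sample.wav", chunk_length_s=30)
print(result["text"])
```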

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.04
  • training_steps: 4100

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER     | CER     |
|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|
| 0.6031        | 0.0244 | 100  | 0.7367          | 30.9198 | 11.1248 |
| 0.3567        | 0.0488 | 200  | 0.6171          | 28.3908 | 11.3388 |
| 0.3101        | 0.0732 | 300  | 0.5763          | 24.5626 | 8.7914  |
| 0.1999        | 0.0976 | 400  | 0.5673          | 24.2855 | 8.3194  |
| 0.1453        | 0.1220 | 500  | 0.5739          | 23.0210 | 8.4191  |
| 0.1184        | 0.1463 | 600  | 0.5742          | 24.6666 | 8.2960  |
| 0.116         | 0.1707 | 700  | 0.5686          | 25.7059 | 9.6532  |
| 0.0941        | 0.1951 | 800  | 0.5905          | 23.8524 | 7.9618  |
| 0.1025        | 0.2195 | 900  | 0.6025          | 24.8398 | 9.4216  |
| 0.0889        | 0.2439 | 1000 | 0.5666          | 23.0383 | 7.9970  |
| 0.0492        | 0.2683 | 1100 | 0.5936          | 22.9863 | 8.0761  |
| 0.0554        | 0.2927 | 1200 | 0.6092          | 22.8477 | 8.0732  |
| 0.0448        | 0.3171 | 1300 | 0.6118          | 26.4854 | 10.9694 |
| 0.0358        | 0.3415 | 1400 | 0.6163          | 25.9484 | 10.0314 |
| 0.0454        | 0.3659 | 1500 | 0.6139          | 23.0729 | 7.8768  |
| 0.0286        | 0.3902 | 1600 | 0.6177          | 22.5533 | 7.9354  |
| 0.0395        | 0.4146 | 1700 | 0.6100          | 25.7752 | 10.1134 |
| 0.058         | 0.4390 | 1800 | 0.6200          | 23.1942 | 9.1607  |
| 0.0269        | 0.4634 | 1900 | 0.6309          | 23.1076 | 8.2168  |
| 0.0229        | 0.4878 | 2000 | 0.6334          | 23.3674 | 8.7474  |
| 0.0394        | 0.5122 | 2100 | 0.6262          | 22.8131 | 7.7976  |
| 0.0186        | 0.5366 | 2200 | 0.6295          | 25.0476 | 9.7060  |
| 0.0267        | 0.5610 | 2300 | 0.6427          | 22.9517 | 8.4718  |
| 0.0161        | 0.5854 | 2400 | 0.6334          | 23.0902 | 8.3986  |
| 0.023         | 0.6098 | 2500 | 0.6368          | 21.8604 | 7.6071  |
| 0.028         | 0.6341 | 2600 | 0.6319          | 22.0336 | 7.6540  |
| 0.0237        | 0.6585 | 2700 | 0.6296          | 22.9517 | 7.9002  |
| 0.0152        | 0.6829 | 2800 | 0.6601          | 22.3627 | 7.6129  |
| 0.0143        | 0.7073 | 2900 | 0.6544          | 21.8431 | 7.4605  |
| 0.0151        | 0.7317 | 3000 | 0.6541          | 22.5013 | 7.5191  |
| 0.0157        | 0.7561 | 3100 | 0.6583          | 22.6745 | 7.6862  |
| 0.0096        | 0.7805 | 3200 | 0.6594          | 22.4840 | 7.6569  |
| 0.0124        | 0.8049 | 3300 | 0.6476          | 21.8431 | 7.6716  |
| 0.02          | 0.8293 | 3400 | 0.6406          | 21.9816 | 7.7390  |
| 0.0204        | 0.8537 | 3500 | 0.6399          | 21.4966 | 7.6041  |
| 0.018         | 0.8780 | 3600 | 0.6446          | 21.8777 | 7.6979  |
| 0.0125        | 0.9024 | 3700 | 0.6585          | 21.8777 | 7.4927  |
| 0.0095        | 0.9268 | 3800 | 0.6601          | 21.9124 | 7.4752  |
| 0.0076        | 0.9512 | 3900 | 0.6571          | 22.0509 | 7.5279  |
| 0.0176        | 0.9756 | 4000 | 0.6521          | 21.7565 | 7.5191  |
| 0.0086        | 1.0    | 4100 | 0.6508          | 21.7911 | 7.5309  |

Framework versions

  • Transformers 4.42.0.dev0
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Citation

Please cite the model using the following BibTeX entry:

@misc{deepdml/whisper-medium-af-mix-norm,
  title={Fine-tuned Whisper medium ASR model for speech recognition in Afrikaans},
  author={Jimenez, David},
  howpublished={\url{https://huggingface.co/deepdml/whisper-medium-af-mix-norm}},
  year={2026}
}