deepdml/whisper-tiny-ar-mix-norm

@@ -1,4 +1,5 @@
 ---
 language:
 - ar
 license: apache-2.0
@@ -6,11 +7,11 @@ base_model: openai/whisper-tiny
 tags:
 - generated_from_trainer
 datasets:
 - ymoslem/MediaSpeech
 - deepdml/Tunisian_MSA
 - UBC-NLP/Casablanca
 - fixie-ai/common_voice_17_0
-- google/fleurs
 metrics:
 - wer
 model-index:
@@ -21,12 +22,13 @@ model-index:
       type: automatic-speech-recognition
     dataset:
       name: Common Voice 17.0
-      type: ymoslem/MediaSpeech
     metrics:
     - name: Wer
       type: wer
-      value: 60.61585354657461
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
@@ -34,9 +36,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the Common Voice 17.0 dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.7162
-- Wer: 60.6159
-- Cer: 21.8903
 ## Model description
@@ -59,38 +61,38 @@ The following hyperparameters were used during training:
 - train_batch_size: 64
 - eval_batch_size: 64
 - seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.04
-- training_steps: 5000
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Wer     | Cer     |
-|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
-| 0.8928        | 0.2   | 1000 | 0.7965          | 66.8809 | 25.5809 |
-| 0.6731        | 0.4   | 2000 | 0.7496          | 63.1479 | 23.1687 |
-| 0.5235        | 0.6   | 3000 | 0.7214          | 61.6845 | 22.3557 |
-| 0.4641        | 0.8   | 4000 | 0.7161          | 60.8490 | 21.9854 |
-| 0.4296        | 1.0   | 5000 | 0.7162          | 60.6159 | 21.8903 |
 ### Framework versions
-- Transformers 4.42.0.dev0
-- Pytorch 2.3.0+cu121
-- Datasets 2.19.1
-- Tokenizers 0.19.1
-## Citation
-Please cite the model using the following BibTeX entry:
-```bibtex
-@misc{deepdml/whisper-tiny-ar-mix-norm,
-      title={Fine-tuned Whisper tiny ASR model for speech recognition in Arabic},
-      author={Jimenez, David},
-      howpublished={\url{https://huggingface.co/deepdml/whisper-tiny-ar-mix-norm}},
-      year={2026}
-    }
-```

 ---
+library_name: transformers
 language:
 - ar
 license: apache-2.0
 tags:
 - generated_from_trainer
 datasets:
+- google/fleurs
 - ymoslem/MediaSpeech
 - deepdml/Tunisian_MSA
 - UBC-NLP/Casablanca
 - fixie-ai/common_voice_17_0
 metrics:
 - wer
 model-index:
       type: automatic-speech-recognition
     dataset:
       name: Common Voice 17.0
+      type: google/fleurs
     metrics:
     - name: Wer
       type: wer
+      value: 52.17678705862912
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the Common Voice 17.0 dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.6321
+- Wer: 52.1768
+- Cer: 18.3597
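
As a reading aid for the Wer and Cer figures above, here is a minimal sketch of how word and character error rates are commonly computed: the edit distance between reference and hypothesis, divided by the reference length. It assumes the card reports them as percentages. The helper names `edit_distance` and `error_rate` are hypothetical and are not taken from the training script, which may apply text normalization that this sketch omits.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate in percent; unit is "word" (WER) or "char" (CER)."""
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        # Split on whitespace for WER; keep every character (spaces included) for CER.
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        errors += edit_distance(ref_units, hyp_units)
        total += len(ref_units)
    return 100.0 * errors / total
```

Under this definition a WER above 50 means that, on average, more than one edit is needed for every two reference words.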
 ## Model description
 - train_batch_size: 64
 - eval_batch_size: 64
 - seed: 42
+- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.04
+- training_steps: 18000
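
To make the scheduler settings above concrete, the following sketch computes the learning-rate multiplier for a linear schedule with warmup, using `lr_scheduler_warmup_ratio: 0.04` and `training_steps: 18000` from the list. The peak learning rate does not appear in this diff, so the function returns only the multiplier applied to it. Rounding the warmup length up is an assumption about how the ratio becomes a step count, and `lr_multiplier` is a hypothetical name.

```python
import math

TRAINING_STEPS = 18000   # from the hyperparameter list
WARMUP_RATIO = 0.04      # from the hyperparameter list

# Assumption: the warmup length is the ratio times the total steps, rounded up.
WARMUP_STEPS = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def lr_multiplier(step, warmup_steps=WARMUP_STEPS, total_steps=TRAINING_STEPS):
    """Factor applied to the peak learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak during warmup.
        return step / max(1, warmup_steps)
    # Linear decay from the peak down to 0 at the final step.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

With these values the warmup covers the first 720 steps, so the first logged evaluation at step 1000 already falls in the decay phase.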
 ### Training results
+| Training Loss | Epoch  | Step  | Validation Loss | Wer     | Cer     |
+|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
+| 0.9731        | 0.0556 | 1000  | 0.8246          | 68.5849 | 26.4636 |
+| 0.6833        | 0.1111 | 2000  | 0.7503          | 63.3554 | 23.5338 |
+| 0.4756        | 0.1667 | 3000  | 0.7112          | 60.5773 | 21.8069 |
+| 0.3473        | 0.2222 | 4000  | 0.7019          | 59.5509 | 21.6413 |
+| 0.2547        | 0.2778 | 5000  | 0.6910          | 59.1212 | 21.5653 |
+| 0.1777        | 0.3333 | 6000  | 0.6924          | 57.6816 | 20.6340 |
+| 0.1280        | 1.0197 | 7000  | 0.6828          | 57.0996 | 20.5314 |
+| 0.1100        | 1.0752 | 8000  | 0.6706          | 56.0768 | 20.2707 |
+| 0.0869        | 1.1308 | 9000  | 0.6622          | 55.4654 | 20.0036 |
+| 0.0714        | 1.1863 | 10000 | 0.6506          | 54.8448 | 19.6163 |
+| 0.0594        | 1.2419 | 11000 | 0.6427          | 54.9714 | 19.4470 |
+| 0.0541        | 1.2974 | 12000 | 0.6365          | 53.4089 | 19.0258 |
+| 0.0484        | 1.3530 | 13000 | 0.6371          | 53.7100 | 19.1604 |
+| 0.0445        | 2.0393 | 14000 | 0.6359          | 53.8697 | 19.4603 |
+| 0.0420        | 2.0949 | 15000 | 0.6348          | 52.5403 | 18.6839 |
+| 0.0346        | 2.1504 | 16000 | 0.6317          | 52.7809 | 18.6861 |
+| 0.0339        | 2.2060 | 17000 | 0.6436          | 52.5128 | 18.5581 |
+| 0.0404        | 2.2616 | 18000 | 0.6321          | 52.1768 | 18.3597 |
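
The table above can be checked programmatically. This sketch copies the step, validation loss, Wer and Cer columns verbatim and looks up the best checkpoint for each metric. The names `ROWS` and `best_step` are hypothetical and serve only as a reading aid.

```python
# (step, validation_loss, wer, cer), copied from the training results table.
ROWS = [
    (1000, 0.8246, 68.5849, 26.4636),
    (2000, 0.7503, 63.3554, 23.5338),
    (3000, 0.7112, 60.5773, 21.8069),
    (4000, 0.7019, 59.5509, 21.6413),
    (5000, 0.6910, 59.1212, 21.5653),
    (6000, 0.6924, 57.6816, 20.6340),
    (7000, 0.6828, 57.0996, 20.5314),
    (8000, 0.6706, 56.0768, 20.2707),
    (9000, 0.6622, 55.4654, 20.0036),
    (10000, 0.6506, 54.8448, 19.6163),
    (11000, 0.6427, 54.9714, 19.4470),
    (12000, 0.6365, 53.4089, 19.0258),
    (13000, 0.6371, 53.7100, 19.1604),
    (14000, 0.6359, 53.8697, 19.4603),
    (15000, 0.6348, 52.5403, 18.6839),
    (16000, 0.6317, 52.7809, 18.6861),
    (17000, 0.6436, 52.5128, 18.5581),
    (18000, 0.6321, 52.1768, 18.3597),
]

COLUMNS = {"validation_loss": 1, "wer": 2, "cer": 3}


def best_step(metric):
    """Return the step whose value for the given metric is lowest."""
    index = COLUMNS[metric]
    return min(ROWS, key=lambda row: row[index])[0]
```

The lowest validation loss is at step 16000, while the lowest Wer and Cer are both at the final step, 18000, which is the checkpoint the headline results describe.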
 ### Framework versions
+- Transformers 4.48.0.dev0
+- Pytorch 2.5.1+cu121
+- Datasets 3.6.0
+- Tokenizers 0.21.0