mozilla-ai/whisper-small-fr

@@ -1,118 +1,35 @@
 ---
-library_name: transformers
-license: apache-2.0
 base_model: openai/whisper-small
-tags:
-- generated_from_trainer
 datasets:
-- common_voice_17_0
-metrics:
-- wer
 model-index:
-- name: whisper-small-fr
   results:
   - task:
-      name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
-      name: common_voice_17_0
-      type: common_voice_17_0
-      config: fr
-      split: test
-      args: fr
     metrics:
-    - name: Wer
-      type: wer
-      value: 23.51069802258125
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# whisper-small-fr
-This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the common_voice_17_0 dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.4490
-- Model Preparation Time: 0.0042
-- Wer: 23.5107
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 258
-- eval_batch_size: 64
-- seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 50
-- training_steps: 2000
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch   | Step | Validation Loss | Model Preparation Time | Wer     |
-|:-------------:|:-------:|:----:|:---------------:|:----------------------:|:-------:|
-| 0.6515        | 0.6410  | 50   | 0.4803          | 0.0042                 | 25.5359 |
-| 0.3297        | 1.2821  | 100  | 0.4457          | 0.0042                 | 23.9349 |
-| 0.2733        | 1.9231  | 150  | 0.4333          | 0.0042                 | 23.5294 |
-| 0.2003        | 2.5641  | 200  | 0.4395          | 0.0042                 | 23.5419 |
-| 0.1763        | 3.2051  | 250  | 0.4479          | 0.0042                 | 23.7748 |
-| 0.1354        | 3.8462  | 300  | 0.4490          | 0.0042                 | 23.5107 |
-| 0.105         | 4.4872  | 350  | 0.4650          | 0.0042                 | 23.8871 |
-| 0.089         | 5.1282  | 400  | 0.4833          | 0.0042                 | 24.1844 |
-| 0.0629        | 5.7692  | 450  | 0.4929          | 0.0042                 | 24.3882 |
-| 0.0499        | 6.4103  | 500  | 0.5136          | 0.0042                 | 24.6647 |
-| 0.0406        | 7.0513  | 550  | 0.5226          | 0.0042                 | 24.5878 |
-| 0.0276        | 7.6923  | 600  | 0.5368          | 0.0042                 | 25.3197 |
-| 0.0238        | 8.3333  | 650  | 0.5504          | 0.0042                 | 24.6003 |
-| 0.0193        | 8.9744  | 700  | 0.5593          | 0.0042                 | 25.0140 |
-| 0.0146        | 9.6154  | 750  | 0.5718          | 0.0042                 | 25.0203 |
-| 0.0133        | 10.2564 | 800  | 0.5816          | 0.0042                 | 25.0556 |
-| 0.0115        | 10.8974 | 850  | 0.5867          | 0.0042                 | 24.9849 |
-| 0.0099        | 11.5385 | 900  | 0.5946          | 0.0042                 | 25.0120 |
-| 0.0091        | 12.1795 | 950  | 0.6006          | 0.0042                 | 24.9787 |
-| 0.0081        | 12.8205 | 1000 | 0.6056          | 0.0042                 | 25.1471 |
-| 0.0075        | 13.4615 | 1050 | 0.6114          | 0.0042                 | 25.0972 |
-| 0.0072        | 14.1026 | 1100 | 0.6166          | 0.0042                 | 25.0993 |
-| 0.0065        | 14.7436 | 1150 | 0.6198          | 0.0042                 | 25.1430 |
-| 0.0062        | 15.3846 | 1200 | 0.6249          | 0.0042                 | 25.2968 |
-| 0.006         | 16.0256 | 1250 | 0.6270          | 0.0042                 | 25.1637 |
-| 0.0055        | 16.6667 | 1300 | 0.6311          | 0.0042                 | 25.1741 |
-| 0.0053        | 17.3077 | 1350 | 0.6344          | 0.0042                 | 25.2428 |
-| 0.0051        | 17.9487 | 1400 | 0.6371          | 0.0042                 | 25.2677 |
-| 0.0048        | 18.5897 | 1450 | 0.6397          | 0.0042                 | 25.3072 |
-| 0.0048        | 19.2308 | 1500 | 0.6418          | 0.0042                 | 25.2532 |
-| 0.0046        | 19.8718 | 1550 | 0.6443          | 0.0042                 | 25.3093 |
-| 0.0044        | 20.5128 | 1600 | 0.6460          | 0.0042                 | 25.2344 |
-| 0.0043        | 21.1538 | 1650 | 0.6479          | 0.0042                 | 25.2739 |
-| 0.0042        | 21.7949 | 1700 | 0.6493          | 0.0042                 | 25.2802 |
-| 0.0042        | 22.4359 | 1750 | 0.6506          | 0.0042                 | 25.3155 |
-| 0.0041        | 23.0769 | 1800 | 0.6519          | 0.0042                 | 25.2864 |
-| 0.004         | 23.7179 | 1850 | 0.6528          | 0.0042                 | 25.2719 |
-| 0.004         | 24.3590 | 1900 | 0.6531          | 0.0042                 | 25.2677 |
-| 0.0039        | 25.0    | 1950 | 0.6538          | 0.0042                 | 25.2781 |
-| 0.0039        | 25.6410 | 2000 | 0.6540          | 0.0042                 | 25.2802 |
-### Framework versions
-- Transformers 4.49.0
-- Pytorch 2.6.0+cu124
-- Datasets 3.3.2
-- Tokenizers 0.21.0

 ---
 base_model: openai/whisper-small
 datasets:
+- mozilla-foundation/common_voice_17_0
+language: fr
+library_name: transformers
+license: apache-2.0
 model-index:
+- name: Finetuned openai/whisper-small on French
   results:
   - task:
       type: automatic-speech-recognition
+      name: Speech-to-Text
     dataset:
+      name: Common Voice (French)
+      type: common_voice
     metrics:
+    - type: wer
+      value: 23.511
 ---
+# openai/whisper-small fine-tuned on 20,000 French training audio samples from mozilla-foundation/common_voice_17_0
+This model was created from the Mozilla.ai Blueprint:
+[speech-to-text-finetune](https://github.com/mozilla-ai/speech-to-text-finetune).
+## Evaluation results on 5,000 French audio samples
+### Baseline model (before finetuning) on French
+- Word Error Rate: 30.304
+- Loss: 1.155
+### Finetuned model (after finetuning) on French
+- Word Error Rate: 23.511
+- Loss: 0.449
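
Note on the reported metric: the new card gives Word Error Rate before and after fine-tuning but does not say how it was computed. The sketch below is a minimal, illustrative definition of WER (word-level edit distance divided by the number of reference words, expressed in percent), followed by the absolute and relative improvement implied by the two reported values. It assumes the card's WER figures are percentages and uses plain whitespace tokenization with no text normalization. The names `word_error_rate` and `relative_reduction` are hypothetical helpers, not the Blueprint's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length * 100.

    Assumption: words are split on whitespace and compared exactly; the
    evaluation behind the model card may normalize text differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (rolling-row Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_reduction(baseline: float, finetuned: float) -> float:
    """Relative improvement of `finetuned` over `baseline`, in percent."""
    return 100.0 * (baseline - finetuned) / baseline


# Values copied from the model card above.
BASELINE_WER = 30.304
FINETUNED_WER = 23.511

absolute_gain = BASELINE_WER - FINETUNED_WER
relative_gain = relative_reduction(BASELINE_WER, FINETUNED_WER)
print(f"absolute: {absolute_gain:.3f} points, relative: {relative_gain:.1f}%")
```

Under these assumptions, fine-tuning lowers WER by about 6.8 points, which is roughly a 22% relative reduction against the baseline.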