mozilla-ai/whisper-small-sv

@@ -1,103 +1,34 @@
 ---
-library_name: transformers
-license: apache-2.0
 base_model: openai/whisper-small
-tags:
-- generated_from_trainer
 datasets:
-- common_voice_17_0
-metrics:
-- wer
 model-index:
-- name: whisper-small-swedish
   results:
   - task:
-      name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
-      name: common_voice_17_0
-      type: common_voice_17_0
-      config: sv-SE
-      split: test
-      args: sv-SE
     metrics:
-    - name: Wer
-      type: wer
-      value: 20.14691254112786
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# whisper-small-swedish
-This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the common_voice_17_0 dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.3111
-- Model Preparation Time: 0.0044
-- Wer: 20.1469
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 258
-- eval_batch_size: 64
-- seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 50
-- training_steps: 1250
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch   | Step | Validation Loss | Model Preparation Time | Wer     |
-|:-------------:|:-------:|:----:|:---------------:|:----------------------:|:-------:|
-| 0.6188        | 0.9804  | 50   | 0.3424          | 0.0044                 | 23.1107 |
-| 0.2417        | 1.9608  | 100  | 0.3089          | 0.0044                 | 21.0881 |
-| 0.1456        | 2.9412  | 150  | 0.3038          | 0.0044                 | 20.7412 |
-| 0.0928        | 3.9216  | 200  | 0.3111          | 0.0044                 | 20.1469 |
-| 0.0554        | 4.9020  | 250  | 0.3231          | 0.0044                 | 20.3739 |
-| 0.035         | 5.8824  | 300  | 0.3367          | 0.0044                 | 20.8432 |
-| 0.0215        | 6.8627  | 350  | 0.3619          | 0.0044                 | 20.8458 |
-| 0.0143        | 7.8431  | 400  | 0.3768          | 0.0044                 | 20.9299 |
-| 0.0101        | 8.8235  | 450  | 0.3880          | 0.0044                 | 20.8509 |
-| 0.0084        | 9.8039  | 500  | 0.3960          | 0.0044                 | 20.8993 |
-| 0.0072        | 10.7843 | 550  | 0.3999          | 0.0044                 | 21.0014 |
-| 0.0059        | 11.7647 | 600  | 0.4069          | 0.0044                 | 20.8942 |
-| 0.0053        | 12.7451 | 650  | 0.4130          | 0.0044                 | 20.9656 |
-| 0.0047        | 13.7255 | 700  | 0.4177          | 0.0044                 | 20.9963 |
-| 0.0043        | 14.7059 | 750  | 0.4208          | 0.0044                 | 20.9478 |
-| 0.004         | 15.6863 | 800  | 0.4241          | 0.0044                 | 21.0371 |
-| 0.0037        | 16.6667 | 850  | 0.4265          | 0.0044                 | 21.0600 |
-| 0.0035        | 17.6471 | 900  | 0.4298          | 0.0044                 | 21.1034 |
-| 0.0034        | 18.6275 | 950  | 0.4317          | 0.0044                 | 21.0983 |
-| 0.0032        | 19.6078 | 1000 | 0.4334          | 0.0044                 | 21.1416 |
-| 0.0031        | 20.5882 | 1050 | 0.4351          | 0.0044                 | 21.1518 |
-| 0.003         | 21.5686 | 1100 | 0.4361          | 0.0044                 | 21.1748 |
-| 0.0029        | 22.5490 | 1150 | 0.4368          | 0.0044                 | 21.1620 |
-| 0.0029        | 23.5294 | 1200 | 0.4374          | 0.0044                 | 21.1799 |
-| 0.0029        | 24.5098 | 1250 | 0.4377          | 0.0044                 | 21.1722 |
-### Framework versions
-- Transformers 4.49.0
-- Pytorch 2.6.0+cu124
-- Datasets 3.3.2
-- Tokenizers 0.21.0

 ---
 base_model: openai/whisper-small
 datasets:
+- artifacts/sv-SE_mozilla-foundation_common_voice_17_0
+library_name: transformers
+license: apache-2.0
 model-index:
+- name: Finetuned openai/whisper-small on Swedish
   results:
   - task:
       type: automatic-speech-recognition
+      name: Speech-to-Text
     dataset:
+      name: Common Voice (Swedish)
+      type: common_voice
     metrics:
+    - type: wer
+      value: 20.147
 ---
+# Finetuned openai/whisper-small on 12954 Swedish training audio samples from artifacts/sv-SE_mozilla-foundation_common_voice_17_0.
+This model was created from the Mozilla.ai Blueprint:
+[speech-to-text-finetune](https://github.com/mozilla-ai/speech-to-text-finetune).
+## Evaluation results on 5259 audio samples of Swedish:
+### Baseline model (before finetuning) on Swedish
+- Word Error Rate: 28.413
+- Loss: 1.066
+### Finetuned model (after finetuning) on Swedish
+- Word Error Rate: 20.147
+- Loss: 0.311
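
The two Word Error Rate figures in the new card can be compared directly. The sketch below is an illustration, not the evaluation code from the Blueprint: it assumes the standard WER definition (word-level edit distance divided by the number of reference words, in percent), and the helper names `wer` and `relative_reduction` are hypothetical. The card does not state which text normalization was applied before scoring, so the `wer` function here splits on whitespace only.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent (standard definition, assumed here)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def relative_reduction(baseline: float, finetuned: float) -> float:
    """Relative improvement of `finetuned` over `baseline`, in percent."""
    return 100.0 * (baseline - finetuned) / baseline


# Figures copied from the model card above.
BASELINE_WER = 28.413
FINETUNED_WER = 20.147

absolute = BASELINE_WER - FINETUNED_WER
relative = relative_reduction(BASELINE_WER, FINETUNED_WER)
print(f"absolute reduction: {absolute:.3f} WER points")
print(f"relative reduction: {relative:.1f}%")
```

On the card's numbers this gives an absolute reduction of 8.266 WER points, which is roughly a 29.1% relative reduction against the baseline.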