Paper: Whispering in Norwegian: Navigating Orthographic and Dialectic Challenges (arXiv:2402.01917)
This model was converted to MLX format from NbAiLab/nb-whisper-large.
Refer to the original model card for detailed information about training data, performance benchmarks, and intended use.
NB-Whisper Large is a Norwegian speech recognition model developed by the National Library of Norway AI Lab (NB AI-Lab). It was fine-tuned from openai/whisper-large-v3 for 250,000 training steps on 66,000 hours of Norwegian speech data (8 million aligned audio clips of 30 seconds each). It supports Norwegian Bokmål, Nynorsk, and English.
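As a quick sanity check on the training-data figures, the clip count and clip length can be converted to hours. This is only a sketch: it assumes every clip is a full 30 seconds, which gives an upper bound rather than the exact corpus size.

```python
# Figures taken from the description above.
clips = 8_000_000      # aligned audio clips
clip_seconds = 30      # assumed length of each clip (upper bound)

# Convert total seconds to hours.
total_hours = clips * clip_seconds / 3600
print(f"{total_hours:,.0f} hours")
```

The result is slightly above 66,000 hours, which is consistent with the stated corpus size if some clips are shorter than 30 seconds.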
Performance (from the paper):
| Dataset | OpenAI Large-v3 WER | NB-Whisper Large WER |
|---|---|---|
| Fleurs (Bokmål) | 10.4% | 6.6% |
| NST (Bokmål) | 6.8% | 2.2% |
See the paper for additional benchmarks including Nynorsk and dialect evaluation.
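To put the table in perspective, the relative WER reduction can be computed from the two columns. The sketch below uses only the numbers listed above; the helper name `relative_reduction` is illustrative and not part of any library.

```python
# WER values (percent) copied from the table: (OpenAI Large-v3, NB-Whisper Large).
results = {
    "Fleurs (Bokmål)": (10.4, 6.6),
    "NST (Bokmål)": (6.8, 2.2),
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative reduction from baseline to improved, in percent."""
    return (baseline - improved) / baseline * 100


for dataset, (baseline, improved) in results.items():
    reduction = relative_reduction(baseline, improved)
    print(f"{dataset}: {reduction:.1f}% relative WER reduction")
```

On these two datasets this works out to roughly a 37% and a 68% relative reduction in word error rate.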
Install the package:

    pip install mlx-whisper

Transcribe an audio file:

    import mlx_whisper

    result = mlx_whisper.transcribe(
        "audio.mp3",
        path_or_hf_repo="aalst/nb-whisper-large-mlx",
        language="no",
    )
    print(result["text"])
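Beyond the full text, the transcription result can be turned into subtitles. The following is a minimal sketch, assuming the result dictionary contains a `segments` list whose entries carry `start`, `end`, and `text` keys, as in OpenAI Whisper's output format that mlx-whisper mirrors. The helper names `format_timestamp`, `segments_to_srt`, and `transcribe_to_srt` are hypothetical and not part of mlx-whisper.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT subtitle text from segments with start, end, and text keys."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str, language: str = "no") -> str:
    """Transcribe a file with this model and return SRT text (requires MLX)."""
    # Imported lazily so the helpers above can be used without mlx-whisper.
    import mlx_whisper

    result = mlx_whisper.transcribe(
        audio_path,
        path_or_hf_repo="aalst/nb-whisper-large-mlx",
        language=language,
    )
    # Assumption: result["segments"] follows the Whisper segment layout.
    return segments_to_srt(result["segments"])
```

For example, `open("audio.srt", "w", encoding="utf-8").write(transcribe_to_srt("audio.mp3"))` would write a subtitle file next to the audio. Keeping the formatting helpers separate from the model call makes them easy to test on plain dictionaries.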
This model inherits the Apache 2.0 license from the original NB-Whisper model.
@misc{kummervold2024whisperingnorwegian,
title={Whispering in Norwegian: Navigating Orthographic and Dialectic Challenges},
author={Per Egil Kummervold and Javier de la Rosa and Freddy Wetjen and Rolv-Arild Braaten and Per Erik Solberg},
year={2024},
eprint={2402.01917},
archivePrefix={arXiv},
primaryClass={cs.CL}
}