Whisper Large V3 Turbo - Luxembourgish

A fine-tuned version of OpenAI's Whisper Large V3 Turbo optimized for Luxembourgish speech recognition.

Model Description

This model was fine-tuned on approximately 75 hours of Luxembourgish speech data, selected to cover as wide a range of speaking contexts as the available data for the language allows.

The model is provided in CTranslate2 format for efficient inference with faster-whisper and compatible libraries.

Model Details

Base Model: openai/whisper-large-v3-turbo
Language: Luxembourgish (lb)
Format: CTranslate2
License: MIT

Intended Uses

General automatic speech recognition (ASR) for Luxembourgish
Audio and video transcription
Speaker diarization pipelines (when combined with PyAnnote)

Usage

With faster-whisper

from faster_whisper import WhisperModel

model = WhisperModel("ZLSCompLing/WhisperLargeTurboV3_Luxembourgish", device="cuda")

# transcribe() returns a lazy generator; decoding runs as the loop below consumes it
segments, info = model.transcribe("audio.wav", language="lb")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
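
For the audio and video transcription use case, the segments printed above can also be written out as subtitles. The following is a minimal sketch, not part of this model or of faster-whisper: `Cue`, `srt_timestamp`, and `to_srt` are hypothetical helper names, and the only assumption is that each segment exposes `start` and `end` (in seconds) and `text`, as the loop above already relies on.

```python
from typing import Iterable, NamedTuple


class Cue(NamedTuple):
    """Hypothetical stand-in for a transcription segment (times in seconds)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable) -> str:
    """Build SRT text from objects exposing .start, .end and .text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks
    return "\n".join(blocks)
```

Passing the `segments` generator from the example above to `to_srt` and writing the result to a `.srt` file should be enough for most video players; the `Cue` type is only there so the helper can be tried without loading the model.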

With WhisperX (for word-level timestamps and speaker diarization)

import whisperx

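# Transcription step only; word-level alignment and speaker diarization are separate WhisperX steps
model = whisperx.load_model("ZLSCompLing/WhisperLargeTurboV3_Luxembourgish", device="cuda")
audio = whisperx.load_audio("audio.wav")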
result = model.transcribe(audio, language="lb")
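
To illustrate what a diarization pipeline does with this output, the sketch below labels a transcript segment with the speaker whose turn overlaps it most. It is a simplified, library-independent illustration and not the WhisperX or PyAnnote API: the `Turn` tuple layout, the function names, and the `SPEAKER_00`-style labels are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Assumed layout for a diarization turn: (start seconds, end seconds, speaker label)
Turn = Tuple[float, float, str]


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speaker(seg_start: float, seg_end: float, turns: List[Turn]) -> Optional[str]:
    """Return the label of the turn sharing the most time with the segment.

    Returns None when no turn overlaps the segment at all.
    """
    best_label: Optional[str] = None
    best_overlap = 0.0
    for t_start, t_end, label in turns:
        shared = overlap(seg_start, seg_end, t_start, t_end)
        if shared > best_overlap:
            best_label, best_overlap = label, shared
    return best_label


# Example with made-up turns: the segment 1.5s-4.0s mostly falls in the second turn
turns = [(0.0, 2.0, "SPEAKER_00"), (2.0, 5.0, "SPEAKER_01")]
print(assign_speaker(1.5, 4.0, turns))
```

Segment-level assignment like this is coarse when speakers change mid-segment, which is why word-level timestamps are usually preferred as the unit of assignment in practice.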

Technical Notes

When combining this model with WhisperX and PyAnnote for speaker diarization, expect possible dependency conflicts between PyTorch, WhisperX, and PyAnnote. Depending on your setup, you may need both current CUDA libraries and the older cuDNN 8 libraries installed side by side.
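
When debugging such conflicts, a first step is to see which versions are actually installed in the active environment. The sketch below uses only the Python standard library; the package list is an assumption about a typical setup (PyPI distribution names), and `installed_versions` is a hypothetical helper name. It reports versions only and does not check CUDA or cuDNN.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Assumed set of relevant distributions; adjust to your environment
PACKAGES = ("torch", "whisperx", "pyannote.audio", "faster-whisper", "ctranslate2")


def installed_versions(packages: Iterable[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing this output against the version requirements stated by each project is usually quicker than reading a resolver error after the fact.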

For reference, see the Sproochmaschinn project, which uses this model in production.

Citation

If you use this model, please cite:

@misc{zls2025whisperlb,
  title={Whisper Large V3 Turbo - Luxembourgish},
  author={Zenter fir d'Lëtzebuerger Sprooch},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/ZLSCompLing/WhisperLargeTurboV3_Luxembourgish}
}

Acknowledgments

Developed by Zenter fir d'Lëtzebuerger Sprooch.

This model powers Sproochmaschinn, a Luxembourgish speech processing platform.
