Whisper Large V3 Turbo - Luxembourgish

A fine-tuned version of OpenAI's Whisper Large V3 Turbo optimized for Luxembourgish speech recognition.

Model Description

This model was fine-tuned on approximately 75 hours of Luxembourgish speech data, selected to cover a range of speaking contexts within the limits of the data available for the language.

The model is provided in CTranslate2 format for efficient inference with faster-whisper and compatible libraries.

Model Details

  • Base Model: openai/whisper-large-v3-turbo
  • Language: Luxembourgish (lb)
  • Format: CTranslate2
  • License: MIT

Intended Uses

  • General automatic speech recognition (ASR) for Luxembourgish
  • Audio and video transcription
  • Speaker diarization pipelines (when combined with PyAnnote)

Usage

With faster-whisper

from faster_whisper import WhisperModel

# Load the CTranslate2 model from the Hugging Face Hub onto the GPU
model = WhisperModel("ZLSCompLing/WhisperLargeTurboV3_Luxembourgish", device="cuda")

# `segments` is a lazy generator: transcription runs as it is iterated
segments, info = model.transcribe("audio.wav", language="lb")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
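For audio and video transcription, the segments yielded above are often written out as subtitles. The following is a minimal sketch of one way to do that, not part of this model or of faster-whisper. The names `Segment`, `format_timestamp`, and `segments_to_srt` are hypothetical helpers introduced for illustration; the only assumption about the input is that each item exposes `.start`, `.end`, and `.text`, as in the loop above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """Hypothetical stand-in for a transcription segment (only the fields used here)."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable) -> str:
    """Build SRT text from objects exposing .start, .end and .text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    # A blank line separates consecutive subtitle entries
    return "\n".join(blocks)
```

With faster-whisper, the generator returned by `model.transcribe(...)` could presumably be passed straight to `segments_to_srt`, since it is consumed only once.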

With WhisperX (for word-level timestamps and speaker diarization)

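import whisperx

# Load the fine-tuned model through WhisperX
model = whisperx.load_model("ZLSCompLing/WhisperLargeTurboV3_Luxembourgish", device="cuda")
audio = whisperx.load_audio("audio.wav")
# Transcription only; word-level alignment and diarization are separate WhisperX steps run on `result`
result = model.transcribe(audio, language="lb")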
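To illustrate the idea behind combining transcription with speaker diarization, the sketch below labels each transcribed segment with the speaker whose turn overlaps it the most. It is a plain-Python illustration under stated assumptions, not the WhisperX or PyAnnote API: the tuple layouts and the names `overlap` and `assign_speakers` are hypothetical, and real pipelines typically work at word level with their own data structures.

```python
from typing import List, Optional, Tuple

# Assumed layout: (start_seconds, end_seconds, text) per transcription segment
TranscriptSegment = Tuple[float, float, str]
# Assumed layout: (start_seconds, end_seconds, speaker_label) per diarization turn
SpeakerTurn = Tuple[float, float, str]


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    segments: List[TranscriptSegment],
    turns: List[SpeakerTurn],
) -> List[Tuple[float, float, Optional[str], str]]:
    """Attach to each segment the speaker with the largest temporal overlap.

    Segments that overlap no speaker turn get None as their speaker.
    """
    labelled = []
    for start, end, text in segments:
        best_speaker: Optional[str] = None
        best_overlap = 0.0
        for t_start, t_end, speaker in turns:
            shared = overlap(start, end, t_start, t_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labelled.append((start, end, best_speaker, text))
    return labelled
```

Choosing the speaker by maximum overlap is a simple heuristic; it does not split a segment in which two speakers talk in turn.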

Technical Notes

When using this model with WhisperX and PyAnnote for speaker diarization, be aware of potential dependency conflicts between PyTorch, WhisperX, and PyAnnote. Depending on your setup, you may need both modern CUDA libraries and cuDNN 8 libraries installed.
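When diagnosing such conflicts, it can help to first record which versions are actually installed. The sketch below uses only the Python standard library; the package list is an assumption based on the usual PyPI distribution names and may need adjusting for your environment, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Assumed PyPI distribution names; adjust to match your environment
PACKAGES = ("torch", "ctranslate2", "faster-whisper", "whisperx", "pyannote.audio")


def installed_versions(packages: Iterable[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

This reports Python package versions only; the CUDA and cuDNN libraries present on the system have to be checked separately.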

For reference, see the Sproochmaschinn project, which uses this model in production.

Citation

If you use this model, please cite:

@misc{zls2025whisperlb,
  title={Whisper Large V3 Turbo - Luxembourgish},
  author={Zenter fir d'Lëtzebuerger Sprooch},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/ZLSCompLing/WhisperLargeTurboV3_Luxembourgish}
}

Acknowledgments

Developed by Zenter fir d'Lëtzebuerger Sprooch.

This model powers Sproochmaschinn, a Luxembourgish speech processing platform.
