whisper-large-v3-german-ct2
CTranslate2 (bfloat16) conversion of primeline/whisper-large-v3-german for use with faster-whisper.
Original model
- Author: Florian Zimmermeister / primeLine AI Services
- Base model: openai/whisper-large-v3, fine-tuned on German speech data
- WER: 3.0% on Common Voice DE
- License: Apache 2.0
Usage
from faster_whisper import WhisperModel

model = WhisperModel("tnfru/whisper-large-v3-german-ct2", device="cuda", compute_type="bfloat16")
segments, info = model.transcribe(
    "audio.wav",
    language="de",
    condition_on_previous_text=False,
)
for segment in segments:
    print(segment.text)
Known issues with faster-whisper
Passing initial_prompt or hotwords to transcribe() causes this model to produce zero segments. Even an empty string "" breaks it, so leave both at None (the default). The likely cause is that the model was not trained with an initial prompt (see faster-whisper#590 for background).
Additionally, condition_on_previous_text must be set to False. The default (True) feeds each segment's transcription into the next 30s segment as context, which is the same mechanism as initial_prompt, and the output is truncated after the first segment.
| Parameter | Safe value | Breaks with |
|---|---|---|
| initial_prompt | None (default) | Any string, including "" |
| hotwords | None (default) | Any string, including "" |
| condition_on_previous_text | False | True (default); truncates after ~30s |
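The constraints in the table can also be enforced in code, so that a stray argument fails loudly instead of silently returning no text. The following is a minimal sketch: safe_transcribe_kwargs and SAFE_DEFAULTS are hypothetical names invented for this example, not part of faster-whisper.

```python
# Hypothetical helper, not part of faster-whisper: builds keyword arguments
# for WhisperModel.transcribe() that respect the constraints listed above.

SAFE_DEFAULTS = {
    "language": "de",
    "initial_prompt": None,
    "hotwords": None,
    "condition_on_previous_text": False,
}


def safe_transcribe_kwargs(**overrides):
    """Return transcribe() kwargs, raising ValueError on known-breaking values."""
    kwargs = {**SAFE_DEFAULTS, **overrides}
    for name in ("initial_prompt", "hotwords"):
        # Even an empty string is reported to produce zero segments.
        if kwargs[name] is not None:
            raise ValueError(f"{name} must be None for this model, got {kwargs[name]!r}")
    if kwargs["condition_on_previous_text"]:
        raise ValueError("condition_on_previous_text must be False for this model")
    return kwargs


# Usage, with `model` created as in the Usage section:
#   segments, info = model.transcribe("audio.wav", **safe_transcribe_kwargs(beam_size=5))
```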
Why bfloat16?
The original model was trained in bfloat16. Converting to float16 causes precision loss that manifests as silent truncation in multi-segment transcription (see faster-whisper#567). Always use --quantization bfloat16 when converting.
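Not every GPU supports bfloat16, so it may be worth checking before loading the model. The sketch below uses ctranslate2.get_supported_compute_types to query the device. The helper names and the fallback to float32 are assumptions of this example, not a recommendation from the model author; float32 is chosen because it can represent every bfloat16 value exactly, which avoids the float16 precision loss described above at the cost of more memory.

```python
def pick_compute_type(supported):
    """Prefer bfloat16; otherwise fall back to float32, never float16."""
    return "bfloat16" if "bfloat16" in supported else "float32"


def cuda_compute_type():
    """Query CTranslate2 for the compute types the current CUDA device supports."""
    import ctranslate2  # imported lazily so pick_compute_type stays dependency-free

    return pick_compute_type(ctranslate2.get_supported_compute_types("cuda"))


# Usage (requires a CUDA device and ctranslate2 installed):
#   model = WhisperModel("tnfru/whisper-large-v3-german-ct2",
#                        device="cuda", compute_type=cuda_compute_type())
```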
Conversion
ct2-transformers-converter \
    --model primeline/whisper-large-v3-german \
    --output_dir whisper-large-v3-german-ct2 \
    --copy_files tokenizer.json preprocessor_config.json \
    --quantization bfloat16
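After conversion, it can help to confirm that the output directory is complete before loading or uploading it. A minimal sketch follows; the helper name missing_files and the expected file list (model.bin as written by the converter, plus the two files named in --copy_files) are assumptions of this example.

```python
from pathlib import Path

# Assumed minimum set of files: the converted weights plus the two files
# passed to --copy_files in the command above.
EXPECTED_FILES = ("model.bin", "tokenizer.json", "preprocessor_config.json")


def missing_files(output_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in expected if not (root / name).is_file()]


# Usage:
#   missing = missing_files("whisper-large-v3-german-ct2")
#   if missing:
#       raise SystemExit(f"conversion output incomplete, missing: {missing}")
```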