whisper-large-v3-german-ct2

CTranslate2 (bfloat16) conversion of primeline/whisper-large-v3-german for use with faster-whisper.

Original model

  • Author: Florian Zimmermeister / primeLine AI Services
  • Base model: openai/whisper-large-v3, fine-tuned on German speech data
  • WER: 3.0% on Common Voice DE
  • License: Apache 2.0

Usage

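from faster_whisper import WhisperModel

# Load the bfloat16 CTranslate2 weights from the Hugging Face Hub.
model = WhisperModel("tnfru/whisper-large-v3-german-ct2", device="cuda", compute_type="bfloat16")
segments, info = model.transcribe(
    "audio.wav",
    language="de",
    # Required for this model; see "Known issues with faster-whisper".
    condition_on_previous_text=False,
)

# segments is a generator: transcription runs as it is iterated.
for segment in segments:
    print(segment.text)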

Known issues with faster-whisper

Passing initial_prompt or hotwords to transcribe() causes this model to produce zero segments — even an empty string "" breaks it. Use None (the default) for both. This is likely because the model was not trained with initial_prompt (see faster-whisper#590 for background).

Additionally, condition_on_previous_text must be set to False. The default (True) causes Whisper to feed each segment's transcription as context into the next 30s segment — the same mechanism as initial_prompt — which truncates output after the first segment.

| Parameter | Safe value | Breaks with |
|---|---|---|
| initial_prompt | None (default) | Any string, including "" |
| hotwords | None (default) | Any string, including "" |
| condition_on_previous_text | False | True (default): truncates after ~30s |
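
One way to avoid tripping over these settings is to build the transcribe() keyword arguments in a single place and reject the unsafe values early. The helper below is a minimal sketch; the name safe_transcribe_kwargs is hypothetical and not part of faster-whisper.

```python
def safe_transcribe_kwargs(**overrides):
    """Build keyword arguments for WhisperModel.transcribe() that respect the table above.

    Hypothetical helper, not part of faster-whisper.
    """
    kwargs = {"language": "de", "condition_on_previous_text": False}
    kwargs.update(overrides)

    # Any string, including "", is reported to break this model.
    for name in ("initial_prompt", "hotwords"):
        if kwargs.get(name) is not None:
            raise ValueError(f"{name} must be None for this model, got {kwargs[name]!r}")

    # The faster-whisper default (True) is reported to truncate output after ~30s.
    if kwargs["condition_on_previous_text"] is not False:
        raise ValueError("condition_on_previous_text must be False for this model")

    # Drop explicit None values so the library defaults apply.
    return {key: value for key, value in kwargs.items() if value is not None}
```

It could then be used as model.transcribe("audio.wav", **safe_transcribe_kwargs(beam_size=5)), where beam_size is an ordinary transcribe() option.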

Why bfloat16?

The original model was trained in bfloat16. Converting to float16 causes precision loss that manifests as silent truncation in multi-segment transcription (see faster-whisper#567). Always use --quantization bfloat16 when converting.
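
If the target GPU might not support bfloat16, the compute type can be chosen at runtime instead of being hard-coded. The sketch below assumes that float32 is an acceptable fallback, since float32 can represent every bfloat16 value; this fallback has not been tested with this model, and pick_compute_type is a hypothetical helper.

```python
def pick_compute_type(supported):
    """Prefer bfloat16; otherwise fall back to float32 rather than float16.

    `supported` is a collection of compute type names, for example the
    result of ctranslate2.get_supported_compute_types("cuda").
    """
    if "bfloat16" in supported:
        return "bfloat16"
    # Assumption: float32 avoids the float16 precision loss described above.
    return "float32"


# Usage (requires ctranslate2 and a CUDA device):
# import ctranslate2
# compute_type = pick_compute_type(ctranslate2.get_supported_compute_types("cuda"))
# model = WhisperModel("tnfru/whisper-large-v3-german-ct2", device="cuda", compute_type=compute_type)
```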

Conversion

ct2-transformers-converter \
  --model primeline/whisper-large-v3-german \
  --output_dir whisper-large-v3-german-ct2 \
  --copy_files tokenizer.json preprocessor_config.json \
  --quantization bfloat16
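
The same conversion can also be run from Python through the CTranslate2 converter API. This is a sketch that mirrors the command above; the function name convert and the module-level constants are illustrative choices, and running it requires ctranslate2 and transformers to be installed.

```python
SOURCE_MODEL = "primeline/whisper-large-v3-german"
OUTPUT_DIR = "whisper-large-v3-german-ct2"
COPY_FILES = ["tokenizer.json", "preprocessor_config.json"]


def convert(source=SOURCE_MODEL, output_dir=OUTPUT_DIR, quantization="bfloat16"):
    """Convert the Hugging Face checkpoint to CTranslate2 format."""
    # Imported lazily so this module can be loaded without ctranslate2 installed.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(source, copy_files=COPY_FILES)
    # Keep bfloat16 here; float16 is reported above to cause silent truncation.
    return converter.convert(output_dir, quantization=quantization)
```

Calling convert() downloads the original checkpoint and writes the converted model to the output directory.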