whisper-large-v3-german-ct2

CTranslate2 (bfloat16) conversion of primeline/whisper-large-v3-german for use with faster-whisper.

Original model

Author: Florian Zimmermeister / primeLine AI Services
Base model: openai/whisper-large-v3, fine-tuned on German speech data
WER: 3.0% on Common Voice DE
License: Apache 2.0

Usage

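from faster_whisper import WhisperModel

# Load the converted weights in bfloat16, the precision used for the conversion.
model = WhisperModel("tnfru/whisper-large-v3-german-ct2", device="cuda", compute_type="bfloat16")
segments, info = model.transcribe(
    "audio.wav",
    language="de",
    condition_on_previous_text=False,  # required for this model, see "Known issues"
)

# segments is a generator: transcription runs as it is iterated.
for segment in segments:
    print(segment.text)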

Known issues with faster-whisper

Passing initial_prompt or hotwords to transcribe() causes this model to produce zero segments. Any string triggers this, including the empty string "". Leave both at None (the default). The likely cause is that the model was not trained with initial_prompt (see faster-whisper#590 for background).

Additionally, condition_on_previous_text must be set to False. With the default (True), Whisper feeds each segment's transcription as context into the next 30-second segment. This is the same mechanism as initial_prompt, and with this model it truncates the output after the first segment.

| Parameter | Safe value | Breaks with |
|---|---|---|
| `initial_prompt` | `None` (default) | Any string, including `""` |
| `hotwords` | `None` (default) | Any string, including `""` |
| `condition_on_previous_text` | `False` | `True` (default); truncates after ~30s |
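
One way to avoid tripping over these parameters is to build the transcribe() arguments through a small guard. The sketch below is a minimal illustration, not part of faster-whisper: the helper name safe_transcribe_kwargs is hypothetical, and it only encodes the rules from the table above.

```python
from typing import Any

# Parameters that must stay at None for this model (see the table above).
_MUST_BE_NONE = ("initial_prompt", "hotwords")


def safe_transcribe_kwargs(**overrides: Any) -> dict[str, Any]:
    """Build keyword arguments for WhisperModel.transcribe().

    Hypothetical helper: rejects the settings listed as unsafe for this
    model and always sets condition_on_previous_text=False.
    """
    for name in _MUST_BE_NONE:
        # Any string breaks the model, including the empty string.
        if overrides.get(name) is not None:
            raise ValueError(f"{name} must be None for this model, got {overrides[name]!r}")
    if overrides.get("condition_on_previous_text", False):
        raise ValueError("condition_on_previous_text must be False for this model")

    kwargs: dict[str, Any] = {"language": "de"}
    kwargs.update(overrides)
    # Set explicitly, because the faster-whisper default is True.
    kwargs["condition_on_previous_text"] = False
    # Drop explicit None values so the library defaults apply.
    for name in _MUST_BE_NONE:
        kwargs.pop(name, None)
    return kwargs
```

With a loaded model, this could be used as `segments, info = model.transcribe("audio.wav", **safe_transcribe_kwargs(beam_size=5))`. Raising an error is a deliberate choice here: a failed call is easier to notice than an empty or truncated transcript.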

Why bfloat16?

The original model was trained in bfloat16. Converting to float16 causes precision loss that manifests as silent truncation in multi-segment transcription (see faster-whisper#567). Always use --quantization bfloat16 when converting.
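
Because float16 is not a safe fallback for this model, it can help to check for bfloat16 support before loading. The following is a minimal sketch: the function names are hypothetical, and it assumes the set returned by ctranslate2.get_supported_compute_types() contains the string "bfloat16" when the device supports it.

```python
from typing import Iterable


def require_bfloat16(supported_types: Iterable[str]) -> str:
    """Return "bfloat16" if it is supported, otherwise raise.

    There is no fallback to float16, because that precision is
    reported to truncate multi-segment output with this model.
    """
    if "bfloat16" not in set(supported_types):
        raise RuntimeError(
            "bfloat16 is not supported on this device; "
            "float16 is not a safe substitute for this model"
        )
    return "bfloat16"


def cuda_compute_type(device_index: int = 0) -> str:
    """Query CTranslate2 for the CUDA device and require bfloat16."""
    # Imported lazily so require_bfloat16 works without CTranslate2 installed.
    import ctranslate2

    supported = ctranslate2.get_supported_compute_types("cuda", device_index)
    return require_bfloat16(supported)
```

The result of cuda_compute_type() can then be passed as compute_type when constructing WhisperModel.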

Conversion

ct2-transformers-converter \
  --model primeline/whisper-large-v3-german \
  --output_dir whisper-large-v3-german-ct2 \
  --copy_files tokenizer.json preprocessor_config.json \
  --quantization bfloat16
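
The same conversion can also be run from Python through the CTranslate2 converter API. This is a sketch under the assumption that ctranslate2, transformers, and torch are installed. The function name convert_to_ct2 and the guard against other quantization types are illustrative additions, not part of the original workflow.

```python
MODEL_ID = "primeline/whisper-large-v3-german"
OUTPUT_DIR = "whisper-large-v3-german-ct2"
# Same files as --copy_files in the command above.
COPY_FILES = ["tokenizer.json", "preprocessor_config.json"]


def convert_to_ct2(
    model_id: str = MODEL_ID,
    output_dir: str = OUTPUT_DIR,
    quantization: str = "bfloat16",
) -> str:
    """Convert the Transformers checkpoint to CTranslate2 format."""
    # Illustrative guard: this model card recommends bfloat16 only.
    if quantization != "bfloat16":
        raise ValueError("use quantization='bfloat16' for this model")

    # Imported lazily because the converter pulls in transformers and torch.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(model_id, copy_files=COPY_FILES)
    # Returns the path of the output directory.
    return converter.convert(output_dir, quantization=quantization)


if __name__ == "__main__":
    print(convert_to_ct2())
```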
