faster-whisper-Breeze-ASR-26

This is a faster-whisper-compatible conversion of MediaTek-Research/Breeze-ASR-26 to CTranslate2 format, with weights stored in float16.

Model Description

BreezeASR-Taigi is a Taiwanese Hokkien (Taigi / 台語) automatic speech recognition (ASR) model developed as part of the Breeze Taigi framework. It is fine-tuned from openai/whisper-large-v2 on approximately 10,000 hours of synthetic Taiwanese Hokkien speech data. The model takes spoken Taigi audio as input and outputs transcriptions in Mandarin Chinese characters.

Conversion Details

| Property | Value |
|---|---|
| Source model | MediaTek-Research/Breeze-ASR-26 |
| Architecture | Whisper Large V2 |
| Quantization | float16 |
| Model size | ~2.9 GB (vs ~6.2 GB float32 original) |
| CTranslate2 version | 4.7.1 |
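
As a rough sanity check on the sizes listed above, the weight footprint can be estimated as parameter count times bytes per parameter. The sketch below assumes about 1.55 billion parameters, the figure published for Whisper large models. The constant and function names are illustrative only, and actual files on disk also include the tokenizer and config files, so this is an approximation rather than a measurement.

```python
# Rough size estimate: parameter count x bytes per parameter.
# ASSUMPTION: ~1.55e9 parameters, the published figure for Whisper large models.
# The fine-tuned checkpoint is assumed to keep the same architecture.
WHISPER_LARGE_PARAMS = 1.55e9

BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def estimated_size(params, dtype):
    """Return the estimated weight size as (decimal GB, binary GiB)."""
    n_bytes = params * BYTES_PER_PARAM[dtype]
    return n_bytes / 1e9, n_bytes / 2**30


for dtype in ("float32", "float16"):
    gb, gib = estimated_size(WHISPER_LARGE_PARAMS, dtype)
    print(f"{dtype}: ~{gb:.1f} GB ({gib:.1f} GiB)")
```

Under this assumption, halving the bytes per parameter halves the weight size, which is broadly in line with the float16 and float32 figures in the table.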

Usage

Run inference:

from faster_whisper import WhisperModel

model = WhisperModel("paulpengtw/faster-whisper-Breeze-ASR-26", device="cuda", compute_type="float16")
# For CPU: WhisperModel("...", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.wav", language="zh", task="transcribe")

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")

Note: This model outputs Mandarin Chinese characters (not Taigi orthography / 台語正字). Pass language="zh" explicitly to avoid language detection overhead.
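
If subtitles are needed instead of printed lines, the segments can be written out in SRT format. The following is a minimal sketch, not part of the original model card: `srt_timestamp` and `segments_to_srt` are hypothetical helper names, and the demo uses stand-in objects with the same `start`, `end`, and `text` attributes that the usage example reads from each segment.

```python
from types import SimpleNamespace


def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from objects exposing .start, .end and .text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


# Stand-in segments for illustration; in practice, pass the `segments`
# iterable returned by model.transcribe(...).
demo = [
    SimpleNamespace(start=0.0, end=2.5, text=" 你好"),
    SimpleNamespace(start=2.5, end=5.0, text=" 多謝"),
]
print(segments_to_srt(demo))
```

Note that faster-whisper returns `segments` as a generator, so transcription only runs as the segments are iterated, and the generator can be consumed once. Collect it with `list(segments)` first if the results are needed more than once.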

Citation

@misc{lan2026breezetaigibenchmarksmodels,
      title={Breeze Taigi: Benchmarks and Models for Taiwanese Hokkien Speech Recognition and Synthesis},
      author={Yu-Siang Lan and Chia-Sheng Liu and Yi-Chang Chen and Po-Chun Hsu and Allyson Chiu and Shun-Wen Lin and Da-shan Shiu and Yuan-Fu Liao},
      year={2026},
      eprint={2603.19259},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.19259},
}

License

Apache 2.0, same as the original model.
