Breeze Taigi: Benchmarks and Models for Taiwanese Hokkien Speech Recognition and Synthesis
Paper: arXiv:2603.19259
This is a faster-whisper-compatible conversion of MediaTek-Research/Breeze-ASR-26 to CTranslate2 format with int8 quantization.
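A minimal usage sketch, assuming the converted model files have been downloaded to a local directory and that the `faster-whisper` package is installed. The directory and audio file names below are hypothetical placeholders, and `format_segments` and `transcribe` are illustrative helpers, not part of this release. Decoding options such as `beam_size` are example values rather than recommended settings.

```python
from typing import Iterable, List


def format_segments(segments: Iterable) -> List[str]:
    """Render segments (objects with .start, .end, .text) as timestamped lines."""
    return [f"[{s.start:.2f}s -> {s.end:.2f}s] {s.text.strip()}" for s in segments]


def transcribe(model_dir: str, audio_path: str) -> List[str]:
    """Transcribe one audio file with a CTranslate2 Whisper model directory."""
    # Imported lazily so format_segments works without faster-whisper installed.
    from faster_whisper import WhisperModel

    # compute_type="int8" matches the int8 quantization of this conversion.
    model = WhisperModel(model_dir, device="cpu", compute_type="int8")

    # transcribe() returns a lazy generator of segments plus an info object;
    # decoding only runs while the generator is consumed.
    segments, _info = model.transcribe(audio_path, beam_size=5)
    return format_segments(segments)


# Example call (hypothetical paths, assumed for illustration):
# for line in transcribe("path/to/converted-model", "sample.wav"):
#     print(line)
```

Because the segments are produced lazily, the sketch consumes them inside `transcribe` so the caller receives plain strings.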
BreezeASR-Taigi is a Taiwanese Hokkien (Taigi / 台語) automatic speech recognition (ASR) model developed as part of Breeze Taigi, a framework centered on standardized benchmarks and evaluation methodologies for Taiwanese Hokkien speech technologies.
@misc{lan2026breezetaigibenchmarksmodels,
title={Breeze Taigi: Benchmarks and Models for Taiwanese Hokkien Speech Recognition and Synthesis},
author={Yu-Siang Lan and Chia-Sheng Liu and Yi-Chang Chen and Po-Chun Hsu and Allyson Chiu and Shun-Wen Lin and Da-shan Shiu and Yuan-Fu Liao},
year={2026},
eprint={2603.19259},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2603.19259},
}
License: Apache 2.0, the same as the original model.
Base model: openai/whisper-large-v2