Whisfusion: Parallel ASR Decoding via a Diffusion Transformer
A Diffusion Transformer ASR model that combines a Whisper encoder with a diffusion decoder.
Paper: arXiv:2508.07048
Code: GitHub
whisfusion_stage2_decoder.pt: Full model (Stage 2 - encoder + decoder + adapter)
whisfusion_stage1_adapter.pt: Adapter-only checkpoint (Stage 1)

@article{kwon2025whisfusion,
title={Whisfusion: Parallel ASR Decoding via a Diffusion Transformer},
author={Kwon, Taeyoun and Ahn, Junhyuk and Yun, Taegeun and Jwa, Heeju and Choi, Yoonchae and Park, Siwon and Kim, Nam-Joon and Kim, Jangchan and Ryu, Hyun Gon and Lee, Hyuk-Jae},
journal={arXiv preprint arXiv:2508.07048},
year={2025}
}
License: Apache 2.0