Text-to-Speech
English

DiFlow-TTS: Compact and Low-Latency Zero-Shot Text-to-Speech with Discrete Flow Matching

GitHub | Paper | Demo | Interspeech 2026

DiFlow-TTS is trained on 470 hours of the LibriTTS dataset, which consists of predominantly neutral speech. As a result, it may not perform well on prompts with strong emotional expression.

Download the DiFlow-TTS checkpoint and place it as follows:

root/
└── ckpts/
    └── diflow-tts.ckpt
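
To confirm the checkpoint landed in the right place before running anything, a small check along these lines may help. It is a minimal sketch, not part of the DiFlow-TTS codebase: the helper name `find_checkpoint` and the `root` argument are hypothetical, and the only detail taken from the layout above is the relative path `ckpts/diflow-tts.ckpt`.

```python
from pathlib import Path

# Relative location taken from the directory layout above.
CKPT_RELATIVE = Path("ckpts") / "diflow-tts.ckpt"


def find_checkpoint(root="."):
    """Return the checkpoint path under `root`, or raise if it is missing.

    `find_checkpoint` is a hypothetical helper for illustration only;
    it is not an API of the DiFlow-TTS repository.
    """
    path = Path(root) / CKPT_RELATIVE
    if not path.is_file():
        raise FileNotFoundError(f"Expected DiFlow-TTS checkpoint at {path}")
    return path
```

Calling `find_checkpoint()` from the repository root returns the path when the file is in place and raises `FileNotFoundError` otherwise, which surfaces a misplaced download early.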

Paper for Fsoft-AIC/DiFlowTTS