---
license: apache-2.0
language:
- en
pipeline_tag: text-to-speech
---
DiFlow-TTS: Compact and Low-Latency Zero-Shot Text-to-Speech with Discrete Flow Matching
DiFlow-TTS is trained on 470 hours of the LibriTTS dataset, which consists of predominantly neutral speech. As a result, it may not perform well on prompts with strong emotional expression.
Download the DiFlow-TTS checkpoint and place it as follows:
root/
└── ckpts/
    └── diflow-tts.ckpt
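
As a quick sanity check before running anything, the placement above can be verified with a few lines of Python. This is only a minimal sketch: the helper names `checkpoint_path` and `checkpoint_in_place` are hypothetical and not part of the DiFlow-TTS code, and it assumes the script is run from the directory shown as `root/`.

```python
from pathlib import Path

# Relative location taken from the directory layout above.
CKPT_RELATIVE = Path("ckpts") / "diflow-tts.ckpt"


def checkpoint_path(root="."):
    """Return the expected checkpoint path under the given root directory."""
    return Path(root) / CKPT_RELATIVE


def checkpoint_in_place(root="."):
    """Return True if a regular file exists at <root>/ckpts/diflow-tts.ckpt."""
    return checkpoint_path(root).is_file()


if __name__ == "__main__":
    # Assumption: the current working directory is the root of the layout.
    path = checkpoint_path(".")
    if checkpoint_in_place("."):
        print(f"Found checkpoint: {path}")
    else:
        print(f"Checkpoint missing, expected at: {path}")
```

The check only confirms that a file exists at the expected path; it does not validate the checkpoint contents.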