---
license: apache-2.0
language:
- en
---
## IndexTTS2

<center><h3>IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech</h3></center>

<div align="center">
<a href='https://arxiv.org/abs/2506.21619'>
<img src='https://img.shields.io/badge/ArXiv-2506.21619-red?logo=arxiv'/>
</a>
<br/>
<a href='https://github.com/index-tts/index-tts'>
<img src='https://img.shields.io/badge/GitHub-Code-orange?logo=github'/>
</a>
<a href='https://index-tts.github.io/index-tts2.github.io/'>
<img src='https://img.shields.io/badge/GitHub-Demo-orange?logo=github'/>
</a>
<br/>
<!--a href='https://huggingface.co/spaces/IndexTeam/IndexTTS'>
<img src='https://img.shields.io/badge/HuggingFace-Demo-blue?logo=huggingface'/>
</a-->
<a href='https://huggingface.co/IndexTeam/IndexTTS-2'>
<img src='https://img.shields.io/badge/HuggingFace-Model-blue?logo=huggingface' />
</a>
<br/>
<!--a href='https://modelscope.cn/studios/IndexTeam/IndexTTS-Demo'>
<img src='https://img.shields.io/badge/ModelScope-Demo-purple?logo=modelscope'/>
</a-->
<a href='https://modelscope.cn/models/IndexTeam/IndexTTS-2'>
<img src='https://img.shields.io/badge/ModelScope-Model-purple?logo=modelscope'/>
</a>
</div>

## Acknowledgements
1. [tortoise-tts](https://github.com/neonbjb/tortoise-tts)
2. [XTTSv2](https://github.com/coqui-ai/TTS)
3. [BigVGAN](https://github.com/NVIDIA/BigVGAN)
4. [wenet](https://github.com/wenet-e2e/wenet/tree/main)
5. [icefall](https://github.com/k2-fsa/icefall)
6. [maskgct](https://github.com/open-mmlab/Amphion/tree/main/models/tts/maskgct)
7. [seed-vc](https://github.com/Plachtaa/seed-vc)

## Citation

If you find our work helpful, please leave us a star and cite our paper.

IndexTTS2
```
@article{zhou2025indextts2,
  title={IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech},
  author={Siyi Zhou and Yiquan Zhou and Yi He and Xun Zhou and Jinchao Wang and Wei Deng and Jingchen Shu},
  journal={arXiv preprint arXiv:2506.21619},
  year={2025}
}
```

IndexTTS
```
@article{deng2025indextts,
  title={IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System},
  author={Wei Deng and Siyi Zhou and Jingchen Shu and Jinchao Wang and Lu Wang},
  journal={arXiv preprint arXiv:2502.05512},
  year={2025},
  doi={10.48550/arXiv.2502.05512},
  url={https://arxiv.org/abs/2502.05512}
}
```