UniTTS: An end-to-end TTS system without decoupling of acoustic and semantic information
Paper: arXiv:2505.17426
Qwen2.5-7B-ExtVocab is an extended version of the Qwen2.5-7B model whose vocabulary adds audio tokens based on the DistilCodec codebook. It is designed specifically for training the UniTTS system. For technical details, see the UniTTS arXiv paper (https://arxiv.org/abs/2505.17426) and the accompanying code.
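The idea of extending a text vocabulary with codec tokens can be illustrated with a small, library-free sketch. It is not the UniTTS implementation: the token template `<|audio_N|>`, the helper names, and the toy sizes are all assumptions made for illustration, and the real token format and codebook size should be taken from the paper and code. The sketch appends one new token per codebook entry after the existing text tokens, so a codec index maps to a token ID by a fixed offset.

```python
# Illustrative sketch only: token template, helper names, and sizes are
# hypothetical and not taken from the UniTTS or DistilCodec code.

def extend_vocab(base_vocab, codebook_size, template="<|audio_{}|>"):
    """Return (new_vocab, offset) with one token appended per codebook entry.

    base_vocab maps token string -> integer ID. New IDs start right after
    the largest existing ID, so audio token k gets ID offset + k.
    """
    vocab = dict(base_vocab)  # copy so the caller's vocab is not mutated
    offset = max(vocab.values()) + 1 if vocab else 0
    for code in range(codebook_size):
        token = template.format(code)
        if token in vocab:
            raise ValueError(f"token already present: {token}")
        vocab[token] = offset + code
    return vocab, offset


def codes_to_ids(codes, offset, codebook_size):
    """Map codec codebook indices to token IDs in the extended vocabulary."""
    for code in codes:
        if not 0 <= code < codebook_size:
            raise ValueError(f"code out of range: {code}")
    return [offset + code for code in codes]


def ids_to_codes(ids, offset, codebook_size):
    """Map audio token IDs back to codec codebook indices."""
    codes = [token_id - offset for token_id in ids]
    for code in codes:
        if not 0 <= code < codebook_size:
            raise ValueError(f"not an audio token ID: {code + offset}")
    return codes


# Toy usage: a 3-token text vocabulary and a 4-entry codebook.
toy_vocab = {"hello": 0, "world": 1, "<|endoftext|>": 2}
extended, audio_offset = extend_vocab(toy_vocab, codebook_size=4)
audio_ids = codes_to_ids([0, 3, 1], audio_offset, codebook_size=4)
```

With Hugging Face Transformers, the analogous steps would typically be `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`; whether this checkpoint was produced that way is not stated here.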
@misc{wang2025unittsendtoendttsdecoupling,
title={UniTTS: An end-to-end TTS system without decoupling of acoustic and semantic information},
author={Rui Wang and Qianguo Sun and Tianrong Chen and Zhiyun Zeng and Junlong Wu and Jiaxing Zhang},
year={2025},
eprint={2505.17426},
archivePrefix={arXiv},
primaryClass={cs.SD},
url={https://arxiv.org/abs/2505.17426},
}
UniTTS: An end-to-end TTS system without decoupling of acoustic and semantic information © 2025 by Rui Wang, Qianguo Sun, Tianrong Chen, Zhiyun Zeng, Junlong Wu, Jiaxing Zhang is licensed under CC BY-NC-ND 4.0