---
# Generated at 2026-01-29T18:15:41Z from templates/weights/README.md.j2
license: cc-by-nc-4.0
language:
- eng
- zho
tags:
- tts
- text-to-speech
- speech-synthesis
- voice-cloning
library_name: ttsdb
pipeline_tag: text-to-speech
---

# E2 TTS

> **This is a mirror of the original weights for use with [TTSDB](https://github.com/ttsds/ttsdb).**
>
> Original weights: [https://huggingface.co/SWivid/E2-TTS](https://huggingface.co/SWivid/E2-TTS)
> Original code: [https://github.com/SWivid/F5-TTS](https://github.com/SWivid/F5-TTS)

A non-autoregressive masked U-Net transformer text-to-speech model.

## Original Work

This model was created by the original authors. Please cite their work if you use this model:

```bibtex
@inproceedings{e2-tts,
  title={{E2 TTS}: Embarrassingly easy fully non-autoregressive zero-shot tts},
  author={Eskimez, Sefik Emre and Wang, Xiaofei and Thakker, Manthan and Li, Canrun and Tsai, Chung-Hsien and Xiao, Zhen and Yang, Hemin and Zhu, Zirun and Tang, Min and Tan, Xu and others},
  booktitle={2024 IEEE Spoken Language Technology Workshop (SLT)},
  pages={682--689},
  year={2024},
  organization={IEEE}
}
```

**Papers:**

- https://ieeexplore.ieee.org/abstract/document/10832320

## Installation

```bash
pip install ttsdb-e2-tts
```

## Usage

```python
from ttsdb_e2_tts import E2TTS

# Load the model (downloads weights automatically)
model = E2TTS(model_id="ttsds/E2 TTS")

# Synthesize speech
audio, sample_rate = model.synthesize(
    text="Hello, this is a test of E2 TTS.",
    reference_audio="path/to/reference.wav",
    text_reference="Transcript of the reference audio.",
    language="en",
)

# Save the output
model.save_audio(audio, sample_rate, "output.wav")
```

## Model Details

| Property | Value |
|----------|-------|
| **Sample Rate** | 24000 Hz |
| **Parameters** | 335M |
| **Architecture** | Non-Autoregressive, Masked, Flow Matching, U-Net Transformer |
| **Languages** | English, Chinese |
| **Release Date** | 2024-10-30 |

### Training Data

- [Emilia Dataset](https://huggingface.co/datasets/amphion/Emilia-Dataset) (100,000 hours)

## License

- **Weights:** Creative Commons Attribution-NonCommercial 4.0
- **Code:** MIT License

Please refer to the original repositories for full license terms.

## Links

- **Original Code:** [https://github.com/SWivid/F5-TTS](https://github.com/SWivid/F5-TTS)
- **Original Weights:** [https://huggingface.co/SWivid/E2-TTS](https://huggingface.co/SWivid/E2-TTS)
- **TTSDB Package:** [ttsdb-e2-tts](https://pypi.org/project/ttsdb-e2-tts/)
- **TTSDB GitHub:** [https://github.com/ttsds/ttsdb](https://github.com/ttsds/ttsdb)
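
## Checking Audio Files

The Model Details table lists a 24000 Hz sample rate. As a minimal sketch, assuming the file written by `model.save_audio` is a standard PCM WAV (this README does not state the format), the helper below reads a WAV header with Python's standard `wave` module and reports the sample rate and duration. The names `wav_info` and `write_silence` are hypothetical helpers for illustration and are not part of `ttsdb-e2-tts`.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        num_frames = wav_file.getnframes()
    return sample_rate, num_frames / sample_rate


def write_silence(path, sample_rate=24000, seconds=1.0):
    """Write mono 16-bit silence so wav_info can be tried without the model."""
    num_frames = int(sample_rate * seconds)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * num_frames)
```

Calling `wav_info("output.wav")` after synthesis should report 24000 as the sample rate if the assumption about the output format holds. The same call on a reference recording gives its duration before it is passed as `reference_audio`.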