metadata

title: E2 TTS
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: cc-by-nc-4.0

E2 TTS Text-to-Speech

A non-autoregressive text-to-speech model: a flow-matching Transformer with U-Net-style skip connections, trained with masked audio infilling.

Features

Zero-shot voice cloning
Supported languages: English and Chinese
High-quality 24 kHz audio output

Usage

Upload a reference audio clip (3-10 seconds recommended)
Enter the transcript of the reference audio
Enter the text you want to synthesize
Select the language
Click "Synthesize"
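
Before uploading, it can help to check that the reference clip falls inside the recommended 3-10 second range. The following is a minimal sketch using only the Python standard library; it assumes the clip is an uncompressed PCM WAV file, and the function names `clip_duration_seconds` and `is_recommended_length` are hypothetical helpers, not part of this Space.

```python
import wave

# Recommended range taken from the usage steps above.
RECOMMENDED_MIN_S = 3.0
RECOMMENDED_MAX_S = 10.0


def clip_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_recommended_length(duration_s,
                          min_s=RECOMMENDED_MIN_S,
                          max_s=RECOMMENDED_MAX_S):
    """True if the duration lies within the recommended range (inclusive)."""
    return min_s <= duration_s <= max_s
```

For example, `is_recommended_length(clip_duration_seconds("reference.wav"))` tells you whether a local file named `reference.wav` (a placeholder name) is a reasonable candidate. Clips outside the range may still work; the range is only a recommendation.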

Model Information

Architecture: Non-Autoregressive, Masked, Flow Matching, U-Net Transformer
Sample Rate: 24000 Hz
Parameters: 335M
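
Because the output sample rate is fixed at 24000 Hz, any audio saved from the model should be written with that rate in the file header, or it will play back at the wrong speed. The sketch below shows one way to do this with the Python standard library. It assumes the audio is available as mono float samples in the range [-1.0, 1.0]; a generated sine tone stands in for real model output, and `write_wav` and `sine_tone` are hypothetical helper names.

```python
import math
import struct
import wave

SAMPLE_RATE = 24000  # Hz, from the model information above


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


def sine_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Stand-in for model output: a half-amplitude sine tone."""
    n = int(duration_s * sample_rate)
    return [0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]
```

One second of audio at this rate is 24000 samples, so `write_wav("tone.wav", sine_tone(440.0, 1.0))` produces a file of 24000 frames.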

Citation

@inproceedings{e2-tts,
  title={{E2 TTS}: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot {TTS}},
  author={Eskimez, Sefik Emre and Wang, Xiaofei and Thakker, Manthan and Li, Canrun and Tsai, Chung-Hsien and Xiao, Zhen and Yang, Hemin and Zhu, Zirun and Tang, Min and Tan, Xu and others},
  booktitle={2024 IEEE Spoken Language Technology Workshop (SLT)},
  pages={682--689},
  year={2024},
  organization={IEEE}
}
