---
title: XTTSv2 Optimized TTS
emoji: 🐸
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.5.0
app_file: app.py
pinned: false
license: other
tags:
- tts
- text-to-speech
- voice-cloning
- xtts
- coqui
suggested_hardware: t4-small
---

# 🐸 XTTSv2 Optimized Text-to-Speech

High-quality multilingual voice cloning powered by XTTSv2 with performance optimizations.

## Features

- **17 Languages**: English, Spanish, French, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Japanese, Hungarian, Korean, Hindi
- **Voice Cloning**: Clone any voice from ~6 seconds of reference audio
- **Streaming Mode**: Low-latency streaming for real-time applications
- **Optimizations**:
  - DeepSpeed acceleration
  - FP16 inference
  - torch.compile() optimization
  - Speaker embedding caching

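
To illustrate the speaker embedding caching idea, here is a minimal sketch of a least-recently-used cache. It is not taken from `app.py`: the names `SpeakerCache`, `audio_key`, and `get_or_compute` are hypothetical, and keying the cache on a hash of the reference audio bytes is an assumption about how repeat uploads could be recognized.

```python
import hashlib
from collections import OrderedDict
from typing import Any, Callable


def audio_key(audio_bytes: bytes) -> str:
    """Derive a cache key from the raw reference audio (assumed keying scheme)."""
    return hashlib.sha256(audio_bytes).hexdigest()


class SpeakerCache:
    """Hypothetical LRU cache for speaker embeddings."""

    def __init__(self, max_size: int = 10) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling compute() only on a miss."""
        if key in self._entries:
            # Mark the entry as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        value = compute()
        self._entries[key] = value
        if len(self._entries) > self.max_size:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return value

    def __contains__(self, key: str) -> bool:
        return key in self._entries

    def __len__(self) -> int:
        return len(self._entries)
```

In a setup like this, `compute` would wrap whatever call extracts the speaker embedding from the reference audio, so a repeated request with the same file skips that step.
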
## Usage

1. Upload a reference audio file (WAV/MP3, 6-30 seconds recommended)
2. Enter your text
3. Select the language
4. Click "Generate Speech"

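
Before uploading, it can help to check that a reference clip falls inside the recommended 6-30 second range. The sketch below does this for WAV files using only the Python standard library. The function names are hypothetical and not part of this Space, and MP3 files would need a third-party decoder that is not shown here.

```python
import wave

# Recommended bounds for the reference clip, taken from the usage steps above.
MIN_SECONDS = 6.0
MAX_SECONDS = 30.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_recommended_length(path: str) -> bool:
    """True if the clip duration is within the recommended range."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

For example, `is_recommended_length("reference.wav")` returns `False` for a 3-second clip, which is a hint to record a longer sample before generating speech.
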
## Performance

| Hardware | Latency (per sentence) |
|----------|------------------------|
| T4 | ~2-3 seconds |
| A10G | ~1 second |
| A100 | ~0.5 seconds |

## Configuration

Environment variables for tuning:

- `USE_DEEPSPEED`: Enable DeepSpeed (default: true)
- `USE_FP16`: Enable FP16 inference (default: true)
- `USE_TORCH_COMPILE`: Enable torch.compile (default: true)
- `MAX_CACHE_SIZE`: Number of speakers to cache (default: 10)
- `STREAMING_CHUNK_SIZE`: Streaming chunk size (default: 20)

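
As a rough illustration of how these variables could be read at startup, the sketch below parses them into a settings object with the defaults listed above. The `Settings` class and `load_settings` function are hypothetical, and the accepted boolean spellings (`1`, `true`, `yes`, `on`) are an assumption; the actual parsing in `app.py` may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumed spellings that count as "enabled"; anything else is treated as false.
_TRUTHY = {"1", "true", "yes", "on"}


def _env_bool(env: Mapping[str, str], name: str, default: bool) -> bool:
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


def _env_int(env: Mapping[str, str], name: str, default: int) -> int:
    raw = env.get(name)
    return default if raw is None else int(raw)


@dataclass(frozen=True)
class Settings:
    use_deepspeed: bool = True
    use_fp16: bool = True
    use_torch_compile: bool = True
    max_cache_size: int = 10
    streaming_chunk_size: int = 20


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, falling back to the defaults."""
    return Settings(
        use_deepspeed=_env_bool(env, "USE_DEEPSPEED", True),
        use_fp16=_env_bool(env, "USE_FP16", True),
        use_torch_compile=_env_bool(env, "USE_TORCH_COMPILE", True),
        max_cache_size=_env_int(env, "MAX_CACHE_SIZE", 10),
        streaming_chunk_size=_env_int(env, "STREAMING_CHUNK_SIZE", 20),
    )
```

Passing a plain dictionary, such as `load_settings({"USE_FP16": "false"})`, makes it easy to try a configuration without touching the real environment.
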
## License

This model uses the [Coqui Public Model License](https://coqui.ai/cpml).

## Credits

- [Coqui TTS](https://github.com/coqui-ai/TTS)
- [XTTS Paper](https://arxiv.org/abs/2406.04904)