---
# Generated at 2026-01-29T20:46:31Z from templates/space/README.md.j2
title: TorToise
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: other
preload_from_hub:
- ttsds/tortoise
---
# TorToise Text-to-Speech
TorToise is a text-to-speech model with zero-shot voice cloning.
## Features
- Zero-shot voice cloning
- Supported language: English
- High-quality 24kHz audio output
## Usage
1. Upload a reference audio clip (3-10 seconds recommended)
2. Enter the transcript of the reference audio
3. Enter the text you want to synthesize
4. Select the language (currently English only)
5. Click "Synthesize"
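Before uploading, it can help to confirm that the reference clip falls inside the recommended 3-10 second range from step 1. The following is a minimal sketch, assuming the clip is an uncompressed PCM WAV file; the helper names `clip_duration_seconds` and `is_recommended_length` are hypothetical and not part of this Space or of the TorToise code.

```python
import wave

# Bounds taken from the recommendation in step 1 of the usage list.
MIN_SECONDS = 3.0
MAX_SECONDS = 10.0


def clip_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_recommended_length(seconds: float) -> bool:
    """True if the duration lies within the recommended 3-10 second range."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS
```

A call such as `is_recommended_length(clip_duration_seconds("reference.wav"))` returns `False` for clips that are too short or too long. The range is a recommendation, so a clip outside it may still be accepted.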
## Model Information
- **Architecture**: Autoregressive, Diffusion, Language Modeling
- **Sample Rate**: 24000 Hz
- **Parameters**: 960M
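
As a rough guide to what these figures imply, the sketch below estimates the memory needed for the weights alone and the number of samples in an output clip. It assumes 4 bytes per parameter for 32-bit floats and 2 bytes for 16-bit floats. The precision actually used by this Space is not stated here, and activations and runtime overhead are not counted. The function names are hypothetical.

```python
PARAMETERS = 960_000_000   # 960M, from the list above
SAMPLE_RATE_HZ = 24_000    # 24000 Hz, from the list above


def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Memory for the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def samples_for(seconds: float, sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """Number of audio samples in a mono clip of the given length."""
    return round(seconds * sample_rate)


if __name__ == "__main__":
    print(f"float32 weights: {weight_memory_gb(PARAMETERS, 4):.2f} GB")
    print(f"float16 weights: {weight_memory_gb(PARAMETERS, 2):.2f} GB")
    print(f"samples in a 10 s clip: {samples_for(10)}")
```

Under these assumptions the weights take about 3.84 GB in 32-bit floats or 1.92 GB in 16-bit floats, and a 10 second clip contains 240,000 samples.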
## Citation
```bibtex
@misc{betker2023betterspeechsynthesisscaling,
title={Better speech synthesis through scaling},
author={James Betker},
year={2023},
eprint={2305.07243},
archivePrefix={arXiv},
primaryClass={cs.SD},
url={https://arxiv.org/abs/2305.07243},
}
```
## Links
- [Model Weights](https://huggingface.co/ttsds/tortoise)
- [Code Repository](https://github.com/neonbjb/tortoise-tts.git)
- [Paper](https://arxiv.org/abs/2305.07243)