metadata
title: TorToise
emoji: 🎤
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: other
preload_from_hub:
  - ttsds/tortoise

TorToise Text-to-Speech

A demo Space for Tortoise TTS, a zero-shot voice cloning text-to-speech model.

Features

  • Zero-shot voice cloning
  • Language support: English
  • High-quality 24 kHz audio output

Usage

  1. Upload a reference audio clip (3-10 seconds recommended)
  2. Enter the transcript of the reference audio
  3. Enter the text you want to synthesize
  4. Select the language (English is the only supported option)
  5. Click "Synthesize"
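
Before uploading, it can help to confirm that the reference clip falls inside the recommended 3-10 second window from step 1. The following is a minimal sketch using only the Python standard library. It assumes the clip is an uncompressed PCM WAV file; the helper names (`clip_duration_seconds`, `is_recommended_length`, `make_test_tone`) are hypothetical and not part of this Space.

```python
import io
import math
import struct
import wave

# Recommended reference-clip length from the usage steps (seconds).
MIN_SECONDS = 3.0
MAX_SECONDS = 10.0


def clip_duration_seconds(wav_file) -> float:
    """Return the duration of a PCM WAV file (path or file-like object)."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_recommended_length(wav_file) -> bool:
    """True if the clip falls inside the recommended 3-10 second window."""
    return MIN_SECONDS <= clip_duration_seconds(wav_file) <= MAX_SECONDS


def make_test_tone(seconds: float, sample_rate: int = 24000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV holding a 440 Hz sine tone.

    Hypothetical helper, used here only to exercise the checks above
    without needing a real recording on disk.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        frames = b"".join(
            struct.pack("<h", int(12000 * math.sin(2 * math.pi * 440 * i / sample_rate)))
            for i in range(int(seconds * sample_rate))
        )
        wav.writeframes(frames)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for seconds in (1.0, 5.0, 12.0):
        ok = is_recommended_length(make_test_tone(seconds))
        print(f"{seconds:>4} s clip within recommended range: {ok}")
```

For a real recording, pass the file path instead of the in-memory tone, for example `is_recommended_length("reference.wav")`. Compressed formats such as MP3 would need to be converted to WAV first, since the `wave` module reads only uncompressed PCM.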

Model Information

  • Architecture: Autoregressive, Diffusion, Language Modeling
  • Sample Rate: 24000 Hz
  • Parameters: 960M
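
As a rough illustration of what the 24000 Hz sample rate means in practice, the sketch below converts a clip duration into a sample count and an approximate raw audio size. The 16-bit mono PCM layout is an assumption made for the arithmetic, not a documented property of the model's output, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 24000   # sample rate listed under Model Information
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM
CHANNELS = 1             # assumption: mono output


def samples_for(seconds: float, sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """Number of audio samples per channel in a clip of the given length."""
    return round(seconds * sample_rate)


def raw_pcm_bytes(seconds: float) -> int:
    """Approximate size of the raw audio data, excluding any file header."""
    return samples_for(seconds) * BYTES_PER_SAMPLE * CHANNELS


if __name__ == "__main__":
    for seconds in (3, 10):
        print(seconds, "s:", samples_for(seconds), "samples,", raw_pcm_bytes(seconds), "bytes")
```

Under these assumptions, a 10-second clip holds 240000 samples and about 480 kB of raw audio data.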

Citation

@misc{betker2023betterspeechsynthesisscaling,
  title={Better speech synthesis through scaling},
  author={James Betker},
  year={2023},
  eprint={2305.07243},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2305.07243},
}

Links