metadata
license: other
language:
- eng
tags:
- tts
- text-to-speech
- speech-synthesis
- voice-cloning
library_name: ttsdb
pipeline_tag: text-to-speech
base_model:
- jbetker/tortoise-tts-v2
TorToise
This is a mirror of the original weights for use with TTSDB.
- Original weights: https://huggingface.co/jbetker/tortoise-tts-v2
- Original code: https://github.com/neonbjb/tortoise-tts.git

TorToise is a text-to-speech model with voice-cloning support.
Original Work
This model was created by James Betker. Please cite the original work if you use this model:
@misc{betker2023betterspeechsynthesisscaling,
  title={Better speech synthesis through scaling},
  author={James Betker},
  year={2023},
  eprint={2305.07243},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2305.07243},
}
Papers:
- Better speech synthesis through scaling: https://arxiv.org/abs/2305.07243
Installation
pip install ttsdb-tortoise
Usage
from ttsdb_tortoise import TorToise
# Load the model (downloads weights automatically)
model = TorToise(model_id="ttsds/TorToise")
# Synthesize speech
audio, sample_rate = model.synthesize(
    text="Hello, this is a test of TorToise.",
    reference_audio="path/to/reference.wav",
    text_reference="Transcript of the reference audio.",
    language="en",
)
# Save the output
model.save_audio(audio, sample_rate, "output.wav")
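Before synthesis, it can help to confirm that the reference clip is a readable WAV file and to check its length. The following is a minimal sketch that uses only the Python standard library. `describe_wav` is a hypothetical helper, not part of `ttsdb_tortoise`, and it assumes the reference is an uncompressed PCM WAV file.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file.

    Hypothetical helper for illustration; not part of ttsdb_tortoise.
    """
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        frames = wav_file.getnframes()
    # Duration is the number of frames divided by frames per second.
    return sample_rate, channels, frames / sample_rate
```

For example, `describe_wav("path/to/reference.wav")` could be called before passing the same path as `reference_audio`.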
Model Details
| Property | Value |
|---|---|
| Sample Rate | 24000 Hz |
| Parameters | 960M |
| Architecture | Autoregressive, Diffusion, Language Modeling |
| Languages | English |
| Release Date | 2022-05-17 |
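The sample rate in the table determines how sample counts map to time. The sketch below illustrates the arithmetic, assuming the synthesized audio is a one-dimensional sequence of samples at 24000 Hz. The return type of `synthesize` is not documented here, so that is an assumption, and both function names are hypothetical.

```python
SAMPLE_RATE_HZ = 24000  # value taken from the Model Details table


def duration_seconds(num_samples, sample_rate=SAMPLE_RATE_HZ):
    """Convert a sample count to a duration in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def samples_for(seconds, sample_rate=SAMPLE_RATE_HZ):
    """Convert a duration in seconds to the nearest whole sample count."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return round(seconds * sample_rate)
```

If `audio` supports `len()`, `duration_seconds(len(audio), sample_rate)` would give the length of the synthesized clip.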
Training Data
License
- Weights: Other (see original repository)
- Code: Apache License 2.0
Please refer to the original repositories for full license terms.
Links
- Original Code: https://github.com/neonbjb/tortoise-tts.git
- Original Weights: https://huggingface.co/jbetker/tortoise-tts-v2
- TTSDB Package: ttsdb-tortoise
- TTSDB GitHub: https://github.com/ttsds/ttsdb