---
# Generated at 2026-01-29T20:36:13Z from templates/weights/README.md.j2
license: other
language:
  - eng
tags:
  - tts
  - text-to-speech
  - speech-synthesis
  - voice-cloning
library_name: ttsdb
pipeline_tag: text-to-speech
base_model:
  - jbetker/tortoise-tts-v2

---

# TorToise

> **This is a mirror of the original weights for use with [TTSDB](https://github.com/ttsds/ttsdb).**
> 
> Original weights: [https://huggingface.co/jbetker/tortoise-tts-v2](https://huggingface.co/jbetker/tortoise-tts-v2)
> Original code: [https://github.com/neonbjb/tortoise-tts.git](https://github.com/neonbjb/tortoise-tts.git)


TorToise is an English text-to-speech model with voice cloning: it synthesizes speech in the voice of a reference recording.



## Original Work

This model was created by James Betker. Please cite the original work if you use this model:


```bibtex
@misc{betker2023betterspeechsynthesisscaling,
  title={Better speech synthesis through scaling},
  author={James Betker},
  year={2023},
  eprint={2305.07243},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2305.07243},
}
```



**Papers:**

- https://arxiv.org/abs/2305.07243



## Installation

```bash
pip install ttsdb-tortoise
```

## Usage

```python
from ttsdb_tortoise import TorToise

# Load the model (downloads weights automatically)
model = TorToise(model_id="ttsds/TorToise")

# Synthesize speech
audio, sample_rate = model.synthesize(
    text="Hello, this is a test of TorToise.",
    reference_audio="path/to/reference.wav",
    text_reference="Transcript of the reference audio.",
    language="en",
)

# Save the output
model.save_audio(audio, sample_rate, "output.wav")
```
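
Before passing a file as `reference_audio`, it can help to check what it contains. The following is a minimal sketch using only the Python standard library. `WavInfo` and `describe_wav` are hypothetical helpers, not part of `ttsdb_tortoise`, and the sketch assumes the reference is an uncompressed PCM WAV file, since that is the only kind the `wave` module reads. Which formats the model itself accepts is not stated here.

```python
import wave
from dataclasses import dataclass


@dataclass
class WavInfo:
    """Basic header fields of a WAV file (hypothetical helper type)."""
    sample_rate: int   # frames per second, in Hz
    channels: int      # 1 = mono, 2 = stereo
    duration_s: float  # clip length in seconds


def describe_wav(path: str) -> WavInfo:
    """Read header fields from an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return WavInfo(
            sample_rate=rate,
            channels=wav.getnchannels(),
            duration_s=wav.getnframes() / rate,
        )
```

Calling `describe_wav("path/to/reference.wav")` returns the sample rate, channel count, and duration of the reference clip, which is enough to spot an empty or unexpectedly short recording before synthesis.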

## Model Details

| Property | Value |
|----------|-------|
| **Sample Rate** | 24000 Hz |
| **Parameters** | 960M |
| **Architecture** | Autoregressive, Diffusion, Language Modeling |
| **Languages** | English |
| **Release Date** | 2022-05-17 |
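
The sample rate in the table fixes how sample counts map to clip length: at 24000 Hz, one second of mono audio is 24000 samples. The sketch below illustrates that arithmetic and writes a 24 kHz, 16-bit mono WAV with the standard library. It does not use the `ttsdb_tortoise` API. `duration_seconds` and `write_pcm16_wav` are hypothetical helpers, and the assumption that samples are floats in [-1.0, 1.0] is ours, not something this model card specifies.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # Hz, taken from the Model Details table


def duration_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Length of a mono clip in seconds."""
    return num_samples / sample_rate


def write_pcm16_wav(path: str, samples: list[float], sample_rate: int = SAMPLE_RATE) -> None:
    """Write mono float samples (assumed to lie in [-1.0, 1.0]) as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Half a second of a 440 Hz tone at the model's sample rate.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE // 2)]
```

Here `duration_seconds(len(tone))` evaluates to `0.5`, because 12000 samples at 24000 Hz is half a second.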


### Training Data

- [LibriTTS](https://www.openslr.org/60/)
- HifiTTS

## License

- **Weights:** Other (see original repository)
- **Code:** Apache License 2.0

Please refer to the original repositories for full license terms.

## Links

- **Original Code:** [https://github.com/neonbjb/tortoise-tts.git](https://github.com/neonbjb/tortoise-tts.git)
- **Original Weights:** [https://huggingface.co/jbetker/tortoise-tts-v2](https://huggingface.co/jbetker/tortoise-tts-v2)
- **TTSDB Package:** [ttsdb-tortoise](https://pypi.org/project/ttsdb-tortoise/)
- **TTSDB GitHub:** [https://github.com/ttsds/ttsdb](https://github.com/ttsds/ttsdb)