File size: 2,621 Bytes
a1c77c8
 
 
 
 
 
 
 
28dcaf3
 
 
 
 
 
 
a1c77c8
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
---
license: apache-2.0
language:
- en
base_model:
- hexgrad/Kokoro-82M
pipeline_tag: text-to-speech
---
<div align="center">
<img src="https://huggingface.co/datasets/Quantamhash/Assets/resolve/main/images/dark_logo.png"
     alt="Title card" 
     style="width: 500px;
            height: auto;
            object-position: center top;">
</div>
**Qhash-TTS** is an open-weight TTS model with 84 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Qhash-TTS can be deployed anywhere from production environments to personal projects.

<audio controls><source src="https://huggingface.co/Quantamhash/Qhash-TTS/resolve/main/samples/HEARME.wav" type="audio/wav"></audio>


### Releases

| Model | Published | Training Data | Langs & Voices | SHA256 |
| ----- | --------- | ------------- | -------------- | ------ |
| **v1.0** | **2025 Jan 27** | **Few hundred hrs** | [**8 & 54**](https://huggingface.co/Quantamhash/Qhash-TTS/blob/main/VOICES.md) | `496dba11` |
| [v0.19] | 2024 Dec 25 | <100 hrs | 1 & 10 | `3b0c392f` |

| Training Costs | v0.19 | v1.0 | **Total** |
| -------------- | ----- | ---- | ----- |
| in A100 80GB GPU hours | 500 | 500 | **1000** |
| average hourly rate | $0.80/h | $1.20/h | **$1/h** |
| in USD | $400 | $600 | **$1000** |

### Usage
You can run this basic cell on [Google Colab](https://colab.research.google.com/). [Listen to samples](https://huggingface.co/Quantamhash/Qhash-TTS/blob/main/SAMPLES.md). For more languages and details, see [Advanced Usage](https://github.com/hexgrad/kokoro?tab=readme-ov-file#advanced-usage).
```py
!pip install -q kokoro>=0.9.2 soundfile
!apt-get -qq -y install espeak-ng > /dev/null 2>&1
from kokoro import KPipeline
from IPython.display import display, Audio
import soundfile as sf
import torch
pipeline = KPipeline(lang_code='a')
text = '''
 Qhash is an open-weight TTS model with 84 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Qhash-TTS can be deployed anywhere from production environments to personal projects.
'''
generator = pipeline(text, voice='af_heart')
for i, (gs, ps, audio) in enumerate(generator):
    print(i, gs, ps)
    display(Audio(data=audio, rate=24000, autoplay=i==0))
    sf.write(f'{i}.wav', audio, 24000)
```
Under the hood, `Qhash-TTS` uses [`misaki`](https://pypi.org/project/misaki/), a G2P library at https://github.com/hexgrad/misaki