sbapan41 committed on
Commit a1c77c8 · verified · 1 Parent(s): 5038428

Update README.md

Files changed (1)
  1. README.md +46 -3
README.md CHANGED
@@ -1,3 +1,46 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ base_model:
+ - hexgrad/Kokoro-82M
+ pipeline_tag: text-to-speech
+ ---
+ **Qhash-TTS** is an open-weight TTS model with 84 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Qhash-TTS can be deployed anywhere from production environments to personal projects.
+
+ <audio controls><source src="https://huggingface.co/Quantamhash/Qhash-TTS/resolve/main/samples/HEARME.wav" type="audio/wav"></audio>
+
+
+ ### Releases
+
+ | Model | Published | Training Data | Langs & Voices | SHA256 |
+ | ----- | --------- | ------------- | -------------- | ------ |
+ | **v1.0** | **2025 Jan 27** | **Few hundred hrs** | [**8 & 54**](https://huggingface.co/Quantamhash/Qhash-TTS/blob/main/VOICES.md) | `496dba11` |
+ | [v0.19] | 2024 Dec 25 | <100 hrs | 1 & 10 | `3b0c392f` |
+
+ | Training Costs | v0.19 | v1.0 | **Total** |
+ | -------------- | ----- | ---- | ----- |
+ | in A100 80GB GPU hours | 500 | 500 | **1000** |
+ | average hourly rate | $0.80/h | $1.20/h | **$1/h** |
+ | in USD | $400 | $600 | **$1000** |
+
+ ### Usage
+ You can run this basic cell on [Google Colab](https://colab.research.google.com/). [Listen to samples](https://huggingface.co/Quantamhash/Qhash-TTS/blob/main/SAMPLES.md). For more languages and details, see [Advanced Usage](https://github.com/hexgrad/kokoro?tab=readme-ov-file#advanced-usage).
+ ```py
+ !pip install -q "kokoro>=0.9.2" soundfile
+ !apt-get -qq -y install espeak-ng > /dev/null 2>&1
+ from kokoro import KPipeline
+ from IPython.display import display, Audio
+ import soundfile as sf
+ import torch
+ pipeline = KPipeline(lang_code='a')
+ text = '''
+ Qhash is an open-weight TTS model with 84 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Qhash-TTS can be deployed anywhere from production environments to personal projects.
+ '''
+ generator = pipeline(text, voice='af_heart')
+ for i, (gs, ps, audio) in enumerate(generator):  # gs: graphemes, ps: phonemes
+     print(i, gs, ps)
+     display(Audio(data=audio, rate=24000, autoplay=i==0))
+     sf.write(f'{i}.wav', audio, 24000)
+ ```
+ Under the hood, `Qhash-TTS` uses [`misaki`](https://pypi.org/project/misaki/), a G2P library at https://github.com/hexgrad/misaki
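
Reviewer note, not part of the committed README: the usage cell writes one WAV file per generated segment (`0.wav`, `1.wav`, ...) at 24000 Hz. If a single output file is preferred, the segments could be joined before writing. The following is a minimal sketch using only the Python standard library; the helper name `write_joined_wav` is hypothetical, and it assumes each segment is an iterable of float samples in the range [-1.0, 1.0], which may not match the exact type the pipeline yields.

```python
import struct
import wave

SAMPLE_RATE = 24000  # same rate the usage cell passes to sf.write


def write_joined_wav(path, segments, sample_rate=SAMPLE_RATE):
    """Concatenate float segments into one 16-bit mono WAV file.

    `segments` is assumed to be an iterable of iterables of floats in
    [-1.0, 1.0]. Returns the total number of samples written.
    """
    total = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample (16-bit PCM)
        wav.setframerate(sample_rate)
        for segment in segments:
            # Clip out-of-range values, then scale to signed 16-bit integers.
            pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in segment]
            wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
            total += len(pcm)
    return total
```

In the usage cell, this might be called with the audio element of each yielded tuple, for example `write_joined_wav('joined.wav', (audio for _, _, audio in generator))`. Whether those audio objects need converting first (for instance to a plain list) is an assumption to check against the installed `kokoro` version.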