---
license: apache-2.0
language:
- en
base_model:
- hexgrad/Kokoro-82M
pipeline_tag: text-to-speech
---
**Qhash-TTS** is an open-weight TTS model with 84 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Qhash-TTS can be deployed anywhere from production environments to personal projects.

<audio controls><source src="https://huggingface.co/Quantamhash/Qhash-TTS/resolve/main/samples/HEARME.wav" type="audio/wav"></audio>

### Releases

| Model | Published | Training Data | Langs & Voices | SHA256 |
| ----- | --------- | ------------- | -------------- | ------ |
| **v1.0** | **2025 Jan 27** | **Few hundred hrs** | [**8 & 54**](https://huggingface.co/Quantamhash/Qhash-TTS/blob/main/VOICES.md) | `496dba11` |
| [v0.19] | 2024 Dec 25 | <100 hrs | 1 & 10 | `3b0c392f` |

| Training Costs | v0.19 | v1.0 | **Total** |
| -------------- | ----- | ---- | --------- |
| in A100 80GB GPU hours | 500 | 500 | **1000** |
| average hourly rate | $0.80/h | $1.20/h | **$1/h** |
| in USD | $400 | $600 | **$1000** |
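The cost figures above are internally consistent: each version's spend is its GPU hours times its hourly rate, and the $1/h average is total spend divided by total hours (a weighted average of the two rates, which coincides with the plain average here because both runs used equal hours). A quick arithmetic check:

```python
# GPU hours and hourly rates for each training run, from the table above.
hours = {'v0.19': 500, 'v1.0': 500}
rates = {'v0.19': 0.80, 'v1.0': 1.20}

cost = {v: hours[v] * rates[v] for v in hours}  # per-version spend in USD
total_hours = sum(hours.values())               # 1000 A100 80GB GPU hours
total_cost = sum(cost.values())                 # $1000 total
avg_rate = total_cost / total_hours             # the table's $1/h average
print(cost, total_hours, total_cost, avg_rate)
```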

### Usage

You can run this basic cell on [Google Colab](https://colab.research.google.com/). [Listen to samples](https://huggingface.co/Quantamhash/Qhash-TTS/blob/main/SAMPLES.md). For more languages and details, see [Advanced Usage](https://github.com/hexgrad/kokoro?tab=readme-ov-file#advanced-usage).
```py
!pip install -q "kokoro>=0.9.2" soundfile
!apt-get -qq -y install espeak-ng > /dev/null 2>&1

from kokoro import KPipeline
from IPython.display import display, Audio
import soundfile as sf
import torch

pipeline = KPipeline(lang_code='a')  # 'a' selects American English
text = '''
Qhash is an open-weight TTS model with 84 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Qhash-TTS can be deployed anywhere from production environments to personal projects.
'''
generator = pipeline(text, voice='af_heart')
for i, (gs, ps, audio) in enumerate(generator):
    print(i, gs, ps)  # segment index, graphemes, phonemes
    display(Audio(data=audio, rate=24000, autoplay=i == 0))
    sf.write(f'{i}.wav', audio, 24000)  # save each segment as 24 kHz WAV
```
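The loop above writes one WAV file per generated segment. If a single output file is preferred, the segment arrays can be concatenated before writing. A minimal sketch of just that step, using silent dummy arrays in place of real model output so it runs without the model installed:

```python
import numpy as np

SAMPLE_RATE = 24000  # Qhash-TTS outputs 24 kHz audio

# Dummy stand-ins for the per-segment float arrays yielded by the pipeline.
segments = [
    np.zeros(SAMPLE_RATE, dtype=np.float32),       # 1.0 s segment
    np.zeros(SAMPLE_RATE // 2, dtype=np.float32),  # 0.5 s segment
]

# Join all segments into one array; in the real pipeline you would then
# call sf.write('combined.wav', combined, SAMPLE_RATE) once.
combined = np.concatenate(segments)
duration_s = combined.shape[0] / SAMPLE_RATE
print(f'{duration_s:.1f} s')
```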

Under the hood, `Qhash-TTS` uses [`misaki`](https://pypi.org/project/misaki/), a G2P (grapheme-to-phoneme) library hosted at https://github.com/hexgrad/misaki.
|