---
license: openrail
language:
- tr
- bg
- el
- fr
- ko
- ja
- ar
- de
- da
- cs
- es
- et
- fi
- hr
- hu
- id
- it
- lt
- lv
- nl
- pl
- pt
- ro
- ru
- sk
- sl
- sv
- uk
- vi
- na
tags:
- text
- speech
- tts
- neuvoice
- turkish
- turkey
---
# Neuvoice — Fast, On-Device Neural TTS by Lamapi
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/QRimUEnTq6ESADGnHCH87.wav"></audio>
**Neuvoice** is a lightweight, on-device text-to-speech library that runs entirely on ONNX Runtime, with no cloud calls and no network round-trips at inference time. Based on supertonic-3 and modified by [Lamapi](https://huggingface.co/thelamapi), it ships with a curated voice library and supports 31 languages out of the box.

---
## Audio Samples (Using 'Linda' Voice)
### Turkish
`<happy> Bugün kendimi mutlu hissediyorum!`
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/NYK8ukN9ywI7MAa86Chvp.wav"></audio>
### English
`<happy> I feel happy today!`
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/udPqsDRowN_nV42l37sM7.wav"></audio>
### French
`<happy> Je me sens heureux aujourd'hui!`
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/G71ItTvWBa5Ygu6lA9z72.wav"></audio>
### Japanese
`<happy> こんにちは!今日はいい天気ですね。`
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/dIj08g1OXaq-p1dkRrLvI.wav"></audio>
---
## Quick Start
```bash
pip install neuvoice
```
```python
from neuvoice import TTS
tts = TTS(auto_download=True)
style = tts.get_voice_style("Alina")
text = "Hello! Welcome to Neuvoice — fast, private, on-device speech synthesis."
wav, duration = tts.synthesize(text, voice_style=style, lang="en")
tts.save_audio(wav, "output.wav")
print(f"Generated {duration:.2f}s of audio")
```
On first run, model assets are downloaded and cached automatically under `~/.cache/neuvoice`.
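
To check what `save_audio` wrote, the standard-library `wave` module can read the file header. This is a minimal sketch that assumes `output.wav` is an integer PCM WAV file, which this README does not state; `wav_info` is a hypothetical helper and not part of Neuvoice.

```python
import wave

def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        frames = wf.getnframes()
    # Duration is the frame count divided by frames per second.
    return rate, channels, frames / rate
```

Calling `wav_info("output.wav")` after the Quick Start snippet should report a duration close to the value returned by `synthesize()`. If the file uses a float sample format, `wave` will refuse to open it and a different reader would be needed.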
## Highlights
- **31 supported languages.** Covers a wide range of scripts and regions, from European languages to Arabic, Hindi, Japanese, Korean, Vietnamese, and more.
- **Runs entirely on-device.** ONNX Runtime powers inference: no API keys and no network dependency after the initial model download. CPU is sufficient; GPU acceleration is supported when available.
- **Rich voice library.** Ships with voices including Alina, Cem, Cole, Giray, Leon, Lina, Linda, Mustafa, Sarp, Selin, Sema, and Soras. Each voice is a compact style embedding that can be loaded by name in a single call.
- **Inline expression tags.** Embed `<happy>`, `<laugh>`, `<breath>`, `<sad>`, and other tags directly in your text to shape the delivery without any extra parameters.
- **Long-form synthesis.** Inputs are automatically chunked, synthesized, and rejoined with configurable silence, so no manual splitting is required.
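
To make the long-form behaviour concrete, here is a rough sketch of sentence-based chunking and of how inter-chunk silence adds to the total length. Neuvoice's actual chunking rules are not documented here, so the splitting strategy and both function names (`split_into_chunks`, `joined_duration`) are assumptions for illustration only.

```python
import re

def split_into_chunks(text, max_chunk_length=300):
    """Greedily pack whole sentences into chunks of at most max_chunk_length chars."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?。!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chunk_length:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > max_chunk_length:
            chunks.append(sentence[:max_chunk_length])
            sentence = sentence[max_chunk_length:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks

def joined_duration(chunk_durations, silence_duration=0.3):
    """Total seconds when chunks are joined with silence between neighbours."""
    gaps = max(len(chunk_durations) - 1, 0)
    return sum(chunk_durations) + gaps * silence_duration
```

With the documented defaults, three chunks add two gaps of 0.3 seconds each to the summed chunk durations.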
## Supported Languages
| Code | Language | Code | Language | Code | Language | Code | Language |
|------|----------|------|----------|------|----------|------|----------|
| `en` | English | `ko` | Korean | `ja` | Japanese | `ar` | Arabic |
| `bg` | Bulgarian | `cs` | Czech | `da` | Danish | `de` | German |
| `el` | Greek | `es` | Spanish | `et` | Estonian | `fi` | Finnish |
| `fr` | French | `hi` | Hindi | `hr` | Croatian | `hu` | Hungarian |
| `id` | Indonesian | `it` | Italian | `lt` | Lithuanian | `lv` | Latvian |
| `nl` | Dutch | `pl` | Polish | `pt` | Portuguese | `ro` | Romanian |
| `ru` | Russian | `sk` | Slovak | `sl` | Slovenian | `sv` | Swedish |
| `tr` | Turkish | `uk` | Ukrainian | `vi` | Vietnamese | `na` | (fallback) |
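
When language codes come from user locales such as `pt-BR`, a small lookup can map them onto the table above and fall back to `na` otherwise. This is a sketch: `SUPPORTED_LANGS` is copied from the table, and `resolve_lang` is a hypothetical helper rather than a Neuvoice function.

```python
# The 31 language codes from the table above ("na" is the fallback, not a language).
SUPPORTED_LANGS = frozenset({
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de", "el", "es", "et",
    "fi", "fr", "hi", "hr", "hu", "id", "it", "lt", "lv", "nl", "pl",
    "pt", "ro", "ru", "sk", "sl", "sv", "tr", "uk", "vi",
})

def resolve_lang(tag):
    """Map a locale tag such as 'pt-BR' or 'tr_TR' to a supported code, else 'na'."""
    primary = tag.strip().lower().replace("_", "-").split("-")[0]
    return primary if primary in SUPPORTED_LANGS else "na"
```

The result can be passed as the `lang` argument of `synthesize()`.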
## Available Voices
```python
from neuvoice import TTS

tts = TTS()
print(tts.voice_style_names)
# ['Alina', 'Cem', 'Cole', 'Giray', 'Leon', 'Lina', 'Linda',
# 'Mustafa', 'Sarp', 'Selin', 'Sema', 'Soras']
```
Load any voice by name:
```python
style = tts.get_voice_style("Selin")
```
Or from a custom style file:
```python
style = tts.get_voice_style_from_path("/path/to/my_voice.json")
```
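This README does not say whether `get_voice_style()` is case-sensitive or how it reacts to an unknown name. If voice names come from user input, one option is to normalise them against `tts.voice_style_names` first. The helper below is a hypothetical sketch, not part of the library.

```python
def pick_voice(requested, available, default="Alina"):
    """Return the canonical voice name matching `requested`, ignoring case."""
    by_lower = {name.lower(): name for name in available}
    match = by_lower.get(requested.strip().lower())
    if match is not None:
        return match
    # Fall back to the default only if it really exists in this install.
    if default not in available:
        raise ValueError(f"default voice {default!r} is not available")
    return default
```

A call such as `tts.get_voice_style(pick_voice("selin", tts.voice_style_names))` would then always receive a name from the shipped list.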
## `synthesize()` Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `text` | `str` | — | Input text. Supports inline tags like `<happy>`. |
| `voice_style` | `VoiceStyle` | — | Voice loaded via `get_voice_style()`. |
| `total_steps` | `int` | `5` | Flow-matching denoising steps. Higher = better quality, slower inference. Range: 1–100. |
| `speed` | `float` | `1.05` | Playback speed multiplier. Range: 0.7–2.0. |
| `lang` | `str` | `"en"` | ISO 639-1 language code, or `"na"` for unknown languages. |
| `max_chunk_length` | `int` | `300` | Max characters per synthesis chunk (120 for Korean). |
| `silence_duration` | `float` | `0.3` | Seconds of silence inserted between chunks. |
| `verbose` | `bool` | `False` | Print per-chunk progress to stdout. |
**Returns:** `(waveform, duration)` — waveform as a `(1, samples)` NumPy array, duration in seconds.
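
The table gives ranges for `total_steps` and `speed` but does not say whether out-of-range values are clamped or rejected. A caller who wants to fail early could check them up front. This is a sketch with a hypothetical helper name, using only the ranges and defaults documented above.

```python
def check_synthesis_params(total_steps=5, speed=1.05):
    """Raise ValueError if a value falls outside the documented range."""
    if not 1 <= total_steps <= 100:
        raise ValueError(f"total_steps must be in 1..100, got {total_steps}")
    if not 0.7 <= speed <= 2.0:
        raise ValueError(f"speed must be in 0.7..2.0, got {speed}")
    # Return keyword arguments ready to unpack into synthesize().
    return {"total_steps": int(total_steps), "speed": float(speed)}
```

The returned dictionary can be unpacked directly, for example `tts.synthesize(text, voice_style=style, **check_synthesis_params(30, 0.9))`.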
## Examples
**Multilingual synthesis:**
```python
from neuvoice import TTS

tts = TTS()
style = tts.get_voice_style("Leon")
# (text, language code) pairs, all spoken with the same voice
pairs = [
    ("Merhaba! Bugün hava çok güzel.", "tr"),
    ("Bonjour! Il fait beau aujourd'hui.", "fr"),
    ("こんにちは!今日はいい天気ですね。", "ja"),
]
for text, lang in pairs:
    wav, dur = tts.synthesize(text, voice_style=style, lang=lang)
    tts.save_audio(wav, f"output_{lang}.wav")
```
**Expression tags:**
```python
style = tts.get_voice_style("Lina")
text = "Good news! <happy> We just shipped the feature. <laugh> Don't tell anyone yet."
wav, dur = tts.synthesize(text, voice_style=style, lang="en")
```
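When the same text is also shown as a caption or written to a log, the tags usually need to be removed. The sketch below assumes every tag is a lowercase word in angle brackets, which matches the four tags named in this README but may not cover the full set; `split_tags` is a hypothetical helper.

```python
import re

# Assumed tag shape: <happy>, <laugh>, <breath>, <sad>, ...
TAG_PATTERN = re.compile(r"<([a-z]+)>")

def split_tags(text):
    """Return (plain_text, tags) for text containing inline tags like <happy>."""
    tags = TAG_PATTERN.findall(text)
    plain = TAG_PATTERN.sub("", text)
    # Removing a tag leaves a double space behind; collapse it.
    plain = re.sub(r"\s{2,}", " ", plain).strip()
    return plain, tags
```

The tagged string still goes to `synthesize()` unchanged; only the display copy is stripped.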
**Higher quality with more steps:**
```python
wav, dur = tts.synthesize(
    text="A slow, deliberate reading for maximum clarity.",
    voice_style=style,
    total_steps=30,
    speed=0.9,
    lang="en",
)
```
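To choose a `total_steps` value for a given machine, it helps to measure the real-time factor: synthesis time divided by audio duration. The helper below is a hypothetical sketch. It accepts any callable that takes `total_steps` and returns `(waveform, duration)`, so it does not depend on Neuvoice internals.

```python
import time

def real_time_factors(synthesize, step_counts=(5, 15, 30)):
    """Map each step count to elapsed_seconds / audio_seconds (lower is faster)."""
    results = {}
    for steps in step_counts:
        start = time.perf_counter()
        _wav, duration = synthesize(total_steps=steps)
        elapsed = time.perf_counter() - start
        results[steps] = elapsed / duration
    return results
```

One way to use it is `real_time_factors(functools.partial(tts.synthesize, text, voice_style=style, lang="en"))`. Values below 1.0 mean audio is produced faster than it plays back.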
## Configuration
The cache location, model source, and ONNX thread counts can be controlled via environment variables:
| Variable | Description |
|----------|-------------|
| `NEUVOICE_CACHE_DIR` | Override the default cache directory (`~/.cache/neuvoice`). |
| `NEUVOICE_MODEL_REPO` | Override the Hugging Face model repository. |
| `NEUVOICE_REVISION` | Model revision/branch to use (default: `main`). |
| `NEUVOICE_INTRA_THREADS` | ONNX intra-op thread count (default: auto). |
| `NEUVOICE_INTER_THREADS` | ONNX inter-op thread count (default: auto). |
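
These variables can also be set from Python. The sketch below assumes Neuvoice reads them when the package is imported or when `TTS` is constructed, which this README does not specify, so the helper should be called before either happens. `configure_neuvoice` is a hypothetical name.

```python
import os

def configure_neuvoice(cache_dir=None, intra_threads=None, inter_threads=None):
    """Set Neuvoice environment variables for the current process."""
    settings = {
        "NEUVOICE_CACHE_DIR": cache_dir,
        "NEUVOICE_INTRA_THREADS": intra_threads,
        "NEUVOICE_INTER_THREADS": inter_threads,
    }
    for name, value in settings.items():
        if value is not None:
            os.environ[name] = str(value)  # environment values must be strings
```

For example, `configure_neuvoice(cache_dir="/data/neuvoice", intra_threads=4)` followed by `from neuvoice import TTS` would apply the overrides under that assumption.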
## License
Neuvoice is released under the MIT License. See [LICENSE](LICENSE) for details.
The bundled ONNX model is released under the OpenRAIL-M License.
Copyright © 2026 Lamapi