---
license: openrail
language:
- tr
- bg
- el
- fr
- ko
- ja
- ar
- de
- da
- cs
- es
- et
- fi
- hr
- hu
- id
- it
- lt
- lv
- nl
- pl
- pt
- ro
- ru
- sk
- sl
- sv
- uk
- vi
- en
- hi
- na
tags:
- text
- speech
- tts
- neuvoice
- turkish
- turkey
---
# Neuvoice — Fast, On-Device Neural TTS by Lamapi
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/QRimUEnTq6ESADGnHCH87.wav"></audio>
**Neuvoice** is a lightweight, on-device text-to-speech library that runs entirely on ONNX Runtime, with no cloud calls and no latency surprises. It is a version of supertonic-3 modified by [Lamapi](https://huggingface.co/thelamapi), and it ships with a curated voice library and supports 31 languages out of the box.


![IMAGE](https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/kyNY0a11CFQ0VWAGRevpP.png)

---

## Examples (Using 'Linda' Voice)

### Turkish
`<happy> Bugün kendimi mutlu hissediyorum!`
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/NYK8ukN9ywI7MAa86Chvp.wav"></audio>

### English
`<happy> I feel happy today!`
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/udPqsDRowN_nV42l37sM7.wav"></audio>

### French
`<happy> Je me sens heureux aujourd'hui!`
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/G71ItTvWBa5Ygu6lA9z72.wav"></audio>

### Japanese
`<happy> こんにちは!今日はいい天気ですね。`
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/dIj08g1OXaq-p1dkRrLvI.wav"></audio>

---

## Quick Start

```bash
pip install neuvoice
```

```python
from neuvoice import TTS

tts = TTS(auto_download=True)
style = tts.get_voice_style("Alina")

text = "Hello! Welcome to Neuvoice — fast, private, on-device speech synthesis."
wav, duration = tts.synthesize(text, voice_style=style, lang="en")

tts.save_audio(wav, "output.wav")
print(f"Generated {duration:.2f}s of audio")
```

On first run, model assets are downloaded and cached automatically under `~/.cache/neuvoice`.

## Highlights

**31 supported languages.** Covers a wide range of scripts and regions, from European languages to Arabic, Hindi, Japanese, Korean, Vietnamese, and more.

**Runs entirely on-device.** ONNX Runtime powers inference — no API keys, no network dependency after the initial model download. CPU is sufficient; GPU acceleration is supported when available.

**Rich voice library.** Ships with voices including Alina, Cem, Cole, Giray, Leon, Lina, Linda, Mustafa, Sarp, Selin, Sema, and Soras. Each voice is a compact style embedding — load any of them by name in a single call.

**Inline expression tags.** Embed `<happy>`, `<laugh>`, `<breath>`, `<sad>`, and other tags directly in your text to shape the delivery without any extra parameters.

**Long-form synthesis.** Inputs are automatically chunked, synthesized, and rejoined with configurable silence — no manual splitting required.
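
The chunking idea behind long-form synthesis can be pictured with a small sketch. This is not Neuvoice's actual implementation: `split_into_chunks` is a hypothetical helper that assumes sentences end in `.`, `!`, or `?` and packs them greedily up to a character limit, in the spirit of the `max_chunk_length` parameter.

```python
import re


def split_into_chunks(text: str, max_chunk_length: int = 300) -> list[str]:
    """Greedily pack sentences into chunks of at most max_chunk_length characters.

    Hypothetical helper for illustration; Neuvoice's internal chunker may differ.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap a single sentence that exceeds the limit on its own.
        while len(sentence) > max_chunk_length:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chunk_length])
            sentence = sentence[max_chunk_length:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chunk_length:
            current = candidate
        else:
            # The current chunk is full: close it and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


for chunk in split_into_chunks("First sentence. Second one! A third?", 20):
    print(repr(chunk))
```

Each chunk would then be synthesized on its own and the results joined with a short silence, which is what `silence_duration` controls in `synthesize()`.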

## Supported Languages

| Code | Language | Code | Language | Code | Language | Code | Language |
|------|----------|------|----------|------|----------|------|----------|
| `en` | English | `ko` | Korean | `ja` | Japanese | `ar` | Arabic |
| `bg` | Bulgarian | `cs` | Czech | `da` | Danish | `de` | German |
| `el` | Greek | `es` | Spanish | `et` | Estonian | `fi` | Finnish |
| `fr` | French | `hi` | Hindi | `hr` | Croatian | `hu` | Hungarian |
| `id` | Indonesian | `it` | Italian | `lt` | Lithuanian | `lv` | Latvian |
| `nl` | Dutch | `pl` | Polish | `pt` | Portuguese | `ro` | Romanian |
| `ru` | Russian | `sk` | Slovak | `sl` | Slovenian | `sv` | Swedish |
| `tr` | Turkish | `uk` | Ukrainian | `vi` | Vietnamese | `na` | (fallback) |
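
When language tags come from user locales (for example `pt-BR`), it can help to normalize them to one of the codes in the table before calling `synthesize()`. The sketch below is an illustration only: `resolve_lang` and `SUPPORTED_LANGS` are hypothetical names, not part of the Neuvoice API, and the set is simply copied from the table above.

```python
# Codes copied from the table above; "na" is the documented fallback.
SUPPORTED_LANGS = frozenset({
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de",
    "el", "es", "et", "fi", "fr", "hi", "hr", "hu",
    "id", "it", "lt", "lv", "nl", "pl", "pt", "ro",
    "ru", "sk", "sl", "sv", "tr", "uk", "vi",
})
FALLBACK_LANG = "na"


def resolve_lang(tag: str) -> str:
    """Map a locale tag such as 'pt-BR' or 'en_US' to a table code, else 'na'."""
    # Keep only the primary subtag: "pt-BR" -> "pt", "en_US" -> "en".
    primary = tag.strip().lower().replace("_", "-").split("-")[0]
    return primary if primary in SUPPORTED_LANGS else FALLBACK_LANG


print(resolve_lang("pt-BR"))
print(resolve_lang("zh"))
```

The returned value can be passed as the `lang` argument; unsupported languages fall through to `"na"`.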

## Available Voices

```python
from neuvoice import TTS

tts = TTS()
print(tts.voice_style_names)
# ['Alina', 'Cem', 'Cole', 'Giray', 'Leon', 'Lina', 'Linda',
#  'Mustafa', 'Sarp', 'Selin', 'Sema', 'Soras']
```

Load any voice by name:

```python
style = tts.get_voice_style("Selin")
```

Or from a custom style file:

```python
style = tts.get_voice_style_from_path("/path/to/my_voice.json")
```

## `synthesize()` Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `text` | `str` | — | Input text. Supports inline tags like `<happy>`. |
| `voice_style` | `VoiceStyle` | — | Voice loaded via `get_voice_style()`. |
| `total_steps` | `int` | `5` | Flow-matching denoising steps. Higher = better quality, slower inference. Range: 1–100. |
| `speed` | `float` | `1.05` | Playback speed multiplier. Range: 0.7–2.0. |
| `lang` | `str` | `"en"` | ISO 639-1 language code, or `"na"` for unknown languages. |
| `max_chunk_length` | `int` | `300` | Max characters per synthesis chunk (120 for Korean). |
| `silence_duration` | `float` | `0.3` | Seconds of silence inserted between chunks. |
| `verbose` | `bool` | `False` | Print per-chunk progress to stdout. |

**Returns:** `(waveform, duration)` — waveform as a `(1, samples)` NumPy array, duration in seconds.
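
The table gives ranges for `total_steps` and `speed` but does not say whether out-of-range values are rejected or clamped. A minimal sketch of checking them up front, assuming you prefer an explicit error, could look like this; `check_synthesis_args` is a hypothetical helper, not a Neuvoice function.

```python
def check_synthesis_args(total_steps: int = 5, speed: float = 1.05) -> dict:
    """Validate values against the documented ranges and return them as kwargs.

    Hypothetical helper; the ranges are taken from the parameter table above.
    """
    if not 1 <= total_steps <= 100:
        raise ValueError(f"total_steps must be in 1..100, got {total_steps}")
    if not 0.7 <= speed <= 2.0:
        raise ValueError(f"speed must be in 0.7..2.0, got {speed}")
    return {"total_steps": total_steps, "speed": speed}


kwargs = check_synthesis_args(total_steps=30, speed=0.9)
print(kwargs)
```

The resulting dictionary can be unpacked into the call, for example `tts.synthesize(text, voice_style=style, lang="en", **kwargs)`.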

## Examples

**Multilingual synthesis:**

```python
from neuvoice import TTS

tts = TTS()
style = tts.get_voice_style("Leon")

pairs = [
    ("Merhaba! Bugün hava çok güzel.", "tr"),
    ("Bonjour! Il fait beau aujourd'hui.", "fr"),
    ("こんにちは!今日はいい天気ですね。", "ja"),
]
for text, lang in pairs:
    wav, dur = tts.synthesize(text, voice_style=style, lang=lang)
    tts.save_audio(wav, f"output_{lang}.wav")
```

**Expression tags:**

```python
style = tts.get_voice_style("Lina")
text = "Good news! <happy> We just shipped the feature. <laugh> Don't tell anyone yet."
wav, dur = tts.synthesize(text, voice_style=style, lang="en")
```
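
If the same tagged text also needs to be shown as a caption or written to a log, the tags can be separated from the spoken words. The sketch below assumes tags are lowercase words in angle brackets, as in the examples in this README; the full tag set is not listed here, and `split_tags` is a hypothetical helper rather than part of Neuvoice.

```python
import re

# Assumed tag syntax: a lowercase word in angle brackets, e.g. <happy> or <laugh>.
TAG_PATTERN = re.compile(r"<([a-z]+)>")


def split_tags(text: str) -> tuple[str, list[str]]:
    """Return the text without tags, plus the tag names in order of appearance."""
    tags = TAG_PATTERN.findall(text)
    plain = TAG_PATTERN.sub("", text)
    # Collapse the double spaces left behind where tags were removed.
    plain = re.sub(r"\s+", " ", plain).strip()
    return plain, tags


plain, tags = split_tags("Good news! <happy> We just shipped the feature. <laugh> Done.")
print(plain)
print(tags)
```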

**Higher quality with more steps:**

```python
wav, dur = tts.synthesize(
    text="A slow, deliberate reading for maximum clarity.",
    voice_style=style,
    total_steps=30,
    speed=0.9,
    lang="en",
)
```

## Configuration

Model cache location and thread counts can be controlled via environment variables:

| Variable | Description |
|----------|-------------|
| `NEUVOICE_CACHE_DIR` | Override the default cache directory (`~/.cache/neuvoice`). |
| `NEUVOICE_MODEL_REPO` | Override the Hugging Face model repository. |
| `NEUVOICE_REVISION` | Model revision/branch to use (default: `main`). |
| `NEUVOICE_INTRA_THREADS` | ONNX intra-op thread count (default: auto). |
| `NEUVOICE_INTER_THREADS` | ONNX inter-op thread count (default: auto). |
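
These variables can also be set from Python before the library is loaded. The sketch below assumes Neuvoice reads them at import time or when `TTS` is constructed, and that thread counts are given as integer strings; neither detail is confirmed in this README. `configure_neuvoice_env` is a hypothetical helper.

```python
import os


def configure_neuvoice_env(cache_dir: str, intra_threads: int, inter_threads: int = 1) -> None:
    """Set Neuvoice-related environment variables for the current process."""
    # Expand "~" so the value is an absolute path where possible.
    os.environ["NEUVOICE_CACHE_DIR"] = os.path.expanduser(cache_dir)
    # Environment values must be strings.
    os.environ["NEUVOICE_INTRA_THREADS"] = str(intra_threads)
    os.environ["NEUVOICE_INTER_THREADS"] = str(inter_threads)


configure_neuvoice_env("~/models/neuvoice", intra_threads=4)
print(os.environ["NEUVOICE_CACHE_DIR"])

# Import the library only after the variables are set (assumed to matter):
# from neuvoice import TTS
```

Setting the variables in the shell before starting Python achieves the same thing and avoids any dependence on import order.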

## License

Neuvoice is released under the MIT License. See [LICENSE](LICENSE) for details.

The bundled ONNX model is released under the OpenRAIL-M License.

Copyright © 2026 Lamapi