# Neuvoice — Fast, On-Device Neural TTS by Lamapi

**Neuvoice** is a lightweight, on-device text-to-speech library that runs entirely on ONNX Runtime, so synthesis needs no cloud calls and no network round-trips. Built by [Lamapi](https://huggingface.co/thelamapi), it ships with a curated voice library and supports 31 languages out of the box.

## Quick Start

```bash
pip install neuvoice
```

```python
from neuvoice import TTS

tts = TTS(auto_download=True)
style = tts.get_voice_style("Alina")

text = "Hello! Welcome to Neuvoice — fast, private, on-device speech synthesis."
wav, duration = tts.synthesize(text, voice_style=style, lang="en")

tts.save_audio(wav, "output.wav")
print(f"Generated {duration:.2f}s of audio")
```

On first run, model assets are downloaded and cached automatically under `~/.cache/neuvoice`.

## Highlights

**31 supported languages.** Covers a wide range of scripts and regions, from European languages to Arabic, Hindi, Japanese, Korean, Vietnamese, and more.

**Runs entirely on-device.** ONNX Runtime powers inference, so no API keys or network access are needed after the initial model download. CPU is sufficient; GPU acceleration is supported when available.

**Rich voice library.** Ships with voices including Alina, Cem, Cole, Giray, Leon, Lina, Linda, Mustafa, Sarp, Selin, Sema, and Soras. Each voice is a compact style embedding that can be loaded by name in a single call.

**Inline expression tags.** Embed `<happy>`, `<laugh>`, `<breath>`, `<sad>`, and other tags directly in your text to shape the delivery without any extra parameters.

**Long-form synthesis.** Inputs are automatically chunked, synthesized, and rejoined with configurable silence, so no manual splitting is required.
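
To give a feel for the chunking step, here is a minimal sketch that packs whole sentences into chunks of at most `max_chunk_length` characters. It is not Neuvoice's actual splitter: `split_into_chunks` is a hypothetical helper, the sentence-boundary rule is an assumption, and the silence-joining step is left out.

```python
import re


def split_into_chunks(text: str, max_chunk_length: int = 300) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chunk_length chars."""
    # Assumed boundary rule: split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chunk_length:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single oversized sentence is hard-split as a last resort.
        while len(sentence) > max_chunk_length:
            chunks.append(sentence[:max_chunk_length])
            sentence = sentence[max_chunk_length:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Packing by sentence keeps pauses at natural boundaries, which is usually where inserted silence sounds least intrusive.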

## Supported Languages

| Code | Language | Code | Language | Code | Language | Code | Language |
|------|----------|------|----------|------|----------|------|----------|
| `en` | English | `ko` | Korean | `ja` | Japanese | `ar` | Arabic |
| `bg` | Bulgarian | `cs` | Czech | `da` | Danish | `de` | German |
| `el` | Greek | `es` | Spanish | `et` | Estonian | `fi` | Finnish |
| `fr` | French | `hi` | Hindi | `hr` | Croatian | `hu` | Hungarian |
| `id` | Indonesian | `it` | Italian | `lt` | Lithuanian | `lv` | Latvian |
| `nl` | Dutch | `pl` | Polish | `pt` | Portuguese | `ro` | Romanian |
| `ru` | Russian | `sk` | Slovak | `sl` | Slovenian | `sv` | Swedish |
| `tr` | Turkish | `uk` | Ukrainian | `vi` | Vietnamese | `na` | (fallback) |
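
When the language of incoming text is only known at runtime, it may help to map unsupported codes to the `na` fallback before calling `synthesize()`. The helper below is a hypothetical sketch, not part of the Neuvoice API; the code set is copied from the table in this section, and the handling of locale-style input such as `pt-BR` is an assumption.

```python
SUPPORTED_LANGS = frozenset({
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de",
    "el", "es", "et", "fi", "fr", "hi", "hr", "hu",
    "id", "it", "lt", "lv", "nl", "pl", "pt", "ro",
    "ru", "sk", "sl", "sv", "tr", "uk", "vi",
})
FALLBACK_LANG = "na"


def resolve_lang(code: str) -> str:
    """Return a supported ISO 639-1 code, or the fallback for anything else."""
    # Keep only the primary subtag so "pt-BR" or "tr_TR" map to "pt" and "tr".
    primary = code.strip().lower().replace("_", "-").split("-")[0]
    return primary if primary in SUPPORTED_LANGS else FALLBACK_LANG
```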

## Available Voices

```python
from neuvoice import TTS

tts = TTS()
print(tts.voice_style_names)
# ['Alina', 'Cem', 'Cole', 'Giray', 'Leon', 'Lina', 'Linda',
#  'Mustafa', 'Sarp', 'Selin', 'Sema', 'Soras']
```

Load any voice by name:

```python
style = tts.get_voice_style("Selin")
```

Or from a custom style file:

```python
style = tts.get_voice_style_from_path("/path/to/my_voice.json")
```

## `synthesize()` Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `text` | `str` | required | Input text. Supports inline tags like `<happy>`. |
| `voice_style` | `VoiceStyle` | required | Voice loaded via `get_voice_style()`. |
| `total_steps` | `int` | `5` | Flow-matching denoising steps. Higher = better quality, slower inference. Range: 1–100. |
| `speed` | `float` | `1.05` | Playback speed multiplier. Range: 0.7–2.0. |
| `lang` | `str` | `"en"` | ISO 639-1 language code, or `"na"` for unknown languages. |
| `max_chunk_length` | `int` | `300` | Max characters per synthesis chunk (120 for Korean). |
| `silence_duration` | `float` | `0.3` | Seconds of silence inserted between chunks. |
| `verbose` | `bool` | `False` | Print per-chunk progress to stdout. |

**Returns:** `(waveform, duration)`, where `waveform` is a NumPy array of shape `(1, samples)` and `duration` is the audio length in seconds.
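
Because the waveform comes back as a `(1, samples)` array, its length and the reported duration are tied together by the sample rate. The sketch below rests on two assumptions that this README does not state: float samples in the range -1 to 1, and a 24 kHz sample rate. It writes 16-bit PCM with the standard-library `wave` module, as a possible alternative to `save_audio()` when you want control over the file format; both function names are hypothetical.

```python
import wave

import numpy as np

ASSUMED_SAMPLE_RATE = 24_000  # assumption for this sketch; use the model's real rate


def duration_seconds(waveform: np.ndarray, sample_rate: int = ASSUMED_SAMPLE_RATE) -> float:
    """Length in seconds of a (1, samples) waveform."""
    return waveform.shape[-1] / sample_rate


def write_pcm16(waveform: np.ndarray, path: str, sample_rate: int = ASSUMED_SAMPLE_RATE) -> None:
    """Write a mono float waveform in [-1, 1] as a 16-bit PCM WAV file."""
    samples = np.clip(waveform.reshape(-1), -1.0, 1.0)
    pcm = (samples * 32767).astype("<i2")  # little-endian int16
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
```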

## Examples

**Multilingual synthesis:**

```python
from neuvoice import TTS

tts = TTS()
style = tts.get_voice_style("Leon")

pairs = [
    ("Merhaba! Bugün hava çok güzel.", "tr"),
    ("Bonjour! Il fait beau aujourd'hui.", "fr"),
    ("こんにちは!今日はいい天気ですね。", "ja"),
]
for text, lang in pairs:
    wav, dur = tts.synthesize(text, voice_style=style, lang=lang)
    tts.save_audio(wav, f"output_{lang}.wav")
```

**Expression tags:**

```python
# Reuses the `tts` instance created in the previous example.
style = tts.get_voice_style("Lina")
text = "Good news! <happy> We just shipped the feature. <laugh> Don't tell anyone yet."
wav, dur = tts.synthesize(text, voice_style=style, lang="en")
```
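
To check which tags a piece of text contains before synthesis, a small regular expression is enough. These are hypothetical helpers, not part of Neuvoice. They assume tags are lowercase words in angle brackets, as in the examples here, and they do not know which tag names the model actually accepts.

```python
import re

TAG_PATTERN = re.compile(r"<([a-z_]+)>")  # assumed tag syntax


def find_tags(text: str) -> list[str]:
    """Return tag names in order of appearance."""
    return TAG_PATTERN.findall(text)


def strip_tags(text: str) -> str:
    """Remove tags and collapse the whitespace they leave behind."""
    return re.sub(r"\s+", " ", TAG_PATTERN.sub("", text)).strip()
```

`strip_tags` can be handy for producing a plain transcript or subtitle line from the same tagged input.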

**Higher quality with more steps:**

```python
# Reuses `tts` and `style` from the examples above.
wav, dur = tts.synthesize(
    text="A slow, deliberate reading for maximum clarity.",
    voice_style=style,
    total_steps=30,
    speed=0.9,
    lang="en",
)
```
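
Since `total_steps` and `speed` have documented ranges, user-supplied values can be checked before they reach `synthesize()`. The helper below is a hypothetical sketch built on the ranges from the parameter table; how Neuvoice itself treats out-of-range values is not documented here.

```python
def check_synthesis_params(total_steps: int = 5, speed: float = 1.05) -> dict:
    """Validate values against the documented ranges and return them as kwargs."""
    if not 1 <= total_steps <= 100:
        raise ValueError(f"total_steps must be in 1..100, got {total_steps}")
    if not 0.7 <= speed <= 2.0:
        raise ValueError(f"speed must be in 0.7..2.0, got {speed}")
    return {"total_steps": total_steps, "speed": speed}
```

The result can be unpacked straight into the call, for example `tts.synthesize(text, voice_style=style, **check_synthesis_params(30, 0.9))`.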

## Configuration

The cache location, model source, and ONNX Runtime thread counts can be controlled via environment variables:

| Variable | Description |
|----------|-------------|
| `NEUVOICE_CACHE_DIR` | Override the default cache directory (`~/.cache/neuvoice`). |
| `NEUVOICE_MODEL_REPO` | Override the Hugging Face model repository. |
| `NEUVOICE_REVISION` | Model revision/branch to use (default: `main`). |
| `NEUVOICE_INTRA_THREADS` | ONNX intra-op thread count (default: auto). |
| `NEUVOICE_INTER_THREADS` | ONNX inter-op thread count (default: auto). |
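
One way to apply these settings from Python is to build them as a dictionary and merge it into `os.environ`. `neuvoice_env` is a hypothetical helper written for this example; only the variable names come from the table. The sketch also assumes the variables are read when `TTS` is constructed, which this README does not confirm, so it is safest to set them before importing `neuvoice`.

```python
import os
from typing import Optional


def neuvoice_env(
    cache_dir: Optional[str] = None,
    revision: Optional[str] = None,
    intra_threads: Optional[int] = None,
    inter_threads: Optional[int] = None,
) -> dict[str, str]:
    """Build the environment overrides for the settings that were given."""
    env: dict[str, str] = {}
    if cache_dir is not None:
        env["NEUVOICE_CACHE_DIR"] = os.path.expanduser(cache_dir)
    if revision is not None:
        env["NEUVOICE_REVISION"] = revision
    if intra_threads is not None:
        env["NEUVOICE_INTRA_THREADS"] = str(intra_threads)
    if inter_threads is not None:
        env["NEUVOICE_INTER_THREADS"] = str(inter_threads)
    return env
```

A typical call would be `os.environ.update(neuvoice_env(cache_dir="~/models/neuvoice", intra_threads=4))`, run before the first `from neuvoice import TTS`.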

## License

Neuvoice is released under the MIT License. See [LICENSE](LICENSE) for details.

The bundled ONNX model is released under the OpenRAIL-M License.

Copyright © 2026 Lamapi