| --- |
| license: openrail |
| language: |
| - tr |
| - bg |
| - el |
| - fr |
| - ko |
| - ja |
| - ar |
| - de |
| - da |
| - cs |
| - es |
| - et |
| - fi |
| - hr |
| - hu |
| - id |
| - it |
| - lt |
| - lv |
| - nl |
| - pl |
| - pt |
| - ro |
| - ru |
| - sk |
| - sl |
| - sv |
| - uk |
| - vi |
| - en |
| - hi |
| tags: |
| - text |
| - speech |
| - tts |
| - neuvoice |
| - turkish |
| - turkey |
| --- |
| # Neuvoice — Fast, On-Device Neural TTS by Lamapi |
| <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/QRimUEnTq6ESADGnHCH87.wav"></audio> |
| **Neuvoice** is a lightweight, on-device text-to-speech library that runs entirely on ONNX Runtime: no cloud calls, no latency surprises. It is based on supertonic-3, modified by [Lamapi](https://huggingface.co/thelamapi), and it ships with a curated voice library and supports 31 languages out of the box. |
|
|
|
|
|  |
| --- |
|
|
| ## Examples (Using 'Linda' Voice) |
|
|
| ### Turkish |
| `<happy> Bugün kendimi mutlu hissediyorum!` |
| <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/NYK8ukN9ywI7MAa86Chvp.wav"></audio> |
|
|
| ### English |
| `<happy> I feel happy today!` |
| <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/udPqsDRowN_nV42l37sM7.wav"></audio> |
|
|
| ### French |
| `<happy> Je me sens heureux aujourd'hui!` |
| <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/G71ItTvWBa5Ygu6lA9z72.wav"></audio> |
|
|
| ### Japanese |
| `<happy> こんにちは!今日はいい天気ですね。` |
| <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/dIj08g1OXaq-p1dkRrLvI.wav"></audio> |
|
|
| --- |
|
|
| ## Quick Start |
|
|
| ```bash |
| pip install neuvoice |
| ``` |
|
|
| ```python |
| from neuvoice import TTS |
| |
| tts = TTS(auto_download=True) |
| style = tts.get_voice_style("Alina") |
| |
| text = "Hello! Welcome to Neuvoice — fast, private, on-device speech synthesis." |
| wav, duration = tts.synthesize(text, voice_style=style, lang="en") |
| |
| tts.save_audio(wav, "output.wav") |
| print(f"Generated {duration:.2f}s of audio") |
| ``` |
|
|
| On first run, model assets are downloaded and cached automatically under `~/.cache/neuvoice`. |
|
|
| ## Highlights |
|
|
| **31 supported languages.** Covers a wide range of scripts and regions, from European languages to Arabic, Hindi, Japanese, Korean, Vietnamese, and more. |
|
|
| **Runs entirely on-device.** ONNX Runtime powers inference — no API keys, no network dependency after the initial model download. CPU is sufficient; GPU acceleration is supported when available. |
|
|
| **Rich voice library.** Ships with voices including Alina, Cem, Cole, Giray, Leon, Lina, Linda, Mustafa, Sarp, Selin, Sema, and Soras. Each voice is a compact style embedding — load any of them by name in a single call. |
|
|
| **Inline expression tags.** Embed `<happy>`, `<laugh>`, `<breath>`, `<sad>`, and other tags directly in your text to shape the delivery without any extra parameters. |
|
|
| **Long-form synthesis.** Inputs are automatically chunked, synthesized, and rejoined with configurable silence — no manual splitting required. |
|
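The chunk-and-rejoin behavior described under long-form synthesis can be pictured with a small, library-independent sketch. The helper name `chunk_text`, the sentence-splitting rule, and the hard-split fallback are assumptions made for illustration; Neuvoice's own chunker may split text differently.

```python
import re


def chunk_text(text: str, max_chunk_length: int = 300) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chunk_length chars.

    Hypothetical helper: not part of the Neuvoice API.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split into pieces.
        while len(sentence) > max_chunk_length:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chunk_length])
            sentence = sentence[max_chunk_length:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chunk_length:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

In a pipeline of this shape, each chunk would be synthesized on its own and the resulting waveforms joined with `silence_duration` seconds of silence between them.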
|
| ## Supported Languages |
|
|
| | Code | Language | Code | Language | Code | Language | Code | Language | |
| |------|----------|------|----------|------|----------|------|----------| |
| | `en` | English | `ko` | Korean | `ja` | Japanese | `ar` | Arabic | |
| | `bg` | Bulgarian | `cs` | Czech | `da` | Danish | `de` | German | |
| | `el` | Greek | `es` | Spanish | `et` | Estonian | `fi` | Finnish | |
| | `fr` | French | `hi` | Hindi | `hr` | Croatian | `hu` | Hungarian | |
| | `id` | Indonesian | `it` | Italian | `lt` | Lithuanian | `lv` | Latvian | |
| | `nl` | Dutch | `pl` | Polish | `pt` | Portuguese | `ro` | Romanian | |
| | `ru` | Russian | `sk` | Slovak | `sl` | Slovenian | `sv` | Swedish | |
| | `tr` | Turkish | `uk` | Ukrainian | `vi` | Vietnamese | `na` | (fallback) | |
|
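When the language code comes from user input or a locale string, it may help to normalize it against the table above and fall back to `"na"` for anything unlisted. The sketch below is one way to do that; `resolve_lang` and `SUPPORTED_LANGS` are hypothetical names, not part of Neuvoice.

```python
# Language codes copied from the table above; "na" is the documented fallback.
SUPPORTED_LANGS = frozenset({
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de",
    "el", "es", "et", "fi", "fr", "hi", "hr", "hu",
    "id", "it", "lt", "lv", "nl", "pl", "pt", "ro",
    "ru", "sk", "sl", "sv", "tr", "uk", "vi",
})
FALLBACK_LANG = "na"


def resolve_lang(code: str) -> str:
    """Return a code usable as synthesize(lang=...), or "na" if unlisted.

    Hypothetical helper: accepts locale-style input such as "pt-BR" or "en_US".
    """
    # Keep only the primary subtag, lowercased: "pt-BR" -> "pt".
    base = code.strip().lower().replace("_", "-").split("-")[0]
    return base if base in SUPPORTED_LANGS else FALLBACK_LANG
```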
|
| ## Available Voices |
|
|
| ```python |
| from neuvoice import TTS |
| |
| tts = TTS() |
| print(tts.voice_style_names) |
| # ['Alina', 'Cem', 'Cole', 'Giray', 'Leon', 'Lina', 'Linda', |
| # 'Mustafa', 'Sarp', 'Selin', 'Sema', 'Soras'] |
| ``` |
|
|
| Load any voice by name: |
|
|
| ```python |
| style = tts.get_voice_style("Selin") |
| ``` |
|
|
| Or from a custom style file: |
|
|
| ```python |
| style = tts.get_voice_style_from_path("/path/to/my_voice.json") |
| ``` |
|
|
| ## `synthesize()` Parameters |
|
|
| | Parameter | Type | Default | Description | |
| |-----------|------|---------|-------------| |
| | `text` | `str` | — | Input text. Supports inline tags like `<happy>`. | |
| | `voice_style` | `VoiceStyle` | — | Voice loaded via `get_voice_style()`. | |
| | `total_steps` | `int` | `5` | Flow-matching denoising steps. Higher = better quality, slower inference. Range: 1–100. | |
| | `speed` | `float` | `1.05` | Playback speed multiplier. Range: 0.7–2.0. | |
| | `lang` | `str` | `"en"` | ISO 639-1 language code, or `"na"` for unknown languages. | |
| | `max_chunk_length` | `int` | `300` | Max characters per synthesis chunk (120 for Korean). | |
| | `silence_duration` | `float` | `0.3` | Seconds of silence inserted between chunks. | |
| | `verbose` | `bool` | `False` | Print per-chunk progress to stdout. | |
|
|
| **Returns:** `(waveform, duration)` — waveform as a `(1, samples)` NumPy array, duration in seconds. |
|
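The parameter table gives ranges for `total_steps` and `speed`, but this README does not say how out-of-range values are handled. A defensive check before calling `synthesize()` might look like the sketch below; `check_synthesis_args` is a hypothetical helper, not part of Neuvoice.

```python
def check_synthesis_args(total_steps: int = 5, speed: float = 1.05) -> None:
    """Raise ValueError if an argument falls outside the documented range.

    Hypothetical helper; the ranges are taken from the parameter table.
    """
    if not 1 <= total_steps <= 100:
        raise ValueError(f"total_steps must be in 1-100, got {total_steps}")
    if not 0.7 <= speed <= 2.0:
        raise ValueError(f"speed must be in 0.7-2.0, got {speed}")
```

Calling it with the defaults, or with `total_steps=30, speed=0.9`, returns silently; `total_steps=0` raises `ValueError`.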
|
| ## Examples |
|
|
| **Multilingual synthesis:** |
|
|
| ```python |
| from neuvoice import TTS |
| |
| tts = TTS() |
| style = tts.get_voice_style("Leon") |
| |
| pairs = [ |
| ("Merhaba! Bugün hava çok güzel.", "tr"), |
| ("Bonjour! Il fait beau aujourd'hui.", "fr"), |
| ("こんにちは!今日はいい天気ですね。", "ja"), |
| ] |
| for text, lang in pairs: |
|     # Synthesize each sentence in its own language and save it to a separate file. |
|     wav, dur = tts.synthesize(text, voice_style=style, lang=lang) |
|     tts.save_audio(wav, f"output_{lang}.wav") |
| ``` |
|
|
| **Expression tags:** |
|
|
| ```python |
| style = tts.get_voice_style("Lina") |
| text = "Good news! <happy> We just shipped the feature. <laugh> Don't tell anyone yet." |
| wav, dur = tts.synthesize(text, voice_style=style, lang="en") |
| ``` |
|
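If the same tagged text also feeds a log or a subtitle track, the tags can be listed or stripped with a regular expression. This sketch assumes tags are lowercase words in angle brackets, as in the examples shown here; `list_tags` and `strip_tags` are hypothetical helpers, not part of Neuvoice.

```python
import re

# Matches a lowercase word in angle brackets, e.g. <happy> or <laugh>.
TAG_PATTERN = re.compile(r"<([a-z_]+)>")


def list_tags(text: str) -> list[str]:
    """Return the tag names in order of appearance (hypothetical helper)."""
    return TAG_PATTERN.findall(text)


def strip_tags(text: str) -> str:
    """Return the text with tags removed and runs of whitespace collapsed."""
    return " ".join(TAG_PATTERN.sub(" ", text).split())
```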
|
| **Higher quality with more steps:** |
|
|
| ```python |
| wav, dur = tts.synthesize( |
|     text="A slow, deliberate reading for maximum clarity.", |
|     voice_style=style, |
|     total_steps=30, |
|     speed=0.9, |
|     lang="en", |
| ) |
| ``` |
|
|
| ## Configuration |
|
|
| The model cache location, the model source (repository and revision), and the ONNX Runtime thread counts can be controlled via environment variables: |
|
|
| | Variable | Description | |
| |----------|-------------| |
| | `NEUVOICE_CACHE_DIR` | Override the default cache directory (`~/.cache/neuvoice`). | |
| | `NEUVOICE_MODEL_REPO` | Override the Hugging Face model repository. | |
| | `NEUVOICE_REVISION` | Model revision/branch to use (default: `main`). | |
| | `NEUVOICE_INTRA_THREADS` | ONNX intra-op thread count (default: auto). | |
| | `NEUVOICE_INTER_THREADS` | ONNX inter-op thread count (default: auto). | |
|
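These variables can also be set from Python. The sketch below sets them before the library is imported, on the assumption that they may be read at import time; this README does not say when they are read, and the cache path and thread counts shown are illustrative values only.

```python
import os

# Set overrides before importing neuvoice, in case the values are read at
# import time (an assumption; the README does not specify this).
os.environ["NEUVOICE_CACHE_DIR"] = os.path.expanduser("~/models/neuvoice")
os.environ["NEUVOICE_INTRA_THREADS"] = "4"  # illustrative value
os.environ["NEUVOICE_INTER_THREADS"] = "1"  # illustrative value

# from neuvoice import TTS  # import only after the environment is configured
```

Setting the same variables in the shell before launching Python has the same effect and avoids any import-order concerns.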
|
| ## License |
|
|
| Neuvoice is released under the MIT License. See [LICENSE](LICENSE) for details. |
|
|
| The bundled ONNX model is released under the OpenRAIL-M License. |
|
|
| Copyright © 2026 Lamapi |