	---
	license: openrail
	language:
	- tr
	- bg
	- el
	- fr
	- ko
	- ja
	- ar
	- de
	- da
	- cs
	- es
	- et
	- fi
	- hr
	- hu
	- id
	- it
	- lt
	- lv
	- nl
	- pl
	- pt
	- ro
	- ru
	- sk
	- sl
	- sv
	- uk
	- vi
	- na
	tags:
	- text
	- speech
	- tts
	- neuvoice
	- turkish
	- turkey
	---
	# Neuvoice — Fast, On-Device Neural TTS by Lamapi
	<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/QRimUEnTq6ESADGnHCH87.wav"></audio>
	Neuvoice is a lightweight, on-device text-to-speech library that runs entirely on ONNX Runtime, with no cloud calls and no network latency. It is based on supertonic-3, modified by [Lamapi](https://huggingface.co/thelamapi), and ships with a curated voice library and support for 31 languages out of the box.


	![IMAGE](https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/kyNY0a11CFQ0VWAGRevpP.png)

	---

	## Examples (Using 'Linda' Voice)

	### Turkish
	`<happy> Bugün kendimi mutlu hissediyorum!`
	<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/NYK8ukN9ywI7MAa86Chvp.wav"></audio>

	### English
	`<happy> I feel happy today!`
	<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/udPqsDRowN_nV42l37sM7.wav"></audio>

	### French
	`<happy> Je me sens heureux aujourd'hui!`
	<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/G71ItTvWBa5Ygu6lA9z72.wav"></audio>

	### Japanese
	`<happy> こんにちは！今日はいい天気ですね。`
	<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/dIj08g1OXaq-p1dkRrLvI.wav"></audio>

	---

	## Quick Start

	```bash
	pip install neuvoice
	```

	```python
	from neuvoice import TTS

	tts = TTS(auto_download=True)
	style = tts.get_voice_style("Alina")

	text = "Hello! Welcome to Neuvoice — fast, private, on-device speech synthesis."
	wav, duration = tts.synthesize(text, voice_style=style, lang="en")

	tts.save_audio(wav, "output.wav")
	print(f"Generated {duration:.2f}s of audio")
	```

	On first run, model assets are downloaded and cached automatically under `~/.cache/neuvoice`.

	## Highlights

	31 supported languages. Covers a wide range of scripts and regions, from European languages to Arabic, Hindi, Japanese, Korean, Vietnamese, and more.

	Runs entirely on-device. ONNX Runtime powers inference — no API keys, no network dependency after the initial model download. CPU is sufficient; GPU acceleration is supported when available.

	Rich voice library. Ships with voices including Alina, Cem, Cole, Giray, Leon, Lina, Linda, Mustafa, Sarp, Selin, Sema, and Soras. Each voice is a compact style embedding — load any of them by name in a single call.

	Inline expression tags. Embed `<happy>`, `<laugh>`, `<breath>`, `<sad>`, and other tags directly in your text to shape the delivery without any extra parameters.

	Long-form synthesis. Inputs are automatically chunked, synthesized, and rejoined with configurable silence — no manual splitting required.
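The chunking step described above can be pictured with a small standalone sketch. It is not Neuvoice's actual splitter: `split_into_chunks` is a hypothetical helper, and the sentence-boundary regex and greedy packing are assumptions chosen for illustration. Only the 300-character default comes from the `max_chunk_length` parameter documented below.

```python
import re

# Split after Latin sentence punctuation followed by whitespace,
# or directly after CJK sentence punctuation (which has no trailing space).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+|(?<=[。！？])")


def split_into_chunks(text: str, max_chunk_length: int = 300) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chunk_length characters."""
    sentences = [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > max_chunk_length:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chunk_length])
            sentence = sentence[max_chunk_length:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chunk_length:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


chunks = split_into_chunks("One. Two! Three? Four.", max_chunk_length=10)
print(chunks)
```

Each chunk would then be synthesized on its own and the resulting waveforms concatenated with a short silence between them, which is the role `silence_duration` plays in `synthesize()`.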

	## Supported Languages

	| Code | Language | Code | Language | Code | Language | Code | Language |
	|------|----------|------|----------|------|----------|------|----------|
	| `en` | English | `ko` | Korean | `ja` | Japanese | `ar` | Arabic |
	| `bg` | Bulgarian | `cs` | Czech | `da` | Danish | `de` | German |
	| `el` | Greek | `es` | Spanish | `et` | Estonian | `fi` | Finnish |
	| `fr` | French | `hi` | Hindi | `hr` | Croatian | `hu` | Hungarian |
	| `id` | Indonesian | `it` | Italian | `lt` | Lithuanian | `lv` | Latvian |
	| `nl` | Dutch | `pl` | Polish | `pt` | Portuguese | `ro` | Romanian |
	| `ru` | Russian | `sk` | Slovak | `sl` | Slovenian | `sv` | Swedish |
	| `tr` | Turkish | `uk` | Ukrainian | `vi` | Vietnamese | `na` | (fallback) |
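When language codes come from user input or locale settings, it can help to normalize them against the table above before calling `synthesize()`. The sketch below is caller-side code, not part of the Neuvoice API: `resolve_lang` and `SUPPORTED_LANGS` are hypothetical names, and reducing a tag such as `pt-BR` to its primary subtag is an assumption about what a caller might want.

```python
# The 31 language codes listed in the table above.
SUPPORTED_LANGS = frozenset({
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de",
    "el", "es", "et", "fi", "fr", "hi", "hr", "hu",
    "id", "it", "lt", "lv", "nl", "pl", "pt", "ro",
    "ru", "sk", "sl", "sv", "tr", "uk", "vi",
})
FALLBACK_LANG = "na"


def resolve_lang(code: str) -> str:
    """Return a supported code, or the "na" fallback for anything else."""
    # Keep only the lowercase primary subtag, so "pt-BR", "pt_BR" and "PT" all become "pt".
    primary = code.strip().lower().replace("_", "-").split("-")[0]
    return primary if primary in SUPPORTED_LANGS else FALLBACK_LANG


for raw in ("tr", "pt-BR", "zh"):
    print(raw, "->", resolve_lang(raw))
```

The resolved value can be passed directly as the `lang` argument.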

	## Available Voices

	```python
	from neuvoice import TTS

	tts = TTS()
	print(tts.voice_style_names)
	# ['Alina', 'Cem', 'Cole', 'Giray', 'Leon', 'Lina', 'Linda',
	# 'Mustafa', 'Sarp', 'Selin', 'Sema', 'Soras']
	```

	Load any voice by name:

	```python
	style = tts.get_voice_style("Selin")
	```

	Or from a custom style file:

	```python
	style = tts.get_voice_style_from_path("/path/to/my_voice.json")
	```

	## `synthesize()` Parameters

	| Parameter | Type | Default | Description |
	|-----------|------|---------|-------------|
	| `text` | `str` | (required) | Input text. Supports inline tags like `<happy>`. |
	| `voice_style` | `VoiceStyle` | (required) | Voice loaded via `get_voice_style()`. |
	| `total_steps` | `int` | `5` | Flow-matching denoising steps. Higher values improve quality but slow inference. Range: 1–100. |
	| `speed` | `float` | `1.05` | Playback speed multiplier. Range: 0.7–2.0. |
	| `lang` | `str` | `"en"` | ISO 639-1 language code, or `"na"` for unknown languages. |
	| `max_chunk_length` | `int` | `300` | Maximum characters per synthesis chunk (120 for Korean). |
	| `silence_duration` | `float` | `0.3` | Seconds of silence inserted between chunks. |
	| `verbose` | `bool` | `False` | Print per-chunk progress to stdout. |

	Returns: `(waveform, duration)`, where `waveform` is a NumPy array of shape `(1, samples)` and `duration` is the audio length in seconds.
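The documented ranges for `total_steps` and `speed` can be checked on the caller's side before a long synthesis job starts. This is a minimal sketch under assumptions: `check_synthesis_options` is a hypothetical helper, and this README does not say whether the library itself raises or clamps when a value is out of range.

```python
def check_synthesis_options(total_steps: int = 5, speed: float = 1.05) -> tuple[int, float]:
    """Validate options against the ranges in the parameter table (1-100 steps, 0.7-2.0 speed)."""
    if not 1 <= total_steps <= 100:
        raise ValueError(f"total_steps must be between 1 and 100, got {total_steps}")
    if not 0.7 <= speed <= 2.0:
        raise ValueError(f"speed must be between 0.7 and 2.0, got {speed}")
    return total_steps, speed


steps, speed = check_synthesis_options(total_steps=30, speed=0.9)
print(steps, speed)
```

The validated values can then be forwarded as `total_steps=steps, speed=speed`.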

	## Examples

	Multilingual synthesis:

	```python
	from neuvoice import TTS

	tts = TTS()
	style = tts.get_voice_style("Leon")

	# Each entry pairs a sentence with its language code.
	pairs = [
	    ("Merhaba! Bugün hava çok güzel.", "tr"),
	    ("Bonjour! Il fait beau aujourd'hui.", "fr"),
	    ("こんにちは！今日はいい天気ですね。", "ja"),
	]
	for text, lang in pairs:
	    wav, dur = tts.synthesize(text, voice_style=style, lang=lang)
	    tts.save_audio(wav, f"output_{lang}.wav")
	```

	Expression tags:

	```python
	style = tts.get_voice_style("Lina")
	text = "Good news! <happy> We just shipped the feature. <laugh> Don't tell anyone yet."
	wav, dur = tts.synthesize(text, voice_style=style, lang="en")
	```
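Because the tags live inside the text itself, it is sometimes useful to list them or to recover a tag-free transcript, for example for captions. The helpers below are a sketch that does not call Neuvoice: `list_tags` and `strip_tags` are hypothetical names, and the pattern assumes tags are lowercase words in angle brackets, as in the examples in this README.

```python
import re

# Assumed tag shape: <happy>, <laugh>, <breath>, <sad>, ...
TAG_PATTERN = re.compile(r"<([a-z_]+)>")


def list_tags(text: str) -> list[str]:
    """Return the tag names in the order they appear."""
    return TAG_PATTERN.findall(text)


def strip_tags(text: str) -> str:
    """Remove tags and collapse the whitespace they leave behind."""
    without_tags = TAG_PATTERN.sub("", text)
    return re.sub(r"\s{2,}", " ", without_tags).strip()


text = "Good news! <happy> We just shipped the feature. <laugh> Don't tell anyone yet."
print(list_tags(text))
print(strip_tags(text))
```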

	Higher quality with more steps:

	```python
	wav, dur = tts.synthesize(
	    text="A slow, deliberate reading for maximum clarity.",
	    voice_style=style,
	    total_steps=30,
	    speed=0.9,
	    lang="en",
	)
	```

	## Configuration

	The cache location, model source, and ONNX thread counts can be controlled via environment variables:

	| Variable | Description |
	|----------|-------------|
	| `NEUVOICE_CACHE_DIR` | Override the default cache directory (`~/.cache/neuvoice`). |
	| `NEUVOICE_MODEL_REPO` | Override the Hugging Face model repository. |
	| `NEUVOICE_REVISION` | Model revision/branch to use (default: `main`). |
	| `NEUVOICE_INTRA_THREADS` | ONNX intra-op thread count (default: auto). |
	| `NEUVOICE_INTER_THREADS` | ONNX inter-op thread count (default: auto). |
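To see which settings a given environment would produce, the variables above can be read with the standard library. This sketch only mirrors the documented names and defaults: `read_neuvoice_env` is a hypothetical helper, not how the library resolves its configuration internally, and `None` stands in for "auto" or "library default" because this README does not state those values.

```python
import os
from pathlib import Path


def read_neuvoice_env(environ=None) -> dict:
    """Collect the documented NEUVOICE_* settings, applying the defaults from the table."""
    env = os.environ if environ is None else environ

    def thread_count(name: str):
        raw = env.get(name, "").strip()
        return int(raw) if raw else None  # None means "auto"

    default_cache = Path.home() / ".cache" / "neuvoice"
    return {
        "cache_dir": Path(env.get("NEUVOICE_CACHE_DIR") or default_cache).expanduser(),
        "model_repo": env.get("NEUVOICE_MODEL_REPO") or None,  # default repo not stated here
        "revision": env.get("NEUVOICE_REVISION") or "main",
        "intra_threads": thread_count("NEUVOICE_INTRA_THREADS"),
        "inter_threads": thread_count("NEUVOICE_INTER_THREADS"),
    }


print(read_neuvoice_env({"NEUVOICE_CACHE_DIR": "/tmp/nv", "NEUVOICE_INTRA_THREADS": "4"}))
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise without touching the real process environment.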

	## License

	Neuvoice is released under the MIT License. See [LICENSE](LICENSE) for details.

	The bundled ONNX model is released under the OpenRAIL-M License.

	Copyright © 2026 Lamapi