Commit e05980b (verified), committed by Lamapi
1 Parent(s): 555672f

Update README.md

  1. README.md +143 -3
README.md CHANGED
@@ -1,3 +1,143 @@
- ---
- license: apache-2.0
- ---
+ # Neuvoice: Fast, On-Device Neural TTS by Lamapi
+
+ **Neuvoice** is a lightweight, on-device text-to-speech library that runs inference entirely through ONNX Runtime, with no cloud calls and no network round-trips at synthesis time. Built by [Lamapi](https://huggingface.co/thelamapi), it ships with a curated voice library and supports 31 languages out of the box.
+
+ ## Quick Start
+
+ ```bash
+ pip install neuvoice
+ ```
+
+ ```python
+ from neuvoice import TTS
+
+ # Download model assets on first use, then load a bundled voice by name.
+ tts = TTS(auto_download=True)
+ style = tts.get_voice_style("Alina")
+
+ text = "Hello! Welcome to Neuvoice: fast, private, on-device speech synthesis."
+ wav, duration = tts.synthesize(text, voice_style=style, lang="en")
+
+ tts.save_audio(wav, "output.wav")
+ print(f"Generated {duration:.2f}s of audio")
+ ```
+
+ On first run, model assets are downloaded and cached automatically under `~/.cache/neuvoice`.
+
+ ## Highlights
+
+ **31 supported languages.** Covers a wide range of scripts and regions, from European languages to Arabic, Hindi, Japanese, Korean, Vietnamese, and more.
+
+ **Runs entirely on-device.** ONNX Runtime powers inference, so there are no API keys and no network dependency after the initial model download. CPU is sufficient; GPU acceleration is supported when available.
+
+ **Rich voice library.** Ships with voices including Alina, Cem, Cole, Giray, Leon, Lina, Linda, Mustafa, Sarp, Selin, Sema, and Soras. Each voice is a compact style embedding that you load by name in a single call.
+
+ **Inline expression tags.** Embed `<happy>`, `<laugh>`, `<breath>`, `<sad>`, and other tags directly in your text to shape the delivery without any extra parameters.
+
+ **Long-form synthesis.** Long inputs are automatically chunked, synthesized, and rejoined with a configurable silence between chunks, so no manual splitting is required.
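
To make the long-form behaviour concrete, here is a minimal sketch of sentence-aware chunking and of how the inter-chunk silence adds to the total duration. It is an illustration only, not Neuvoice's actual implementation: `split_into_chunks` and `joined_duration` are hypothetical helpers, and the splitting rule (break after `.`, `!`, or `?`, then hard-split oversized sentences) is an assumption.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chunk_length: int = 300) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chunk_length characters.

    Hypothetical helper: the library's real splitting rules are not documented here.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chunk_length:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chunk_length:
            chunks.append(sentence[:max_chunk_length])
            sentence = sentence[max_chunk_length:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks


def joined_duration(chunk_durations: List[float], silence_duration: float = 0.3) -> float:
    """Total length when silence is inserted between (not around) chunks."""
    gaps = max(len(chunk_durations) - 1, 0)
    return sum(chunk_durations) + gaps * silence_duration
```

With three chunks and the default `silence_duration` of 0.3 s, two gaps are inserted, so the joined audio is 0.6 s longer than the sum of the chunk durations.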
37
+
38
+ ## Supported Languages
39
+
40
+ | Code | Language | Code | Language | Code | Language | Code | Language |
41
+ |------|----------|------|----------|------|----------|------|----------|
42
+ | `en` | English | `ko` | Korean | `ja` | Japanese | `ar` | Arabic |
43
+ | `bg` | Bulgarian | `cs` | Czech | `da` | Danish | `de` | German |
44
+ | `el` | Greek | `es` | Spanish | `et` | Estonian | `fi` | Finnish |
45
+ | `fr` | French | `hi` | Hindi | `hr` | Croatian | `hu` | Hungarian |
46
+ | `id` | Indonesian | `it` | Italian | `lt` | Lithuanian | `lv` | Latvian |
47
+ | `nl` | Dutch | `pl` | Polish | `pt` | Portuguese | `ro` | Romanian |
48
+ | `ru` | Russian | `sk` | Slovak | `sl` | Slovenian | `sv` | Swedish |
49
+ | `tr` | Turkish | `uk` | Ukrainian | `vi` | Vietnamese | `na` | (fallback) |
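
If language codes come from user input, it can help to check them against the table before calling the library. The sketch below hard-codes the 31 codes listed above and maps anything else to the `na` fallback. `resolve_lang` is a hypothetical helper, and whether `synthesize()` itself normalizes case or rejects unknown codes is not stated in this README, so treat that part as an assumption.

```python
# The 31 language codes from the table above ("na" is the fallback, not a language).
SUPPORTED_LANGS = frozenset({
    "en", "ko", "ja", "ar", "bg", "cs", "da", "de",
    "el", "es", "et", "fi", "fr", "hi", "hr", "hu",
    "id", "it", "lt", "lv", "nl", "pl", "pt", "ro",
    "ru", "sk", "sl", "sv", "tr", "uk", "vi",
})
FALLBACK_LANG = "na"


def resolve_lang(code: str) -> str:
    """Return a supported code, or the "na" fallback for anything not in the table."""
    normalized = code.strip().lower()
    return normalized if normalized in SUPPORTED_LANGS else FALLBACK_LANG
```

For example, `resolve_lang("TR")` yields `"tr"`, while an unlisted code such as `"zh"` yields `"na"`.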
+
+ ## Available Voices
+
+ ```python
+ from neuvoice import TTS
+
+ tts = TTS()
+ print(tts.voice_style_names)
+ # ['Alina', 'Cem', 'Cole', 'Giray', 'Leon', 'Lina', 'Linda',
+ # 'Mustafa', 'Sarp', 'Selin', 'Sema', 'Soras']
+ ```
+
+ Load any voice by name:
+
+ ```python
+ style = tts.get_voice_style("Selin")
+ ```
+
+ Or from a custom style file:
+
+ ```python
+ style = tts.get_voice_style_from_path("/path/to/my_voice.json")
+ ```
+
+ ## `synthesize()` Parameters
+
+ | Parameter | Type | Default | Description |
+ |-----------|------|---------|-------------|
+ | `text` | `str` | (required) | Input text. Supports inline tags like `<happy>`. |
+ | `voice_style` | `VoiceStyle` | (required) | Voice loaded via `get_voice_style()`. |
+ | `total_steps` | `int` | `5` | Flow-matching denoising steps. More steps give better quality at the cost of slower inference. Range: 1–100. |
+ | `speed` | `float` | `1.05` | Playback speed multiplier. Range: 0.7–2.0. |
+ | `lang` | `str` | `"en"` | ISO 639-1 language code, or `"na"` for unknown languages. |
+ | `max_chunk_length` | `int` | `300` | Max characters per synthesis chunk (120 for Korean). |
+ | `silence_duration` | `float` | `0.3` | Seconds of silence inserted between chunks. |
+ | `verbose` | `bool` | `False` | Print per-chunk progress to stdout. |
+
+ **Returns:** `(waveform, duration)`, where `waveform` is a NumPy array of shape `(1, samples)` and `duration` is in seconds.
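
The documented ranges for `total_steps` and `speed` can be checked up front when values come from a config file or user input. This is a small sketch, assuming you prefer an early `ValueError`; `check_synthesize_args` is a hypothetical helper, and how the library itself reacts to out-of-range values is not described in this README.

```python
def check_synthesize_args(total_steps: int = 5, speed: float = 1.05) -> None:
    """Raise ValueError if a value falls outside the ranges in the parameter table."""
    if not 1 <= total_steps <= 100:
        raise ValueError(f"total_steps must be in 1..100, got {total_steps}")
    if not 0.7 <= speed <= 2.0:
        raise ValueError(f"speed must be in 0.7..2.0, got {speed}")
```

Calling it with the defaults, or with `total_steps=30, speed=0.9`, passes silently; `total_steps=0` or `speed=3.0` raises.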
+
+ ## Examples
+
+ **Multilingual synthesis:**
+
+ ```python
+ from neuvoice import TTS
+
+ tts = TTS()
+ style = tts.get_voice_style("Leon")
+
+ # Each entry pairs a sentence with its language code.
+ pairs = [
+     ("Merhaba! Bugün hava çok güzel.", "tr"),
+     ("Bonjour! Il fait beau aujourd'hui.", "fr"),
+     ("こんにちは!今日はいい天気ですね。", "ja"),
+ ]
+ for text, lang in pairs:
+     wav, dur = tts.synthesize(text, voice_style=style, lang=lang)
+     tts.save_audio(wav, f"output_{lang}.wav")
+ ```
+
+ **Expression tags:**
+
+ ```python
+ # Reuses the `tts` instance created above.
+ style = tts.get_voice_style("Lina")
+ text = "Good news! <happy> We just shipped the feature. <laugh> Don't tell anyone yet."
+ wav, dur = tts.synthesize(text, voice_style=style, lang="en")
+ ```
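
When the same tagged text is also shown to users (captions, logs), you may want the list of tags it contains or a tag-free copy. The sketch below uses a regular expression for that. It assumes tags are lowercase names in angle brackets, as in the examples above; the full tag set and exact syntax are not specified here, and `find_tags` and `strip_tags` are hypothetical helpers, not part of the library.

```python
import re
from typing import List

# Assumed tag syntax: lowercase letters or underscores inside angle brackets.
TAG_PATTERN = re.compile(r"<([a-z_]+)>")


def find_tags(text: str) -> List[str]:
    """Return tag names in order of appearance."""
    return TAG_PATTERN.findall(text)


def strip_tags(text: str) -> str:
    """Remove tags and collapse the doubled spaces they leave behind."""
    without_tags = TAG_PATTERN.sub("", text)
    return re.sub(r"\s{2,}", " ", without_tags).strip()
```

Applied to the example sentence above, `find_tags` returns the `happy` and `laugh` tags in that order, and `strip_tags` returns the plain sentence.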
+
+ **Higher quality with more steps:**
+
+ ```python
+ wav, dur = tts.synthesize(
+     text="A slow, deliberate reading for maximum clarity.",
+     voice_style=style,
+     total_steps=30,
+     speed=0.9,
+     lang="en",
+ )
+ ```
+
+ ## Configuration
+
+ The cache location, model source, and thread counts can be controlled via environment variables:
+
+ | Variable | Description |
+ |----------|-------------|
+ | `NEUVOICE_CACHE_DIR` | Override the default cache directory (`~/.cache/neuvoice`). |
+ | `NEUVOICE_MODEL_REPO` | Override the Hugging Face model repository. |
+ | `NEUVOICE_REVISION` | Model revision/branch to use (default: `main`). |
+ | `NEUVOICE_INTRA_THREADS` | ONNX intra-op thread count (default: auto). |
+ | `NEUVOICE_INTER_THREADS` | ONNX inter-op thread count (default: auto). |
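
These variables can also be set from Python instead of the shell. The sketch below writes them into an environment mapping and shows one plausible way the cache directory could be resolved. It assumes the library reads the variables when `TTS()` is constructed, so they should be set before that call; `configure_neuvoice` and `resolve_cache_dir` are hypothetical helpers, not part of the Neuvoice API.

```python
import os
from pathlib import Path
from typing import MutableMapping, Optional

DEFAULT_CACHE_DIR = "~/.cache/neuvoice"  # default documented in the table above


def configure_neuvoice(
    cache_dir: Optional[str] = None,
    intra_threads: Optional[int] = None,
    inter_threads: Optional[int] = None,
    environ: Optional[MutableMapping[str, str]] = None,
) -> None:
    """Set only the variables that were given; defaults to os.environ."""
    env = os.environ if environ is None else environ
    if cache_dir is not None:
        env["NEUVOICE_CACHE_DIR"] = cache_dir
    if intra_threads is not None:
        env["NEUVOICE_INTRA_THREADS"] = str(intra_threads)
    if inter_threads is not None:
        env["NEUVOICE_INTER_THREADS"] = str(inter_threads)


def resolve_cache_dir(environ: Optional[MutableMapping[str, str]] = None) -> Path:
    """Assumed lookup order: NEUVOICE_CACHE_DIR if set, else the default path."""
    env = os.environ if environ is None else environ
    return Path(env.get("NEUVOICE_CACHE_DIR") or DEFAULT_CACHE_DIR).expanduser()
```

Passing a plain dict as `environ` keeps the helpers easy to test; in an application you would call `configure_neuvoice(...)` with no `environ` argument before creating the `TTS` instance.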
+
+ ## License
+
+ Neuvoice is released under the MIT License. See [LICENSE](LICENSE) for details.
+
+ The bundled ONNX model is released under the OpenRAIL-M License.
+
+ Copyright © 2026 Lamapi