Update README.md - Ultimate TTS with 900+ voices

#17
by masbudjj - opened
Files changed (1)
  1. README.md +261 -273
README.md CHANGED
@@ -1,346 +1,334 @@
1
  ---
2
- title: Advanced TTS - Real Voices + Voice Cloning
3
  emoji: 🎙️
4
- colorFrom: indigo
5
  colorTo: purple
6
  sdk: static
7
- pinned: false
8
  license: apache-2.0
9
  ---
10
 
11
- # 🎙️ Advanced Text-to-Speech System
12
 
13
- **7 Authentic Voices + Voice Cloning + Unlimited Text - 100% Browser-Based**
14
 
15
- ## ✨ Key Features
16
 
17
- ### 🎭 Dual Voice Modes
18
 
19
- #### 📚 Preset Voices (7 Authentic Speakers)
20
- Real speaker embeddings from the CMU ARCTIC dataset:
21
 
22
- **🇺🇸 American Voices:**
23
- - **Sarah (slt)** - Female, Clear & Professional
24
- - **Clara (clb)** - Female, Warm & Friendly
25
- - **Ben (bdl)** - Male, Deep & Authoritative
26
- - **Robert (rms)** - Male, Calm & Relaxed
27
 
28
- **🌍 International Voices:**
29
- - **Andrew (awb)** - Scottish Male, Distinguished
30
- - **James (jmk)** - Canadian Male, Friendly
31
- - **Kiran (ksp)** - Indian Male, Professional
32
 
33
- #### 🎤 Voice Cloning Mode
34
- Upload your own voice sample (up to 1 minute) and the system will:
35
- - Extract voice characteristics
36
- - Auto-compress large files
37
- - Resample to optimal quality (16kHz)
38
- - Convert stereo to mono
39
- - Generate 512-dim voice embedding
40
 
41
- **Supported formats:** WAV, MP3
42
- **Max duration:** 60 seconds (auto-trim)
43
- **Processing:** Automatic compression & resampling
44
 
45
- ---
46
-
47
- ## 📝 Unlimited Text Processing
48
-
49
- ### Smart Chunking System
50
- - **Automatic splitting** - Intelligently splits by sentences
51
- - **200 chars per chunk** - Optimal for quality & speed
52
- - **Seamless concatenation** - Merges all chunks into single audio
53
- - **Real-time progress** - Track each chunk being processed
54
-
55
- **No character limits!** Type as much text as you want.
56
-
57
- ---
58
 
59
- ## 🎨 Advanced Features
60
 
61
- ### ⚙️ Audio Controls
62
- - **Speed Control** - 0.5x to 2.0x playback speed
63
- - **Real-time adjustment** - Change speed during playback
64
 
65
- ### 📊 Live Monitoring
66
- - **Character counter** - Total text length
67
- - **Word counter** - Word count
68
- - **Chunk calculator** - Estimated processing chunks
69
- - **Progress bar** - Visual generation progress
70
- - **Activity log** - Detailed processing steps
71
 
72
- ### 💾 Download & Playback
73
- - **Browser audio player** - Built-in controls
74
- - **WAV format** - High-quality 16-bit PCM
75
- - **Download option** - Save generated audio
76
 
77
- ---
78
 
79
- ## 🏗️ Technical Architecture
80
-
81
- ### Model & Runtime
82
- - **Base Model:** Microsoft SpeechT5 (Xenova/speecht5_tts)
83
- - **Runtime:** ONNX Runtime (WebAssembly)
84
- - **Framework:** Transformers.js 3.1.2
85
- - **Execution:** 100% client-side (no server)
86
-
87
- ### Voice System
88
- - **Speaker Embeddings:** 512-dimensional x-vectors
89
- - **Dataset:** CMU ARCTIC (7 speakers)
90
- - **Cloning:** Web Audio API + spectral analysis
91
- - **Format:** Float32Array, normalized
92
-
93
- ### Audio Processing
94
- ```javascript
95
- Input Audio
96
-
97
- Duration Check (trim if > 60s)
98
-
99
- Resample to 16kHz
100
-
101
- Convert to Mono
102
-
103
- Extract Features (mean, variance, spectral)
104
-
105
- Generate 512-dim Embedding
106
-
107
- Normalize (L2 norm)
108
-
109
- Ready for TTS
110
- ```
111
-
112
- ### Text Processing Pipeline
113
- ```javascript
114
- User Input Text
115
-
116
- Split by Sentences
117
-
118
- Group into 200-char Chunks
119
-
120
- Process Each Chunk:
121
- - Generate with TTS
122
- - Use selected voice embedding
123
- - Update progress
124
-
125
- Concatenate All Audio
126
-
127
- Encode to WAV
128
-
129
- Present to User
130
- ```
131
 
132
- ---
133
 
134
- ## 🚀 How It Works
135
-
136
- ### Preset Voice Generation
137
- 1. Select voice from dropdown (e.g., "Sarah - Female")
138
- 2. Enter text (unlimited length)
139
- 3. Click "Generate Speech"
140
- 4. System splits text into chunks
141
- 5. Processes each chunk with selected voice
142
- 6. Concatenates all audio
143
- 7. Presents final WAV file
144
-
145
- ### Voice Cloning Workflow
146
- 1. Switch to "Voice Clone" mode
147
- 2. Upload voice sample (WAV/MP3, max 60s)
148
- 3. Click "Process Voice Sample"
149
- 4. System extracts voice characteristics
150
- 5. Enter text to generate
151
- 6. Click "Generate Speech"
152
- 7. Your voice clone reads the text!
153
 
154
- ---
155
 
156
- ## 💻 Browser Requirements
157
 
158
- **Minimum Requirements:**
159
- - Modern browser (Chrome 90+, Firefox 88+, Safari 14+)
160
- - JavaScript enabled
161
- - ~100MB RAM for model
162
- - ~50MB storage for model cache
163
 
164
- **Optimal Experience:**
165
- - Chrome/Edge with WebGPU support
166
- - 4GB+ RAM
167
- - Fast internet (first load only)
168
 
169
- ---
170
 
171
- ## 📊 Performance
172
 
173
- | Metric | Value |
174
- |--------|-------|
175
- | **Model Size** | ~50MB (cached after first load) |
176
- | **Voice Load Time** | ~5-10s (first time only) |
177
- | **Generation Speed** | ~2-5s per 200 chars |
178
- | **Sample Rate** | 16kHz |
179
- | **Audio Format** | WAV (16-bit PCM) |
180
- | **Max Text Length** | Unlimited (chunked) |
181
 
182
- ---
183
 
184
  ## 🎯 Use Cases
185
 
186
- ### Professional
187
- - **Corporate videos** - Ben (authoritative), Robert (calm)
188
- - **Training materials** - Sarah (clear), Kiran (professional)
189
- - **Presentations** - Clara (warm), James (friendly)
190
-
191
- ### Creative
192
- - **Audiobooks** - Andrew (distinguished), Robert (relaxed)
193
- - **Podcasts** - Use voice cloning for consistency
194
- - **Voice-overs** - Multiple character voices
195
 
196
  ### Accessibility
197
- - **Screen readers** - Clear, natural voices
198
- - **Language learning** - Different accents
199
- - **Content accessibility** - Convert text to audio
200
 
201
- ---
202
 
203
  ## 🔧 Technical Details
204
 
205
- ### Voice Embedding Extraction (Cloning)
206
- ```javascript
207
- // Simplified process
208
- 1. Load audio file
209
- 2. Decode to AudioBuffer
210
- 3. Resample to 16kHz if needed
211
- 4. Convert stereo → mono
212
- 5. Split into 512 chunks
213
- 6. Calculate mean & variance per chunk
214
- 7. Combine to create embedding
215
- 8. Normalize (L2 norm = 1)
216
  ```
217
 
218
- ### Chunking Algorithm
219
- ```javascript
220
- function chunkText(text, maxChars = 200) {
221
- // Split by sentence boundaries
222
- const sentences = text.match(/[^.!?]+[.!?]*/g) || []; // keeps trailing text without end punctuation; empty input yields []
223
-
224
- // Group sentences into chunks ≤ maxChars
225
- const chunks = [];
226
- let currentChunk = "";
227
-
228
- for (const sentence of sentences) {
229
- if ((currentChunk + sentence).length <= maxChars) {
230
- currentChunk += sentence;
231
- } else {
232
- if (currentChunk.trim()) chunks.push(currentChunk.trim()); // skip the empty flush when the first sentence already exceeds maxChars
233
- currentChunk = sentence;
234
- }
235
- }
236
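- if (currentChunk.trim()) chunks.push(currentChunk.trim()); // flush the final chunk, which the loop never pushes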
237
- return chunks;
238
- }
239
  ```
240
 
241
- ### Audio Concatenation
242
- ```javascript
243
- function concatenateAudio(audioArrays, sampleRate) {
244
- // Calculate total length
245
- const totalLength = audioArrays.reduce((sum, arr) =>
246
- sum + arr.length, 0);
247
 
248
- // Merge all chunks
249
- const result = new Float32Array(totalLength);
250
- let offset = 0;
251
 
252
- for (const arr of audioArrays) {
253
- result.set(arr, offset);
254
- offset += arr.length;
255
- }
256
 
257
- return result;
258
- }
259
- ```
260
 
261
- ---
262
 
263
- ## 🌟 Advantages
264
 
265
- ✅ **Privacy-Focused** - All processing in your browser
266
- ✅ **No Server Costs** - No backend infrastructure needed
267
- ✅ **Offline Capable** - Works after initial model download
268
- ✅ **Unlimited Usage** - No API limits or quotas
269
- ✅ **Fast Generation** - Optimized chunking for speed
270
- ✅ **High Quality** - Microsoft SpeechT5 architecture
271
- ✅ **Free & Open** - Apache 2.0 license
272
 
273
- ---
274
 
275
- ## 📝 Limitations
276
 
277
- ⚠️ **Voice Cloning Accuracy** - Simplified algorithm (not production-grade)
278
- ⚠️ **First Load Time** - ~50MB model download
279
- ⚠️ **Browser Only** - Requires modern web browser
280
- ⚠️ **English Optimized** - Best results with English text
281
- ⚠️ **Memory Usage** - Large texts require more RAM
282
 
283
- ---
284
 
285
- ## 🔍 Comparison
286
 
287
- | Feature | This App | Standard SpeechT5 | Cloud TTS APIs |
288
- |---------|----------|-------------------|----------------|
289
- | **Voices** | 7 real + cloning | 1 default | 100+ |
290
- | **Text Length** | Unlimited | Limited | Varies |
291
- | **Voice Cloning** | ✅ Yes | ❌ No | ✅ Yes (paid) |
292
- | **Privacy** | ✅ 100% local | ✅ 100% local | ❌ Cloud |
293
- | **Cost** | Free | Free | Paid |
294
- | **Internet** | First load only | First load only | Always |
295
- | **Chunking** | ✅ Automatic | ❌ Manual | ✅ Handled |
296
 
297
- ---
298
 
299
- ## 🛠️ Development
300
 
301
- ### Project Structure
302
- ```
303
- .
304
- ├── index.html # Main application
305
- ├── assets/
306
- │ └── style.css # Modern UI styling
307
- ├── README.md # This file
308
- └── upload_script.py # Hugging Face upload utility
309
- ```
310
 
311
- ### Technology Stack
312
- - **Frontend:** Vanilla JavaScript (ES6+)
313
- - **ML Framework:** Transformers.js
314
- - **Runtime:** ONNX Runtime (WASM)
315
- - **Audio Processing:** Web Audio API
316
- - **Model:** Xenova/speecht5_tts
317
- - **Embeddings:** CMU ARCTIC x-vectors
318
 
319
- ---
320
 
321
- ## 📄 License
322
 
323
- Apache 2.0 - Free for personal and commercial use
324
 
325
- ---
326
 
327
- ## 🙏 Credits
328
 
329
- - **SpeechT5 Model:** Microsoft Research
330
- - **ONNX Conversion:** Xenova/transformers.js
331
- - **Speaker Dataset:** CMU ARCTIC
332
- - **UI Design:** Modern glassmorphism
333
- - **Voice Cloning:** Web Audio API
334
 
335
- ---
336
 
337
- ## 📚 Resources
338
 
339
- - [Transformers.js Docs](https://huggingface.co/docs/transformers.js)
340
- - [SpeechT5 Paper](https://arxiv.org/abs/2110.07205)
341
- - [CMU ARCTIC Dataset](http://www.festvox.org/cmu_arctic/)
342
- - [Web Audio API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API)
343
 
344
  ---
345
 
346
- **Built with ❤️ using Transformers.js - Bringing AI to the Browser**
1
  ---
2
+ title: Ultimate TTS Studio - 900+ Premium Voices
3
  emoji: 🎙️
4
+ colorFrom: blue
5
  colorTo: purple
6
  sdk: static
7
+ pinned: true
8
  license: apache-2.0
9
  ---
10
 
11
+ # 🎙️ Ultimate TTS Studio
12
 
13
+ **900+ Premium Voices** from 3 World-Class TTS Engines - All Running in Your Browser!
14
 
15
+ ## ✨ Features
16
 
17
+ ### 🎯 3 Premium TTS Engines
18
 
19
+ 1. **🎯 Piper TTS** - 904 voices across 50+ languages
20
+ - High-quality multilingual support
21
+ - Multiple quality levels (High/Medium/Low)
22
+ - 3-5x realtime generation speed
23
 
24
+ 2. **Kokoro TTS** - 21 expressive voices (Highest Quality)
25
+ - 24kHz studio-quality audio
26
+ - American & British accents
27
+ - Most natural & expressive
28
 
29
+ 3. **Kitten TTS** - 8 voices (Fastest)
30
+ - Only 24MB model size
31
+ - Lightning-fast generation
32
+ - Perfect for quick tasks
33
 
34
+ ### 🚀 Key Capabilities
35
 
36
+ - ✅ **900+ Professional Voices** - Choose from massive variety
37
+ - ✅ **50+ Languages** - Speak in any of the 50+ languages Piper supports
38
+ - ✅ **Unlimited Text Length** - Automatic smart chunking
39
+ - ✅ **WebGPU Acceleration** - Hardware-accelerated when available
40
+ - ✅ **Zero Server Cost** - 100% client-side processing
41
+ - ✅ **Offline Capable** - Works once the models are cached
42
+ - ✅ **Privacy First** - No data leaves your browser
43
+ - ✅ **Professional Quality** - Up to 24kHz audio output
44
 
45
+ ## 🎮 How to Use
46
 
47
+ ### 1. Select Your Engine
48
 
49
+ **For Maximum Variety:** Choose **Piper TTS**
50
+ - 904 voices across 50+ languages
51
+ - Select quality level (High/Medium/Low)
52
+ - Pick language and accent
53
 
54
+ **For Best Quality:** Choose **Kokoro TTS**
55
+ - 21 expressive voices
56
+ - Studio-quality 24kHz audio
57
+ - Perfect for audiobooks & narration
58
 
59
+ **For Speed:** Choose **Kitten TTS**
60
+ - 8 fast voices
61
+ - Lightweight model (24MB)
62
+ - Quick generation
63
 
64
+ ### 2. Configure Voice
65
 
66
+ #### Piper Options:
67
+ - **Quality:** High (22kHz) / Medium (16kHz) / Low (Fast)
68
+ - **Languages:** English (US/GB), Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, + 40 more!
69
+ - **Top Voices:** Lessac, Ryan (US) | Cori, Alan (GB)
70
 
71
+ #### Kokoro Options:
72
+ - **American:** Bella, Nicole, Sarah, Sky, Adam, Michael
73
+ - **British:** Emma, Isabella, George, Lewis
74
 
75
+ #### Kitten Options:
76
+ - 8 voices (Voice 0-7) with different characteristics
77
 
78
+ ### 3. Enter Text & Generate
79
 
80
+ 1. Type or paste your text (unlimited length)
81
+ 2. Adjust speed if needed (0.5x - 2.0x)
82
+ 3. Click "🎤 Generate Speech"
83
+ 4. Wait for generation (watch progress bar)
84
+ 5. Play audio or download as WAV
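
Step 5 offers a WAV download. The README does not show how that file is produced, so the following is only a minimal sketch of one common approach: writing mono float samples as 16-bit PCM behind a 44-byte RIFF header. The function name `encodeWav` and its parameters are illustrative assumptions, not the app's actual code.

```javascript
// Hypothetical helper: encode mono Float32 samples in [-1, 1] as 16-bit PCM WAV.
function encodeWav(samples, sampleRate) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true);                // remaining file size
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);                          // fmt chunk length
  view.setUint16(20, 1, true);                           // format 1 = PCM
  view.setUint16(22, 1, true);                           // one channel (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true);              // block align
  view.setUint16(34, 16, true);                          // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1] so out-of-range values do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

In a browser, the returned buffer could be wrapped as `new Blob([buffer], { type: "audio/wav" })` and offered through an object URL. The sample rate passed in should match the engine's output, for example 24000 for the 24kHz audio the README lists for Kokoro.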
85
 
86
+ ## 🌐 Supported Languages
87
 
88
+ ### Piper TTS - 50+ Languages:
89
 
90
+ **Major Languages:**
91
+ - 🇺🇸 English (US) - 20+ voices
92
+ - 🇬🇧 English (UK) - 15+ voices
93
+ - 🇪🇸 Spanish - 30+ voices
94
+ - 🇫🇷 French - 25+ voices
95
+ - 🇩🇪 German - 20+ voices
96
+ - 🇮🇹 Italian - 15+ voices
97
+ - 🇵🇹 Portuguese - 10+ voices
98
+ - 🇨🇳 Chinese - 10+ voices
99
+ - 🇯🇵 Japanese - 5+ voices
100
+ - 🇰🇷 Korean - 5+ voices
101
 
102
+ Plus: Dutch, Russian, Polish, Turkish, Arabic, Hindi, Vietnamese, Thai, and many more!
103
 
104
+ ## 📊 Engine Comparison
105
 
106
+ | Feature | Piper | Kokoro | Kitten |
107
+ |---------|-------|--------|--------|
108
+ | **Voices** | 904 | 21 | 8 |
109
+ | **Quality** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
110
+ | **Speed** | Medium | Medium | Fast |
111
+ | **Model Size** | ~50MB | ~80MB | ~24MB |
112
+ | **Languages** | 50+ | English | English |
113
+ | **Sample Rate** | 16-22kHz | 24kHz | 16kHz |
114
+ | **Best For** | Variety | Quality | Speed |
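
The table above can also be read as a small lookup structure. The sketch below copies the voice counts, model sizes, and "Best For" values from the table; the object shape and the `pickEngine` helper are hypothetical and only illustrate how the "Select Your Engine" guidance might be expressed in code.

```javascript
// Figures copied from the comparison table; the object layout is illustrative.
const ENGINES = {
  piper:  { voices: 904, modelSizeMB: 50, multilingual: true,  bestFor: "variety" },
  kokoro: { voices: 21,  modelSizeMB: 80, multilingual: false, bestFor: "quality" },
  kitten: { voices: 8,   modelSizeMB: 24, multilingual: false, bestFor: "speed" },
};

// Hypothetical helper: non-English text goes to Piper (the only multilingual
// engine in the table); otherwise pick the engine whose "Best For" matches.
function pickEngine({ language = "en", priority = "quality" } = {}) {
  if (!language.toLowerCase().startsWith("en")) return "piper";
  const match = Object.keys(ENGINES).find(
    (name) => ENGINES[name].bestFor === priority
  );
  return match ?? "piper"; // unknown priority: fall back to the widest catalog
}
```

For example, `pickEngine({ language: "es_ES", priority: "speed" })` resolves to Piper because the table lists Kokoro and Kitten as English-only.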
115
 
116
  ## 🎯 Use Cases
117
 
118
+ ### Content Creation
119
+ - 🎬 Video voiceovers & narration
120
+ - 📚 Audiobook production
121
+ - 🎙️ Podcast intros/outros
122
+ - 📺 YouTube tutorials
123
 
124
  ### Accessibility
125
+ - 👁️ Screen reader alternatives
126
+ - 📖 Reading assistance
127
+ - 🌍 Language learning
128
+ - 📱 Audio content for visually impaired users
129
 
130
+ ### Development
131
+ - 🤖 Voice UI prototyping
132
+ - 🎮 Game character voices
133
+ - 📞 IVR system testing
134
+ - 💬 Chatbot voice responses
135
 
136
  ## 🔧 Technical Details
137
 
138
+ ### Technology Stack
139
+ - **Frontend:** Pure HTML5 + JavaScript (ES6+)
140
+ - **TTS Library:** onnx-tts-web
141
+ - **Runtime:** ONNX Runtime Web
142
+ - **Acceleration:** WebGPU / WebAssembly
143
+ - **Audio:** Web Audio API
144
+
145
+ ### Model Sources
146
+ - **Piper:** [rhasspy/piper-voices](https://huggingface.co/rhasspy/piper-voices)
147
+ - **Kokoro:** [therealtimex/kokoro-tts-web](https://huggingface.co/therealtimex/kokoro-tts-web)
148
+ - **Kitten:** [therealtimex/kitten-tts-web](https://huggingface.co/therealtimex/kitten-tts-web)
149
+
150
+ ### Browser Requirements
151
+ - **Minimum:** Chrome 90+ / Firefox 88+ / Safari 14+ / Edge 90+
152
+ - **Recommended:** Latest Chrome/Edge with WebGPU enabled
153
+ - **Features Required:** WebAssembly, Web Audio API
154
+ - **Optional:** WebGPU for acceleration
155
+
156
+ ### Performance
157
+ - **Model Loading:** 5-15 seconds (first time only, then cached)
158
+ - **Generation Speed:** 2-5 seconds per 200 characters
159
+ - **Real-time Factor:** 3-10x (depending on hardware & engine)
160
+ - **Memory Usage:** ~200-500MB (with models loaded)
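
The README does not define "Real-time Factor". Assuming it means seconds of audio produced per second of compute (so higher is faster), a figure like the one above could be measured with a helper along these lines. The function name and parameters are hypothetical.

```javascript
// Hypothetical helper: seconds of audio generated per second of wall-clock time.
// Values above 1 mean generation is faster than playback.
function realTimeFactor(sampleCount, sampleRate, elapsedMs) {
  if (elapsedMs <= 0) throw new RangeError("elapsedMs must be positive");
  const audioSeconds = sampleCount / sampleRate;
  return audioSeconds / (elapsedMs / 1000);
}

// 240,000 samples at 24 kHz is 10 s of audio; generated in 2.5 s, that is 4x.
const factor = realTimeFactor(240000, 24000, 2500);
```

In practice `elapsedMs` would come from `performance.now()` readings taken before and after a generation call.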
161
+
162
+ ## 💡 Performance Tips
163
+
164
+ ### For Best Quality:
165
+ 1. Use **Kokoro TTS** for English content
166
+ 2. Select **High Quality** in Piper settings
167
+ 3. Use well-punctuated text
168
+ 4. Keep sentences to a moderate length
169
+
170
+ ### For Best Speed:
171
+ 1. Use **Kitten TTS** for quick tasks
172
+ 2. Select **Low Quality** in Piper
173
+ 3. Enable WebGPU in browser settings
174
+ 4. Use shorter text inputs
175
+
176
+ ### For Most Options:
177
+ 1. Use **Piper TTS** for language variety
178
+ 2. Explore different accents/regions
179
+ 3. Compare quality levels
180
+ 4. Try multiple voices for same language
181
+
182
+ ## 🎬 Quick Start Examples
183
+
184
+ ### Example 1: Professional Audiobook
185
+ ```
186
+ Engine: Kokoro TTS
187
+ Voice: Bella (American Female)
188
+ Speed: 0.95x
189
+ Quality: 24kHz
190
+ Text: Your book chapter...
191
+ ```
192
+
193
+ ### Example 2: Tutorial Narration
194
+ ```
195
+ Engine: Piper TTS
196
+ Voice: Lessac (US, High Quality)
197
+ Speed: 1.0x
198
+ Quality: 22kHz
199
+ Text: Your tutorial script...
200
  ```
201
 
202
+ ### Example 3: Quick Announcement
203
+ ```
204
+ Engine: Kitten TTS
205
+ Voice: Voice 4 (Clear)
206
+ Speed: 1.1x
207
+ Text: Your announcement...
208
  ```
209
 
210
+ ### Example 4: Spanish Content
211
+ ```
212
+ Engine: Piper TTS
213
+ Voice: es_ES (Spain Spanish)
214
+ Speed: 1.0x
215
+ Quality: High
216
+ Text: Su texto en español...
217
+ ```
218
 
219
+ ## 🐛 Troubleshooting
220
 
221
+ ### Model Loading Issues
222
+ **Problem:** "ERROR initializing" message
223
 
224
+ **Solutions:**
225
+ - Check internet connection
226
+ - Wait for download to complete
227
+ - Try different quality level
228
+ - Clear browser cache
229
+ - Refresh page
230
 
231
+ ### No Audio Output
232
+ **Problem:** Player appears but no sound
233
 
234
+ **Solutions:**
235
+ - Check browser audio permissions
236
+ - Verify volume settings
237
+ - Try different voice/engine
238
+ - Check browser console (F12)
239
+ - Test with different browser
240
 
241
+ ### Slow Performance
242
+ **Problem:** Generation takes too long
243
 
244
+ **Solutions:**
245
+ - Switch to Kitten TTS for speed
246
+ - Lower quality in Piper settings
247
+ - Enable WebGPU (`chrome://flags`)
248
+ - Update browser to latest version
249
+ - Close other tabs/applications
250
 
251
+ ### WebGPU Not Available
252
+ **Problem:** Shows "WASM" instead of "WebGPU"
253
 
254
+ **Solutions:**
255
+ - Update browser to latest version
256
+ - Enable in `chrome://flags` → "WebGPU"
257
+ - Check GPU driver updates
258
+ - WebGPU is optional; WASM works fine
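
How the app decides between "WebGPU" and "WASM" is not shown in this README. A minimal sketch of the usual feature check, assuming the decision is based on the standard `navigator.gpu.requestAdapter()` call, might look like this. The `pickBackend` name and its return strings are hypothetical.

```javascript
// Hypothetical helper: report "webgpu" only when the browser exposes WebGPU
// and actually returns an adapter; otherwise fall back to "wasm".
async function pickBackend(nav = globalThis.navigator) {
  if (!nav || !nav.gpu) return "wasm"; // API not exposed by this browser
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // null means no usable GPU adapter
  } catch {
    return "wasm"; // treat any failure as "not available"
  }
}
```

Passing the navigator object in as a parameter keeps the check easy to exercise outside a browser. Running `await pickBackend()` in the console (F12) is one way to see which branch a given browser takes.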
259
 
260
+ ## 🎯 Voice Recommendations
261
 
262
+ ### English (US) - Natural:
263
+ - **Lessac** (Piper) - Professional, clear
264
+ - **Ryan** (Piper) - Authoritative, deep
265
+ - **Bella** (Kokoro) - Elegant, sophisticated
266
 
267
+ ### English (GB) - British:
268
+ - **Cori** (Piper) - Refined, professional
269
+ - **Emma** (Kokoro) - Elegant, polished
270
+ - **George** (Kokoro) - Commanding, distinguished
271
 
272
+ ### Spanish:
273
+ - **es_ES** (Piper) - Spain Spanish, multiple voices
274
+ - **es_MX** (Piper) - Mexican Spanish
275
 
276
+ ### French:
277
+ - **fr_FR** (Piper) - France French, multiple voices
278
 
279
+ ### German:
280
+ - **de_DE** (Piper) - German, multiple voices
281
 
282
+ ## 📝 Privacy & Security
283
 
284
+ ✅ **100% Client-Side** - All processing in your browser
285
+ ✅ **No Server Upload** - Text never leaves your device
286
+ ✅ **No Data Collection** - Zero analytics or tracking
287
+ ✅ **No Account Required** - Use instantly, no signup
288
+ ✅ **Offline Capable** - Works without internet once the models are cached
289
 
290
+ ## 📜 License & Credits
291
 
292
+ ### License
293
+ This project is released under the **Apache 2.0 License**.
294
 
295
+ ### Credits & Acknowledgments
296
 
297
+ **Libraries & Tools:**
298
+ - [onnx-tts-web](https://github.com/therealtimex/onnx-tts-web) by @therealtimex
299
+ - [Piper TTS](https://github.com/rhasspy/piper) by Rhasspy
300
+ - [ONNX Runtime](https://onnxruntime.ai/) by Microsoft
301
 
302
+ **Models:**
303
+ - Piper TTS models by Rhasspy team
304
+ - Kokoro TTS by community contributors
305
+ - Kitten TTS by community contributors
306
 
307
+ **Inspiration:**
308
+ - [TTS Studio](https://github.com/clowerweb/tts-studio) by @clowerweb
309
 
310
+ ## 🚀 Future Enhancements
311
 
312
+ Planned features:
313
+ - [ ] More TTS engines (Coqui, VITS)
314
+ - [ ] Voice cloning with SpeechT5
315
+ - [ ] SSML markup support
316
+ - [ ] Batch processing
317
+ - [ ] MP3/OGG export
318
+ - [ ] Voice mixing/blending
319
+ - [ ] Real-time streaming
320
+ - [ ] Pronunciation dictionary
321
+
322
+ ## 🤝 Contributing
323
+
324
+ Found a bug or have a suggestion? Please open an issue or submit a pull request!
325
+
326
+ ## 🌟 Star This Space!
327
+
328
+ If you find this useful, please give it a ⭐ on Hugging Face!
329
 
330
  ---
331
 
332
+ **Made with ❤️ for the open-source community**
333
+
334
+ **Enjoy creating amazing voice content! 🎙️**