Commit 48cee3a (verified) · valtecAI-team · 1 parent: b556535

Upload README.md with huggingface_hub

  1. README.md +89 -10
README.md CHANGED
@@ -1,10 +1,89 @@
- ---
- title: Valtec Vietnamese Tts Web
- emoji: 🏃
- colorFrom: purple
- colorTo: pink
- sdk: static
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ---
+ title: Valtec Vietnamese TTS Web Demo
+ emoji: 🌐
+ colorFrom: blue
+ colorTo: purple
+ sdk: static
+ pinned: false
+ license: mit
+ ---
+
+ # Valtec Vietnamese TTS - Browser Demo
+
+ 🌐 **Vietnamese Text-to-Speech Running Entirely in Your Browser**
+
+ This demo uses ONNX Runtime Web to run Vietnamese TTS entirely in your browser, with no inference server required.
+
+ ## Features
+
+ - ✅ **100% Browser-Based**: All processing happens in your browser
+ - ✅ **No Backend**: Direct ONNX model inference using WebAssembly
+ - ✅ **5 Vietnamese Voices**: NF, SF, NM1, SM, NM2 (Northern/Southern accents)
+ - ✅ **Fast Loading**: Models (~165 MB) are cached after the first download
+ - ✅ **Privacy-First**: Your text never leaves your browser
+
+ ## How It Works
+
+ 1. **First Load**: Downloads the ONNX models from the Hugging Face Hub (~165 MB)
+ 2. **Text Input**: Enter any Vietnamese text
+ 3. **Voice Selection**: Choose one of the 5 regional voices
+ 4. **Real-Time Synthesis**: ONNX Runtime Web generates the audio in the browser
+ 5. **Instant Playback**: Listen to the synthesized speech
+
+ ## Available Voices
+
+ | Voice | Region | Gender | Description |
+ |-------|--------|--------|-------------|
+ | **NF** | Northern (Bắc) | Female | Clear, formal |
+ | **SF** | Southern (Nam) | Female | Warm, friendly |
+ | **NM1** | Northern (Bắc) | Male | Professional |
+ | **SM** | Southern (Nam) | Male | Conversational |
+ | **NM2** | Northern (Bắc) | Male | Authoritative |
+
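The voice table above maps naturally onto a small typed lookup. The following is a minimal sketch, not the demo's actual code: the `Voice` interface, the `VOICES` constant, and the `findVoices` helper are hypothetical names introduced here, and only the table values come from the README.

```typescript
// Voice metadata copied from the table above.
// Type and helper names are illustrative assumptions, not the demo's API.
type Region = "Northern" | "Southern";
type Gender = "Female" | "Male";

interface Voice {
  code: string;
  region: Region;
  gender: Gender;
  description: string;
}

const VOICES: readonly Voice[] = [
  { code: "NF", region: "Northern", gender: "Female", description: "Clear, formal" },
  { code: "SF", region: "Southern", gender: "Female", description: "Warm, friendly" },
  { code: "NM1", region: "Northern", gender: "Male", description: "Professional" },
  { code: "SM", region: "Southern", gender: "Male", description: "Conversational" },
  { code: "NM2", region: "Northern", gender: "Male", description: "Authoritative" },
];

// Return the voices matching the given filters; an omitted filter matches everything.
function findVoices(region?: Region, gender?: Gender): Voice[] {
  return VOICES.filter(
    (v) =>
      (region === undefined || v.region === region) &&
      (gender === undefined || v.gender === gender),
  );
}
```

For example, `findVoices("Northern", "Male")` selects the two Northern male voices in table order. How each voice code maps to a speaker index inside the model is not stated in the README, so it is left out here.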
+ ## Technical Details
+
+ ### ONNX Pipeline
+ - **Text Encoder**: Phoneme encoding
+ - **Duration Predictor**: Speech timing
+ - **Flow Model**: Latent transformation
+ - **Decoder**: Audio waveform generation (HiFi-GAN)
+
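To make the role of the duration predictor concrete: in VITS-style models, the per-phoneme durations decide how many frames each phoneme occupies before the flow and decoder stages run. The sketch below shows that expansion step in isolation. It is an illustration under stated assumptions, not the demo's code: the function names are hypothetical, durations are assumed to be given in frames and rounded up, and the hop length passed to `framesToSeconds` is a placeholder because the README does not state it. In the actual demo this step may well live inside the exported ONNX graphs.

```typescript
// Length regulation: repeat each phoneme-level vector durations[i] times.
// Assumes durations are in frames; fractional values are rounded up.
function expandByDuration(phonemeVectors: number[][], durations: number[]): number[][] {
  if (phonemeVectors.length !== durations.length) {
    throw new Error("one duration per phoneme is required");
  }
  const frames: number[][] = [];
  phonemeVectors.forEach((vec, i) => {
    const count = Math.max(0, Math.ceil(durations[i] ?? 0));
    for (let k = 0; k < count; k++) {
      frames.push(vec);
    }
  });
  return frames;
}

// Audio length implied by a frame count.
// hopLength (samples per frame) is an assumption supplied by the caller;
// the 24 kHz default matches the sample rate listed under Model Info.
function framesToSeconds(frameCount: number, hopLength: number, sampleRate = 24000): number {
  return (frameCount * hopLength) / sampleRate;
}
```

With a hypothetical hop length of 256 samples, `framesToSeconds(375, 256)` gives 4 seconds of audio at 24 kHz.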
+ ### Vietnamese G2P
+ - Uses a JavaScript port of the viphoneme library
+ - Accurate tone and phoneme mapping
+ - 99.96% agreement with the Python reference implementation
+
+ ### Browser Requirements
+ - Chrome 90+, Firefox 90+, Edge 90+ (Full support)
+ - Safari 15+ (Limited support)
+ - WebAssembly and AudioContext API required
+
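The two required APIs listed above can be checked before any model is downloaded. This is a minimal sketch of such a check, assuming it is enough to test that the globals exist; `REQUIRED_APIS` and `missingApis` are hypothetical names, and the demo may perform its checks differently.

```typescript
// APIs the README lists as required.
const REQUIRED_APIS = ["WebAssembly", "AudioContext"] as const;

// Return the names of required APIs that are absent from `env`.
// In a browser, pass the global object (globalThis).
function missingApis(env: object): string[] {
  const lookup = env as Record<string, unknown>;
  return REQUIRED_APIS.filter((name) => typeof lookup[name] === "undefined");
}
```

In a page script, `missingApis(globalThis)` returning an empty array means both APIs are present. Taking the environment as a parameter keeps the function testable outside a browser.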
+ ## Model Info
+
+ - **Architecture**: VITS (Conditional VAE)
+ - **Sample Rate**: 24 kHz
+ - **Model Size**: 164.75 MB (ONNX)
+ - **Speakers**: 5 (Northern/Southern Vietnamese)
+
+ ## Performance
+
+ The first load may take 30-60 seconds while the models download. On subsequent visits the models load from the browser cache, so no download is needed.
+
+ Synthesis speed depends on the device:
+ - Desktop: ~5-8 seconds per sentence
+ - Mobile: ~10-15 seconds per sentence
+
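The first-load estimate follows from the model size and the connection speed. The sketch below works through that arithmetic under simplifying assumptions: 1 MB is taken as 10^6 bytes, and protocol overhead, latency, and server-side throttling are ignored. The function names are hypothetical.

```typescript
// Model size from the Model Info section.
const MODEL_SIZE_MB = 164.75;

// Estimated seconds to download `sizeMB` megabytes at `mbps` megabits per second.
// Assumes 1 MB = 10^6 bytes and ignores protocol overhead and latency.
function downloadSeconds(sizeMB: number, mbps: number): number {
  if (mbps <= 0) {
    throw new Error("bandwidth must be positive");
  }
  return (sizeMB * 8) / mbps;
}

// Sustained bandwidth (Mbit/s) needed to finish the download within `seconds`.
function requiredMbps(sizeMB: number, seconds: number): number {
  if (seconds <= 0) {
    throw new Error("target time must be positive");
  }
  return (sizeMB * 8) / seconds;
}
```

Under these assumptions, 164.75 MB is 1318 Mbit, so the quoted 30-60 second range corresponds to a sustained connection of roughly 22 to 44 Mbit/s. Slower connections will take proportionally longer.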
+ ## Links
+
+ - 🎤 [Gradio Demo](https://huggingface.co/spaces/valtecAI-team/valtec-vietnamese-tts) - Full-featured demo
+ - 📦 [ONNX Models](https://huggingface.co/valtecAI-team/valtec-tts-onnx) - Pre-trained models
+ - 🏠 [GitHub](https://github.com/valtecAI-team/valtec-tts) - Source code
+ - 📱 [Android App](https://github.com/valtecAI-team/valtec-tts/tree/main/deployments/android) - Mobile deployment
+
+ ## Privacy
+
+ All processing happens locally in your browser. Your text input and the generated audio never leave your device.
+
+ ---
+
+ **Powered by Valtec AI Team** | ONNX Runtime Web | WebAssembly