Spaces: valtecAI-team/valtec-vietnamese-tts-web

valtecAI-team committed on Dec 30, 2025

Commit 48cee3a (verified) · 1 Parent(s): b556535

Upload README.md with huggingface_hub

Files changed (1)

README.md +89 -10

@@ -1,10 +1,89 @@
----
-title: Valtec Vietnamese Tts Web
-emoji: 🏃
-colorFrom: purple
-colorTo: pink
-sdk: static
-pinned: false
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+---
+title: Valtec Vietnamese TTS Web Demo
+emoji: 🌐
+colorFrom: blue
+colorTo: purple
+sdk: static
+pinned: false
+license: mit
+---
+# Valtec Vietnamese TTS - Browser Demo
+🌐 **Vietnamese Text-to-Speech Running Entirely in Your Browser**
+This demo uses ONNX Runtime Web to synthesize Vietnamese speech entirely client-side; no inference server is required.
+## Features
+- ✅ **100% Browser-Based**: All processing happens in your browser
+- ✅ **No Backend**: Direct ONNX model inference using WebAssembly
+- ✅ **5 Vietnamese Voices**: NF, SF, NM1, SM, NM2 (Northern/Southern accents)
+- ✅ **Fast Loading**: Models (~165 MB) are cached after the first load
+- ✅ **Privacy-First**: Your text never leaves your browser
+## How It Works
+1. **First Load**: Downloads the ONNX models from the Hugging Face Hub (~165 MB)
+2. **Text Input**: Enter any Vietnamese text
+3. **Voice Selection**: Choose from 5 regional voices
+4. **In-Browser Synthesis**: ONNX Runtime Web generates the audio in the browser
+5. **Instant Playback**: Listen to synthesized speech
+## Available Voices
+| Voice | Region | Gender | Description |
+|-------|--------|--------|-------------|
+| **NF** | Northern (Bắc) | Female | Clear, formal |
+| **SF** | Southern (Nam) | Female | Warm, friendly |
+| **NM1** | Northern (Bắc) | Male | Professional |
+| **SM** | Southern (Nam) | Male | Conversational |
+| **NM2** | Northern (Bắc) | Male | Authoritative |
+## Technical Details
+### ONNX Pipeline
+- **Text Encoder**: Phoneme encoding
+- **Duration Predictor**: Speech timing
+- **Flow Model**: Latent transformation
+- **Decoder**: Audio waveform generation (HiFi-GAN)
+### Vietnamese G2P
+- Uses a JavaScript port of the viphoneme library
+- Accurate tone and phoneme mapping
+- 99.96% accuracy against the Python reference implementation
+### Browser Requirements
+- Chrome 90+, Firefox 90+, Edge 90+ (Full support)
+- Safari 15+ (Limited support)
+- WebAssembly and AudioContext API required
+## Model Info
+- **Architecture**: VITS (Conditional VAE)
+- **Sample Rate**: 24 kHz
+- **Model Size**: 164.75 MB (ONNX)
+- **Speakers**: 5 (Northern/Southern Vietnamese)
+## Performance
+The first load may take 30-60 seconds to download the models. Subsequent visits load them from the browser cache and skip the download.
+Synthesis speed depends on device:
+- Desktop: ~5-8 seconds per sentence
+- Mobile: ~10-15 seconds per sentence
+## Links
+- 🎤 [Gradio Demo](https://huggingface.co/spaces/valtecAI-team/valtec-vietnamese-tts) - Full featured demo
+- 📦 [ONNX Models](https://huggingface.co/valtecAI-team/valtec-tts-onnx) - Pre-trained models
+- 🏠 [GitHub](https://github.com/valtecAI-team/valtec-tts) - Source code
+- 📱 [Android App](https://github.com/valtecAI-team/valtec-tts/tree/main/deployments/android) - Mobile deployment
+## Privacy
+All processing happens locally in your browser. No data is sent to any server. Your text input and generated audio never leave your device.
+---
+**Powered by Valtec AI Team** | ONNX Runtime Web | WebAssembly
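
The README added in this commit says the decoder produces an audio waveform at a 24 kHz sample rate that is then played back in the browser. The following is a minimal sketch of one way such output could be made playable, assuming the decoder yields mono samples as a `Float32Array` in the range [-1, 1]. The helper name `encodeWav` is hypothetical and is not taken from the demo's source; the demo may instead feed samples directly to the AudioContext API.

```typescript
// Hypothetical helper (not from the demo's source): wraps mono float samples
// in a 16-bit PCM WAV container. The 24000 Hz default mirrors the sample rate
// stated in the README; the mono/float input layout is an assumption.
function encodeWav(samples: Float32Array, sampleRate: number = 24000): ArrayBuffer {
  const bytesPerSample = 2; // 16-bit PCM
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize); // 44-byte header + sample data
  const view = new DataView(buffer);

  // Write an ASCII tag such as "RIFF" byte by byte at the given offset.
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, 1, true); // channel count: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Clamp each sample to [-1, 1] and scale it to a signed 16-bit integer.
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * bytesPerSample, Math.round(clamped * 32767), true);
  }
  return buffer;
}
```

In a browser, the returned buffer could be wrapped as `new Blob([encodeWav(pcm)], { type: "audio/wav" })` and passed to `URL.createObjectURL` to obtain a source for an `<audio>` element, where `pcm` stands for whatever array the decoder step returns.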