metadata
title: Valtec Zero-Shot Vietnamese Voice Cloning
emoji: 🎙️
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 4.36.0
python_version: '3.10'
app_file: app_zeroshot.py
pinned: false
license: cc-by-nc-4.0
Valtec Zero-Shot Vietnamese Voice Cloning
🎙️ The lightest Vietnamese zero-shot voice cloning model — only 74.8M parameters, runs entirely on CPU.
Clone any voice from just 3-10 seconds of reference audio. No fine-tuning. No GPU required.
Features
- Ultra-lightweight: 74.8M params — the lightest Vietnamese voice cloning model
- CPU-friendly: 2.1x to 4.2x faster than realtime on CPU alone, depending on input length
- Zero-shot: Clone any voice from a short audio clip
- 6 Built-in Voices: Thu Hà, Minh Đức, Thanh Tâm, Quang Huy, Ngọc Ánh, Hoàng Nam
- Custom Upload: Upload your own reference audio to clone any voice
Usage
1. Enter Vietnamese text
2. Select a reference voice or upload your own audio (3-10 seconds)
3. Click "Clone Voice"
4. Listen to the cloned output
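Before uploading your own clip, it can help to check that it falls inside the 3-10 second window mentioned above. The sketch below is a minimal pre-check using Python's standard `wave` module. It assumes the reference audio is an uncompressed PCM WAV file, and the helper names `wav_duration_seconds` and `is_valid_reference` are illustrative only, not part of the Space's code.

```python
import wave

MIN_SECONDS = 3.0   # lower bound from the usage steps
MAX_SECONDS = 10.0  # upper bound from the usage steps


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_valid_reference(path):
    """Return True if the clip lies inside the recommended 3-10 second window."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

Other formats such as MP3 or FLAC would need a different reader, since `wave` only handles WAV files.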
Model Specs
| Component | Parameters |
|---|---|
| Synthesizer | 56.45M |
| Speaker Encoder | 8.03M |
| Style Encoder | 7.80M |
| Prosody Predictor | 2.52M |
| Total | 74.80M |
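As a quick sanity check, the component counts in the table can be summed to confirm the stated total and to see each component's share. The snippet also estimates a weight footprint at 4 bytes per parameter (float32). That precision is an assumption made here for illustration; the table does not state how the weights are stored.

```python
# Component sizes in millions of parameters, copied from the Model Specs table.
PARAMS_M = {
    "Synthesizer": 56.45,
    "Speaker Encoder": 8.03,
    "Style Encoder": 7.80,
    "Prosody Predictor": 2.52,
}

# Round to two decimals to match the precision of the table.
total_m = round(sum(PARAMS_M.values()), 2)

# Assumption: float32 weights, i.e. 4 bytes per parameter (not stated in the source).
approx_size_mb = total_m * 4

for name, size_m in PARAMS_M.items():
    print(f"{name}: {size_m:.2f}M ({size_m / total_m:.1%} of total)")
print(f"Total: {total_m:.2f}M parameters, ~{approx_size_mb:.0f} MB as float32")
```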
Performance (CPU)
| Input Length | RTF | Speed |
|---|---|---|
| Short | 0.475 | 2.1x realtime |
| Medium | 0.286 | 3.5x realtime |
| Long | 0.236 | 4.2x realtime |
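The Speed column is consistent with the usual reading of real-time factor (RTF), where RTF is synthesis time divided by the duration of the generated audio, so speed is roughly 1 / RTF. The sketch below reproduces that conversion and estimates how long a given amount of audio would take to generate. The function names are illustrative, and actual timings will vary with the CPU used.

```python
def speedup_from_rtf(rtf):
    """Convert a real-time factor into a realtime speed multiple (1 / RTF)."""
    if rtf <= 0:
        raise ValueError("RTF must be positive")
    return 1.0 / rtf


def synthesis_seconds(audio_seconds, rtf):
    """Estimate the compute time needed to generate audio_seconds of speech."""
    return audio_seconds * rtf


# RTF values copied from the Performance (CPU) table.
CPU_RTF = {"Short": 0.475, "Medium": 0.286, "Long": 0.236}

for label, rtf in CPU_RTF.items():
    print(
        f"{label}: {speedup_from_rtf(rtf):.1f}x realtime, "
        f"~{synthesis_seconds(10, rtf):.2f} s to generate 10 s of audio"
    )
```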
License
CC BY-NC 4.0 — Non-commercial use only.
Powered by Valtec AI Team