Upload folder using huggingface_hub
- README.md +40 -0
- vieneu_decoder.onnx +3 -0
- vieneu_decoder_int8.onnx +3 -0
- vieneu_encoder.onnx +3 -0
README.md
ADDED
@@ -0,0 +1,40 @@
---
license: apache-2.0
language:
- vi
---

# VieNeu-Codec: The Heart of VieNeu-TTS v2

**VieNeu-Codec** is the audio engine built specifically for the upcoming **VieNeu-TTS v2**. It is a neural audio codec trained on over **20,000 hours** of diverse Vietnamese and English speech data, targeting robustness, natural prosody, and clear audio reconstruction.

This repository provides ONNX versions of VieNeu-Codec optimized for production use.

## 🚀 Key Features

- **24kHz High-Fidelity**: Crystal clear audio reconstruction optimized for the Vietnamese language.
- **Zero-Shot Voice Cloning**: Clone any voice with just 5 seconds of reference audio.
- **Optimized for VieNeu-TTS v2**: Seamlessly integrates with the next-generation LLM backbone of VieNeu-TTS.
- **Two Deployment Modes**: Includes both FP32 (High Quality) and INT8 (High Speed) decoders.

## 📦 Model Components

- **`vieneu_encoder.onnx`**: Encoder model, also included in this upload.
- **`vieneu_decoder.onnx`**: (FP32) High-fidelity audio decoder for maximum quality.
- **`vieneu_decoder_int8.onnx`**: (INT8) Quantized decoder for fast CPU inference.
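
As a minimal sketch of choosing between the two decoders listed above, the helper below maps a deployment mode to a file name. The function name `decoder_filename` and the mode strings `"quality"` and `"speed"` are hypothetical and introduced only for illustration; only the two file names come from this repository.

```python
# Hypothetical helper; only the two ONNX file names come from this repository.
DECODERS = {
    "quality": "vieneu_decoder.onnx",      # FP32 decoder
    "speed": "vieneu_decoder_int8.onnx",   # INT8 quantized decoder
}


def decoder_filename(mode: str = "quality") -> str:
    """Return the decoder file name for the given deployment mode."""
    try:
        return DECODERS[mode]
    except KeyError:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {sorted(DECODERS)}"
        ) from None
```

The returned name can be passed to `ort.InferenceSession(...)` in place of the hard-coded path, so switching between FP32 and INT8 becomes a one-argument change.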

## 🛠️ Usage

### Synthesize Speech
Combine the speaker embedding with content tokens from your LLM (VieNeu-TTS v2):
```python
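import onnxruntime as ort

# `ids` (content tokens from the LLM) and `embedding` (speaker embedding)
# must already exist as arrays matching the model's input types and shapes.
sess_dec = ort.InferenceSession("vieneu_decoder.onnx")
audio = sess_dec.run(None, {
    "content_ids": ids,
    "voice": embedding,
})[0]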
```
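
To listen to the result, the decoded samples still have to be written to an audio file. The sketch below does this with the Python standard library only. It assumes the decoder output is a single mono waveform of float samples in [-1.0, 1.0]; that layout is an assumption, not something this README states. The 24 kHz rate comes from the Key Features list, and `save_wav` is a hypothetical helper name.

```python
import struct
import wave

SAMPLE_RATE = 24_000  # 24 kHz, per the Key Features list


def save_wav(samples, path, sample_rate=SAMPLE_RATE):
    """Write mono float samples (assumed in [-1.0, 1.0]) as 16-bit PCM WAV."""
    pcm = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))            # clip out-of-range values
        pcm += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)             # mono (assumption)
        wf.setsampwidth(2)             # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(pcm))
```

With the `audio` array from the snippet above, a call might look like `save_wav(audio.reshape(-1), "out.wav")`, provided the output really is one mono waveform.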

## 📄 License & Attribution

- License: **Apache-2.0**
- Author: **Pham Nguyen Ngoc Bao**
- Project: **VieNeu-Codec (for VieNeu-TTS v2)**
- Version: 2.0
vieneu_decoder.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2252107f20222cd321154db429a0eb3f81e4e82b7a8bcb8872adb157ed3605d2
size 345442987
vieneu_decoder_int8.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0e99548f8d4c219c4771807258bb7d0c9536bcfc1581902580d8640012178889
size 111523312
vieneu_encoder.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ed11494fa09427bf4ce92652c8e35ece9dd9d45042a99e0cf6ea41fd9cf7e86f
size 117338685
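
Each Git LFS pointer above records the SHA-256 digest and byte size of the real model file, so a download can be checked against them. The following is a minimal sketch using only the standard library; the digests and sizes are copied from the pointers in this commit, while the names `EXPECTED`, `sha256_of`, and `matches_pointer` are hypothetical.

```python
import hashlib
import os

# (sha256 hex digest, size in bytes), copied from the LFS pointers in this commit.
EXPECTED = {
    "vieneu_decoder.onnx": (
        "2252107f20222cd321154db429a0eb3f81e4e82b7a8bcb8872adb157ed3605d2",
        345442987,
    ),
    "vieneu_decoder_int8.onnx": (
        "0e99548f8d4c219c4771807258bb7d0c9536bcfc1581902580d8640012178889",
        111523312,
    ),
    "vieneu_encoder.onnx": (
        "ed11494fa09427bf4ce92652c8e35ece9dd9d45042a99e0cf6ea41fd9cf7e86f",
        117338685,
    ),
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large models need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, expected_sha256, expected_size):
    """True if the file's size and SHA-256 both match the pointer values."""
    if os.path.getsize(path) != expected_size:
        return False  # cheap check first; skips hashing on a size mismatch
    return sha256_of(path) == expected_sha256
```

For a file downloaded to the working directory, `matches_pointer(name, *EXPECTED[name])` returns `False` for a truncated or corrupted download.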