Instructions to use thanhtantran/VieNeu-TTS with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use thanhtantran/VieNeu-TTS with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama
llm = Llama.from_pretrained(
    repo_id="thanhtantran/VieNeu-TTS",
    filename="VieNeu-TTS-q4_0.gguf",
)
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)
print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use thanhtantran/VieNeu-TTS with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf thanhtantran/VieNeu-TTS:Q4_0
# Run inference directly in the terminal:
llama-cli -hf thanhtantran/VieNeu-TTS:Q4_0
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf thanhtantran/VieNeu-TTS:Q4_0
# Run inference directly in the terminal:
llama-cli -hf thanhtantran/VieNeu-TTS:Q4_0
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf thanhtantran/VieNeu-TTS:Q4_0
# Run inference directly in the terminal:
./llama-cli -hf thanhtantran/VieNeu-TTS:Q4_0
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf thanhtantran/VieNeu-TTS:Q4_0
# Run inference directly in the terminal:
./build/bin/llama-cli -hf thanhtantran/VieNeu-TTS:Q4_0
Use Docker
docker model run hf.co/thanhtantran/VieNeu-TTS:Q4_0
- LM Studio
- Jan
- Ollama
How to use thanhtantran/VieNeu-TTS with Ollama:
ollama run hf.co/thanhtantran/VieNeu-TTS:Q4_0
- Unsloth Studio
How to use thanhtantran/VieNeu-TTS with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for thanhtantran/VieNeu-TTS to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for thanhtantran/VieNeu-TTS to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for thanhtantran/VieNeu-TTS to start chatting
- Docker Model Runner
How to use thanhtantran/VieNeu-TTS with Docker Model Runner:
docker model run hf.co/thanhtantran/VieNeu-TTS:Q4_0
- Lemonade
How to use thanhtantran/VieNeu-TTS with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull thanhtantran/VieNeu-TTS:Q4_0
Run and chat with the model
lemonade run user.VieNeu-TTS-Q4_0
List all available models
lemonade list
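The llama-server commands above start a local OpenAI-compatible server. As a minimal sketch of talking to it from Python, the snippet below builds a request for the `/v1/completions` endpoint using only the standard library. It assumes the server is running on llama-server's default address, `http://localhost:8080`; the helper names `build_completion_request` and `send` are illustrative, not part of any library. The response contains whatever tokens the model generates, and turning those into audio is outside the scope of this sketch.

```python
import json
import urllib.request

# Assumption: llama-server is listening on its default host and port.
BASE_URL = "http://localhost:8080"


def build_completion_request(prompt, max_tokens=512, base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/completions."""
    payload = {"prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires a running server):
# print(send(build_completion_request("Once upon a time,")))
```

Keeping request construction separate from sending makes the payload easy to inspect without a running server.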
Overview
VieNeu-TTS is an advanced on-device Vietnamese Text-to-Speech (TTS) model with instant voice cloning.
Trained on roughly 1,000 hours of high-quality Vietnamese speech, this model is a significant upgrade over VieNeu-TTS-140h, with the following improvements:
- Enhanced pronunciation: More accurate and stable Vietnamese pronunciation
- Code-switching support: Seamless transitions between Vietnamese and English
- Better voice cloning: Higher fidelity and speaker consistency
- Real-time synthesis: 24 kHz waveform generation on CPU or GPU
VieNeu-TTS-1000h delivers production-ready speech synthesis fully offline.
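One way to check the real-time claim on your own hardware is to compute the real-time factor (RTF): synthesis time divided by the duration of the audio produced, where a value below 1.0 means faster than real time. The sketch below assumes you have already measured the synthesis time and know the number of output samples; the function names are illustrative and not part of the VieNeu-TTS API.

```python
SAMPLE_RATE_HZ = 24_000  # output sample rate stated in this model card


def audio_duration_seconds(num_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of a mono waveform with the given number of samples."""
    return num_samples / sample_rate_hz


def real_time_factor(synthesis_seconds, num_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Synthesis time divided by audio duration; below 1.0 is faster than real time."""
    duration = audio_duration_seconds(num_samples, sample_rate_hz)
    if duration <= 0:
        raise ValueError("the waveform must contain at least one sample")
    return synthesis_seconds / duration


# Example: 72,000 samples (3 s of audio) generated in 1.5 s.
rtf = real_time_factor(1.5, 72_000)
print(f"RTF = {rtf:.2f}")
```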
Author: Phạm Nguyễn Ngọc Bảo
Support This Project
Training high-quality TTS models requires significant GPU resources and compute time. If you find this model useful, please consider supporting its development.
Your support helps maintain and improve VieNeu-TTS! 🙏
Reference Voices
| File | Gender | Accent | Description |
|---|---|---|---|
| Bình (nam miền Bắc) | Male | North | Male voice, North accent |
| Tuyên (nam miền Bắc) | Male | North | Male voice, North accent |
| Nguyên (nam miền Nam) | Male | South | Male voice, South accent |
| Sơn (nam miền Nam) | Male | South | Male voice, South accent |
| Vĩnh (nam miền Nam) | Male | South | Male voice, South accent |
| Hương (nữ miền Bắc) | Female | North | Female voice, North accent |
| Ly (nữ miền Bắc) | Female | North | Female voice, North accent |
| Ngọc (nữ miền Bắc) | Female | North | Female voice, North accent |
| Đoan (nữ miền Nam) | Female | South | Female voice, South accent |
| Dung (nữ miền Nam) | Female | South | Female voice, South accent |
Model Architecture
| Component | Description |
|---|---|
| Backbone | Qwen 0.5B (chat-format LM) |
| Codec | NeuCodec (supports ONNX + quantization) |
| Output | 24 kHz waveform synthesis |
| Context Window | 2048 tokens, shared between text and speech |
| Watermark | Enabled |
| Training Data | VieNeuCodec-dataset + Emilia dataset pretraining |
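Because the 2048-token context window is shared between text and speech, longer prompts and longer reference clips leave less room for generated audio. The rough budget below illustrates the trade-off. It rests on two assumptions that are not stated in this model card: a codec rate of 50 speech tokens per second, and that the reference audio occupies context at the same rate. It also ignores any chat-template overhead, so treat the result as an upper bound.

```python
import math

CONTEXT_WINDOW_TOKENS = 2048  # from the architecture table
ASSUMED_CODEC_TOKENS_PER_SECOND = 50  # assumption, not stated in this model card


def max_output_seconds(
    text_tokens,
    reference_seconds,
    tokens_per_second=ASSUMED_CODEC_TOKENS_PER_SECOND,
    context_window=CONTEXT_WINDOW_TOKENS,
):
    """Rough upper bound on generated audio length for one request."""
    # Assumption: the reference clip is encoded at the same token rate.
    reference_tokens = math.ceil(reference_seconds * tokens_per_second)
    remaining = context_window - text_tokens - reference_tokens
    return max(remaining, 0) / tokens_per_second


# Example: 100 text tokens plus a 4-second reference clip.
print(f"{max_output_seconds(100, 4.0):.2f} s of audio at most")
```

If the budget comes out too small for a long passage, splitting the text into shorter chunks and synthesizing them separately is one option.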
Features
- High-quality Vietnamese speech
- Instant voice cloning (3–5 second reference audio)
- Fully offline
- Runs in real time or faster
- Multi-voice reference support
- Python API + CLI + Gradio
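Since voice cloning expects a 3–5 second reference clip, it can help to check a clip's length before using it. The sketch below uses Python's standard `wave` module and therefore assumes an uncompressed PCM WAV file; the function names and the strict 3–5 second bounds are illustrative choices, not checks performed by VieNeu-TTS itself.

```python
import wave

MIN_REFERENCE_SECONDS = 3.0  # lower bound from the feature list
MAX_REFERENCE_SECONDS = 5.0  # upper bound from the feature list


def wav_duration_seconds(path_or_file):
    """Return the duration of a PCM WAV file given a path or file object."""
    with wave.open(path_or_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_good_reference_length(path_or_file):
    """True if the clip length falls within the 3-5 second range."""
    duration = wav_duration_seconds(path_or_file)
    return MIN_REFERENCE_SECONDS <= duration <= MAX_REFERENCE_SECONDS


# Usage (hypothetical file name):
# print(is_good_reference_length("my_reference.wav"))
```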
Troubleshooting
| Issue | Cause | Solution |
|---|---|---|
| Missing libespeak | System dependency | Install eSpeak NG |
| GPU OOM | VRAM too small | Use CPU or quantized model |
| Poor voice match | Bad reference sample | Try a clearer reference clip |
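To diagnose the "Missing libespeak" case before running synthesis, you can check whether eSpeak NG is visible on the system. The sketch below looks for the executable on `PATH` and for the shared library via `ctypes.util.find_library`. The names it searches for (`espeak-ng` and the older `espeak`) are the usual ones but are an assumption about your platform, and `find_espeak` is an illustrative helper, not part of VieNeu-TTS.

```python
import ctypes.util
import shutil


def find_espeak():
    """Locate the eSpeak executable and shared library, if present.

    Returns a dict whose values are a path/name string or None when not found.
    """
    # Assumption: the binary and library use their usual names.
    binary = shutil.which("espeak-ng") or shutil.which("espeak")
    library = ctypes.util.find_library("espeak-ng") or ctypes.util.find_library("espeak")
    return {"binary": binary, "library": library}


status = find_espeak()
if status["library"] is None:
    print("eSpeak shared library not found; install eSpeak NG.")
else:
    print(f"Found eSpeak library: {status['library']}")
```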
License
Apache 2.0
Citation
@misc{vieneutts2025,
title = {VieNeu-TTS: Vietnamese Text-to-Speech with Instant Voice Cloning},
author = {Pham Nguyen Ngoc Bao},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/pnnbao-ump/VieNeu-TTS}}
}
Please also cite the base model:
@misc{neuttsair2025,
title = {NeuTTS Air: On-Device Speech Language Model with Instant Voice Cloning},
author = {Neuphonic},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/neuphonic/neutts-air}}
}
Model tree for thanhtantran/VieNeu-TTS
Base model
neuphonic/neutts-air