language:
  - en
  - es
  - fr
  - de
  - it
  - pt
  - pl
  - tr
  - ru
  - nl
  - cs
  - ar
  - zh
  - ja
  - hu
  - ko
  - hi
pipeline_tag: text-to-speech
tags:
  - text-to-speech
  - tts
  - ggml
  - vulkan
  - c++
  - on-device
license: other
license_name: coqui-public-model-license
license_link: https://coqui.ai/cpml
base_model: coqui/XTTS-v2

ATTS1HG1: High-Performance GGML Implementation of XTTS-v2

ATTS1HG1 is a high-speed, native C++ implementation of the Coqui XTTS-v2 model, utilizing the GGML tensor library. It features a custom integrated HiFiGAN vocoder optimized for Vulkan and CPU inference.

Source Code & GUI Base Model Backend
GitHub: ATTS1HG1 (Coqui XTTS-v2 on GGML / Vulkan), https://github.com/abbndz/ATTS1HG1

🚀 Key Features

  • Blazing Fast: Generates audio for a short sentence in under 0.5s on a consumer GPU (RTX 3090, Vulkan) and in about 1.0s on CPU.
  • Vulkan Support: Fully optimized HiFiGAN vocoder running on Vulkan (compatible with NVIDIA, AMD, and Intel iGPUs).
  • Lightweight: Native C++ application, no heavy Python dependencies (PyTorch/TensorFlow not required at runtime).
  • Multi-Language: Supports 17 languages.
  • Voices: Supports 58 speakers (similar to XTTS).

🌍 Supported Languages

The model supports the following 17 languages:

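| Code | Language | Native Name |
|------|----------|-------------|
| en | English | English |
| es | Spanish | Español |
| fr | French | Français |
| de | German | Deutsch |
| it | Italian | Italiano |
| pt | Portuguese | Português |
| pl | Polish | Polski |
| tr | Turkish | Türkçe |
| ru | Russian | Русский |
| nl | Dutch | Nederlands |
| cs | Czech | Čeština |
| ar | Arabic | العربية |
| zh | Chinese | 中文 |
| ja | Japanese | 日本語 |
| hu | Hungarian | Magyar |
| ko | Korean | 한국어 |
| hi | Hindi | हिन्दी |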
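
A caller may want to reject unsupported language codes before handing text to the engine. The following is a minimal sketch of such a check, assuming C++17; `kSupportedLanguages` and `is_supported_language` are hypothetical names chosen for illustration and are not part of the ATTS1HG1 code base.

```cpp
#include <algorithm>
#include <array>
#include <string_view>

// The 17 language codes listed in this model card.
constexpr std::array<std::string_view, 17> kSupportedLanguages = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh", "ja", "hu", "ko", "hi"};

// Hypothetical helper (not an ATTS1HG1 API): returns true when `code`
// exactly matches one of the supported codes. The match is case-sensitive.
inline bool is_supported_language(std::string_view code) {
    return std::find(kSupportedLanguages.begin(), kSupportedLanguages.end(),
                     code) != kSupportedLanguages.end();
}
```

With this helper, `is_supported_language("fr")` is true, while `is_supported_language("sv")` and `is_supported_language("FR")` are false, since only the lowercase codes above are accepted.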

⚡ Performance

Benchmarks report the total latency to synthesize a short sentence ("Bonjour le monde") with the C++ client:

| Device | Backend | Latency (Total) | Note |
|--------|---------|-----------------|------|
| NVIDIA RTX 3090 | Vulkan | ~0.47s | 🚀 Recommended |
| Intel iGPU | Vulkan | ~1.40s | Good for laptops |
| CPU (Ryzen/Intel) | CPU (AVX2) | ~1.02s | Solid fallback |
| NVIDIA RTX 3090 | CUDA | ~1.45s | Slower on HiFiGAN due to kernel overhead |

Note: On the RTX 3090, total latency is ~0.47s with Vulkan versus ~1.45s with CUDA. The gap comes from the HiFiGAN part of the pipeline, where the Vulkan backend benefits from optimized command buffers and lower kernel launch overhead for small convolutions.

🛠️ Usage

This repository contains the converted .bin / .gguf weights required by the ATTS1HG1 software.

  1. Download the model files from this repository.
  2. Clone and compile the software from GitHub:
    git clone https://github.com/abbndz/ATTS1HG1
    
  3. Load the model in the GUI or CLI and select Vulkan for best performance.
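
Vulkan is the recommended choice, but the benchmark figures in this card show the Intel iGPU Vulkan run (~1.40s) behind the CPU run (~1.02s), so the fastest backend can depend on the machine. As a rough sketch, one could measure each available backend once and keep the quickest. `BackendTiming` and `fastest_backend` are hypothetical names for illustration; the software's real backend selection is done through its GUI or CLI.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// One measured result: backend name and its total latency in seconds.
struct BackendTiming {
    std::string backend;  // e.g. "Vulkan", "CPU", "CUDA"
    double latency_s;
};

// Hypothetical helper: returns the backend with the lowest measured latency,
// or std::nullopt when no measurements were supplied.
inline std::optional<std::string> fastest_backend(
    const std::vector<BackendTiming>& timings) {
    if (timings.empty()) return std::nullopt;
    const auto best = std::min_element(
        timings.begin(), timings.end(),
        [](const BackendTiming& a, const BackendTiming& b) {
            return a.latency_s < b.latency_s;
        });
    return best->backend;
}
```

Fed the RTX 3090 figures from this card (Vulkan 0.47, CUDA 1.45) together with the CPU figure (1.02), the helper picks Vulkan; fed only the Intel iGPU Vulkan figure (1.40) and the CPU figure, it picks CPU.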

📜 License

This project uses the weights from Coqui XTTS-v2, which is licensed under the Coqui Public Model License (CPML).

  • Non-commercial use: You can use this model for personal, educational, and non-commercial projects.
  • Commercial use: Requires a license from Coqui (check their repository for details).

The C++ code (inference engine) is available under the MIT License (see GitHub).


Credits: Based on the excellent work by Coqui.ai and the GGML library by ggerganov.