language:
  - en
  - es
  - fr
  - de
  - it
  - pt
  - pl
  - tr
  - ru
  - nl
  - cs
  - ar
  - zh
  - ja
  - hu
  - ko
  - hi
pipeline_tag: text-to-speech
tags:
  - text-to-speech
  - tts
  - ggml
  - vulkan
  - c++
  - on-device
license: other
license_name: coqui-public-model-license
license_link: https://coqui.ai/cpml
base_model: coqui/XTTS-v2

ATTS1HG1: High-Performance GGML Implementation of XTTS-v2

ATTS1HG1 is a high-speed, native C++ implementation of the Coqui XTTS-v2 model, utilizing the GGML tensor library. It features a custom integrated HiFiGAN vocoder optimized for Vulkan and CPU inference.

Source Code & GUI	Base Model	Backend
GitHub: ATTS1HG1	Coqui XTTS-v2	GGML / Vulkan

🚀 Key Features

Blazing Fast: Generates a short phrase in about 0.47s on a consumer GPU (RTX 3090, Vulkan) and about 1.02s on CPU (AVX2).
Vulkan Support: Fully optimized HiFiGAN vocoder running on Vulkan (compatible with NVIDIA, AMD, Intel iGPUs).
Lightweight: Native C++ application, no heavy Python dependencies (PyTorch/TensorFlow not required at runtime).
Multi-Language: Supports 17 languages.
Voices: Supports 58 speakers (similar to XTTS).

🌍 Supported Languages

The model supports the following 17 languages:

Code	Language	Native Name
en	English	English
es	Spanish	Español
fr	French	Français
de	German	Deutsch
it	Italian	Italiano
pt	Portuguese	Português
pl	Polish	Polski
tr	Turkish	Türkçe
ru	Russian	Русский
nl	Dutch	Nederlands
cs	Czech	Čeština
ar	Arabic	العربية
zh	Chinese	中文
ja	Japanese	日本語
hu	Hungarian	Magyar
ko	Korean	한국어
hi	Hindi	हिन्दी
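
If you call the engine from a script, it can help to validate the language code before starting synthesis. The sketch below is a minimal, hypothetical helper built only from the table above. The names `SUPPORTED_LANGUAGES` and `normalize_language` are illustrative and are not part of ATTS1HG1, and the exact strings the CLI or GUI accepts are an assumption that should be checked against the software.

```python
# Language codes copied from the table above (17 entries).
SUPPORTED_LANGUAGES = {
    "en": "English", "es": "Spanish", "fr": "French", "de": "German",
    "it": "Italian", "pt": "Portuguese", "pl": "Polish", "tr": "Turkish",
    "ru": "Russian", "nl": "Dutch", "cs": "Czech", "ar": "Arabic",
    "zh": "Chinese", "ja": "Japanese", "hu": "Hungarian", "ko": "Korean",
    "hi": "Hindi",
}


def normalize_language(code: str) -> str:
    """Return the lowercase code if it is in the table, else raise ValueError."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        known = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"Unsupported language code {code!r}; expected one of: {known}")
    return normalized
```

Rejecting an unknown code early gives a clearer error than letting the engine fail later in the pipeline.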

⚡ Performance

Benchmarks measure total latency for synthesizing a short phrase ("Bonjour le monde") with the C++ client:

Device	Backend	Latency (Total)	Note
NVIDIA RTX 3090	Vulkan	~0.47s	🚀 Recommended
Intel iGPU	Vulkan	~1.40s	Good for laptops
CPU (Ryzen/Intel)	CPU (AVX2)	~1.02s	Solid fallback
NVIDIA RTX 3090	CUDA	~1.45s	Slower on HiFiGAN due to kernel overhead

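Note: In these measurements the Vulkan backend is about three times faster than CUDA on the same RTX 3090 (~0.47s vs ~1.45s total). The gap is attributed to the HiFiGAN part of the pipeline, where Vulkan benefits from optimized command buffers and reduced kernel launch overhead for small convolutions.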
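
To make the comparison concrete, the relative latencies can be computed directly from the table. This is a small arithmetic sketch using the approximate figures above. The names `LATENCY_S` and `speedup` are illustrative, and the ratios describe total latency for the whole pipeline, not the HiFiGAN stage in isolation.

```python
# Total latencies in seconds, copied from the benchmark table (approximate values).
LATENCY_S = {
    ("NVIDIA RTX 3090", "Vulkan"): 0.47,
    ("Intel iGPU", "Vulkan"): 1.40,
    ("CPU (Ryzen/Intel)", "CPU (AVX2)"): 1.02,
    ("NVIDIA RTX 3090", "CUDA"): 1.45,
}

BEST = ("NVIDIA RTX 3090", "Vulkan")


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline` (ratio of latencies)."""
    return LATENCY_S[baseline] / LATENCY_S[candidate]


# Print each configuration, fastest first, relative to the recommended setup.
for config, latency in sorted(LATENCY_S.items(), key=lambda item: item[1]):
    device, backend = config
    ratio = speedup(config, BEST)
    print(f"{device:<18} {backend:<11} {latency:.2f}s  {ratio:.2f}x the RTX 3090 Vulkan latency")
```

With the figures in the table, CUDA on the RTX 3090 takes roughly 3.09 times as long as Vulkan on the same card, and the CPU path roughly 2.17 times as long.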

🛠️ Usage

This repository contains the converted .bin / .gguf weights required by the ATTS1HG1 software.

Download the model files from this repository.

Clone and compile the software from GitHub:

git clone https://github.com/abbndz/ATTS1HG1

Load the model in the GUI or CLI and select Vulkan for best performance.
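
Before loading the model, you may want to confirm that the download actually contains weight files. The following is a minimal sketch that assumes the files were saved into a local directory of your choice. The helper name `find_weight_files` is hypothetical, and the only detail taken from this README is the `.bin` / `.gguf` extensions.

```python
from pathlib import Path

# Extensions of the converted weights mentioned in this README.
WEIGHT_SUFFIXES = {".bin", ".gguf"}


def find_weight_files(model_dir):
    """Return all .bin / .gguf files under model_dir, sorted by path."""
    root = Path(model_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Model directory not found: {root}")
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in WEIGHT_SUFFIXES
    )
```

Calling `find_weight_files` on your download directory and getting an empty list back usually means the download is incomplete or was saved somewhere else.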

📜 License

This project uses the weights from Coqui XTTS-v2, which is licensed under the Coqui Public Model License (CPML).

Non-commercial use: You can use this model for personal, educational, and non-commercial projects.
Commercial use: Requires a license from Coqui (check their repository for details).

The C++ code (inference engine) is available under the MIT License (see GitHub).

Credits: Based on the excellent work by Coqui.ai and the GGML library by ggerganov.