|
|
--- |
|
|
language: |
|
|
- en |
|
|
- es |
|
|
- fr |
|
|
- de |
|
|
- it |
|
|
- pt |
|
|
- pl |
|
|
- tr |
|
|
- ru |
|
|
- nl |
|
|
- cs |
|
|
- ar |
|
|
- zh |
|
|
- ja |
|
|
- ko |
|
|
- hu |
|
|
- hi |
|
|
tags: |
|
|
- text-to-speech |
|
|
- tts |
|
|
- xtts |
|
|
- gguf |
|
|
- quantized |
|
|
- mobile |
|
|
- embedded |
|
|
- cpp |
|
|
license: apache-2.0 |
|
|
--- |
|
|
|
|
|
# XTTS v2 GGUF - Memory-Efficient TTS for Mobile |
|
|
|
|
|
🚀 **EXPERIMENTAL**: XTTS v2 in GGUF format with a C++ inference engine for ultra-low memory usage on mobile devices.
|
|
|
|
|
> ⚠️ **NOTE**: This is a proof of concept. The GGUF files require the included C++ inference engine to run.
|
|
|
|
|
## 🎯 Key Features
|
|
|
|
|
- **Memory-Mapped Loading**: only the weights that are actually read are paged into RAM (see the sketch after this list)
|
|
- **Multiple Quantizations**: Q4 (290MB), Q8 (580MB), F16 (1.16GB) |
|
|
- **Low RAM Usage**: 90-350MB vs 1.5-2.5GB for PyTorch |
|
|
- **Fast Loading**: <1 second vs 15-20 seconds for the PyTorch checkpoint
|
|
- **React Native Ready**: Full mobile integration |
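
Memory-mapped loading is what keeps resident memory far below the file sizes listed below. As a minimal POSIX sketch (not the engine's actual loader; the file name is reused from this repo and the parsing step is only hinted at), mapping the GGUF file read-only means the kernel pages in only the tensors that are actually touched:

```cpp
// Minimal sketch of memory-mapped weight access with POSIX mmap.
// Illustrative only: a real loader would parse the GGUF header and keep
// pointers into the mapping instead of copying tensor data.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>

int main() {
    const char* path = "xtts_v2_q4_k.gguf";
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);

    // Map the whole file read-only; pages are loaded lazily, so resident
    // memory grows only with the tensors that are actually read.
    void* base = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    printf("mapped %lld bytes\n", (long long)st.st_size);

    munmap(base, st.st_size);
    close(fd);
    return 0;
}
```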
|
|
|
|
|
## 📊 Model Variants
|
|
|
|
|
| Variant | Size | RAM (mmap) | Quality | Best For | |
|
|
|---------|------|------------|---------|----------| |
|
|
| `q4_k` | 290MB | ~90MB | Good | Low-end devices | |
|
|
| `q8` | 580MB | ~180MB | Very Good | Mid-range devices | |
|
|
| `f16` | 1.16GB | ~350MB | Excellent | High-end devices | |
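
If you ship more than one variant, a simple way to pick one at runtime is to compare the approximate resident RAM figures above against the memory the device can spare. The helper below is hypothetical (not part of the published API); the thresholds simply mirror the table:

```cpp
// Hypothetical variant picker based on available RAM; thresholds mirror
// the "RAM (mmap)" column above.
#include <cstdint>
#include <string>

std::string pick_variant(uint64_t free_ram_bytes) {
    const uint64_t MiB = 1024ull * 1024ull;
    if (free_ram_bytes >= 350 * MiB) return "xtts_v2_f16.gguf";  // ~350MB resident
    if (free_ram_bytes >= 180 * MiB) return "xtts_v2_q8.gguf";   // ~180MB resident
    return "xtts_v2_q4_k.gguf";                                  // ~90MB resident
}
```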
|
|
|
|
|
## 🚀 Quick Start
|
|
|
|
|
### React Native |
|
|
|
|
|
```javascript
import XTTS from '@genmedlabs/xtts-gguf';

// Initialize (downloads model automatically)
await XTTS.initialize();

// Generate speech
const audio = await XTTS.speak("Hello world!", {
  language: 'en'
});
```
|
|
|
|
|
### C++ |
|
|
|
|
|
```cpp
#include "xtts_inference.h"

#include <memory>

// Create the inference engine and load the 4-bit quantized model
// (the boolean flag presumably enables memory-mapped loading)
auto model = std::make_unique<xtts::XTTSInference>();
model->load_model("xtts_v2_q4_k.gguf", true);

// Synthesize English speech from text
auto audio = model->generate("Hello world!", xtts::LANG_EN);
```
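
What you do with the generated audio depends on the platform. As a minimal sketch, assuming `generate()` returns 32-bit float PCM samples in a `std::vector<float>` at 24 kHz (both are assumptions; check `xtts_inference.h` for the actual return type and sample rate), the buffer can be written to a mono 16-bit WAV file like this:

```cpp
// Sketch: dump float PCM samples to a mono 16-bit WAV file.
// Assumes a little-endian host (WAV is little-endian on disk).
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

void write_wav(const std::string& path, const std::vector<float>& samples,
               uint32_t sample_rate = 24000) {
    std::ofstream out(path, std::ios::binary);
    auto put32 = [&](uint32_t v) { out.write(reinterpret_cast<const char*>(&v), 4); };
    auto put16 = [&](uint16_t v) { out.write(reinterpret_cast<const char*>(&v), 2); };

    const uint32_t data_bytes = static_cast<uint32_t>(samples.size() * sizeof(int16_t));
    out.write("RIFF", 4); put32(36 + data_bytes); out.write("WAVE", 4);
    out.write("fmt ", 4); put32(16); put16(1); put16(1);          // PCM, mono
    put32(sample_rate); put32(sample_rate * 2); put16(2); put16(16);
    out.write("data", 4); put32(data_bytes);

    // Clamp and convert float [-1, 1] samples to 16-bit PCM.
    for (float s : samples) {
        if (s > 1.f) s = 1.f;
        if (s < -1.f) s = -1.f;
        const int16_t v = static_cast<int16_t>(s * 32767.f);
        out.write(reinterpret_cast<const char*>(&v), 2);
    }
}
```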
|
|
|
|
|
## 📦 Repository Structure
|
|
|
|
|
```
gguf/
├── xtts_v2_q4_k.gguf     # 4-bit quantized model
├── xtts_v2_q8.gguf       # 8-bit quantized model
├── xtts_v2_f16.gguf      # 16-bit half precision
└── manifest.json         # Model metadata

cpp/
├── xtts_inference.h      # C++ header
├── xtts_inference.cpp    # Implementation
└── CMakeLists.txt        # Build configuration

react-native/
├── XTTSModule.cpp        # Native module
└── XTTSModule.ts         # TypeScript interface
```
|
|
|
|
|
## 🔧 Implementation Status
|
|
|
|
|
### Completed ✅
|
|
|
- GGUF format export |
|
|
- C++ engine structure |
|
|
- React Native bridge |
|
|
- Memory-mapped loading |
|
|
|
|
|
### In Progress 🚧
|
|
- Full transformer implementation |
|
|
- Hardware acceleration |
|
|
- Voice cloning support |
|
|
|
|
|
### TODO 📋
|
|
- Production optimizations |
|
|
- Comprehensive testing |
|
|
- WebAssembly support |
|
|
|
|
|
## 📄 License
|
|
|
|
|
Apache 2.0 |
|
|
|
|
|
## 🙏 Credits
|
|
|
|
|
Based on XTTS v2 by Coqui AI. Uses the GGML library for efficient inference.
|
|
|
|
|
--- |
|
|
|
|
|
**See the full documentation in the repository for detailed usage and build instructions.**
|
|
|