xtts-gguf / README.md

bnewton-genmedlabs

Initial GGUF implementation with C++ inference engine

4688879 verified 4 months ago

metadata

language:
  - en
  - es
  - fr
  - de
  - it
  - pt
  - pl
  - tr
  - ru
  - nl
  - cs
  - ar
  - zh
  - ja
  - ko
  - hu
  - hi
tags:
  - text-to-speech
  - tts
  - xtts
  - gguf
  - quantized
  - mobile
  - embedded
  - cpp
license: apache-2.0

XTTS v2 GGUF - Memory-Efficient TTS for Mobile

🚀 EXPERIMENTAL: XTTS v2 in GGUF format with a C++ inference engine, aimed at low memory usage on mobile devices.

⚠️ NOTE: This is a proof-of-concept. GGUF files require the included C++ inference engine to run.

🎯 Key Features

Memory-Mapped Loading: Only the parts of the model that are accessed are loaded into RAM
Multiple Quantizations: Q4 (290MB), Q8 (580MB), F16 (1.16GB)
Low RAM Usage: 90-350MB vs 1.5-2.5GB for PyTorch
Fast Loading: <1 second vs 15-20 seconds for PyTorch
React Native Ready: Full mobile integration
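
To illustrate the memory-mapped loading listed above, here is a minimal sketch of a read-only file mapping on POSIX systems (Linux, Android, iOS, macOS). `MappedFile` is a hypothetical helper written for this example. It is not part of `xtts_inference.h`, and the engine's actual loader may work differently. The operating system pages data in on first access, which is why resident memory can stay well below the file size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper around a read-only memory mapping (illustration only).
class MappedFile {
public:
    explicit MappedFile(const std::string& path) {
        int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("cannot open " + path);
        struct stat st{};
        if (::fstat(fd, &st) != 0 || st.st_size == 0) {
            ::close(fd);
            throw std::runtime_error("cannot stat (or empty file): " + path);
        }
        size_ = static_cast<std::size_t>(st.st_size);
        // Read-only private mapping: pages are faulted in lazily on first access.
        void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd);  // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
        data_ = static_cast<const std::uint8_t*>(p);
    }
    ~MappedFile() {
        if (data_) ::munmap(const_cast<std::uint8_t*>(data_), size_);
    }
    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    const std::uint8_t* data() const { return data_; }
    std::size_t size() const { return size_; }

    // Bounds-checked byte access (checked in debug builds only).
    std::uint8_t at(std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    const std::uint8_t* data_ = nullptr;
    std::size_t size_ = 0;
};
```

With a mapping like this, a loader can hand out tensor pointers as `data() + offset` instead of copying the whole file into heap memory.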

📊 Model Variants

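| Variant | Size | RAM (mmap) | Quality | Best For |
|---------|------|------------|---------|----------|
| `q4_k` | 290MB | ~90MB | Good | Low-end devices |
| `q8` | 580MB | ~180MB | Very Good | Mid-range devices |
| `f16` | 1.16GB | ~350MB | Excellent | High-end devices |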
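
The table can be turned into a simple selection rule: pick the highest-quality variant whose approximate RAM footprint fits the device's budget. The sketch below is one way to do that. `Variant`, `kVariants`, and `pick_variant` are hypothetical names for this example, and the numbers are copied from the table, so treat them as estimates, not guarantees.

```cpp
#include <cassert>
#include <cstdint>

// One row of the variant table (sizes in MB, values taken from the table above).
struct Variant {
    const char* file;       // file name in the repository's gguf/ directory
    std::uint32_t disk_mb;  // file size
    std::uint32_t ram_mb;   // approximate RAM use with mmap
};

constexpr Variant kVariants[] = {
    {"xtts_v2_q4_k.gguf", 290, 90},
    {"xtts_v2_q8.gguf", 580, 180},
    {"xtts_v2_f16.gguf", 1160, 350},
};

// Hypothetical helper: returns the largest variant that fits the RAM budget,
// or nullptr if none fits. In the table, quality rises with RAM use, so the
// largest fitting variant is also the highest-quality one.
inline const Variant* pick_variant(std::uint32_t ram_budget_mb) {
    const Variant* best = nullptr;
    for (const Variant& v : kVariants) {
        assert(v.ram_mb <= v.disk_mb);  // sanity check on the table data
        if (v.ram_mb <= ram_budget_mb && (!best || v.ram_mb > best->ram_mb)) {
            best = &v;
        }
    }
    return best;
}
```

For example, a budget of 200MB selects the `q8` file, while a budget below 90MB returns `nullptr`, in which case the caller has to fall back to some other strategy.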

🚀 Quick Start

React Native

import XTTS from '@genmedlabs/xtts-gguf';

// Initialize (downloads model automatically)
await XTTS.initialize();

// Generate speech
const audio = await XTTS.speak("Hello world!", {
  language: 'en'
});

C++

#include <memory>
#include "xtts_inference.h"

int main() {
    auto model = std::make_unique<xtts::XTTSInference>();
    // The second argument presumably enables memory-mapped loading.
    model->load_model("xtts_v2_q4_k.gguf", true);
    auto audio = model->generate("Hello world!", xtts::LANG_EN);
}

📦 Repository Structure

gguf/
├── xtts_v2_q4_k.gguf   # 4-bit quantized model
├── xtts_v2_q8.gguf     # 8-bit quantized model
├── xtts_v2_f16.gguf    # 16-bit half precision
└── manifest.json       # Model metadata

cpp/
├── xtts_inference.h    # C++ header
├── xtts_inference.cpp  # Implementation
└── CMakeLists.txt      # Build configuration

react-native/
├── XTTSModule.cpp      # Native module
└── XTTSModule.ts       # TypeScript interface
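
A GGUF file starts with a small fixed header: the 4-byte magic `GGUF`, a 32-bit version, then (from version 2 of the format) a 64-bit tensor count and a 64-bit metadata key/value count, all little-endian. A quick header check can catch a truncated or mismatched download before the model is handed to the engine. The following is a minimal sketch of such a check. `GgufHeader`, `read_le`, and `parse_gguf_header` are illustrative names and are not part of this repository's C++ engine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct GgufHeader {
    std::uint32_t version;
    std::uint64_t tensor_count;
    std::uint64_t metadata_kv_count;
};

// Reads a little-endian unsigned integer, independent of host byte order.
template <typename T>
T read_le(const std::uint8_t* p) {
    T v = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        v |= static_cast<T>(p[i]) << (8 * i);
    }
    return v;
}

// Returns true and fills `out` if `data` begins with a GGUF v2+ header.
// Precondition: `data` points to at least `size` readable bytes.
inline bool parse_gguf_header(const std::uint8_t* data, std::size_t size,
                              GgufHeader& out) {
    assert(data != nullptr);
    constexpr std::size_t kHeaderSize = 4 + 4 + 8 + 8;  // magic, version, two counts
    if (size < kHeaderSize) return false;
    if (std::memcmp(data, "GGUF", 4) != 0) return false;
    out.version = read_le<std::uint32_t>(data + 4);
    if (out.version < 2) return false;  // v1 used 32-bit counts; not handled here
    out.tensor_count = read_le<std::uint64_t>(data + 8);
    out.metadata_kv_count = read_le<std::uint64_t>(data + 16);
    return true;
}
```

The function only validates the first 24 bytes. Parsing the metadata entries and tensor descriptors that follow is left to the inference engine.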

🔧 Implementation Status

Completed ✅

GGUF format export
C++ engine structure
React Native bridge
Memory-mapped loading

In Progress 🚧

Full transformer implementation
Hardware acceleration
Voice cloning support

TODO 📋

Production optimizations
Comprehensive testing
WebAssembly support

📄 License

Apache 2.0

🙏 Credits

Based on XTTS v2 by Coqui AI. Uses the GGML library for efficient inference.

See the full documentation in the repository for detailed usage and build instructions.