metadata
title: Pocket-TTS 100M
emoji: π
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: true
license: apache-2.0
short_description: High quality, efficient voice cloning. Just 100M parameters.
Pocket-TTS
A lightweight text-to-speech application built with kyutai/pocket-tts and Gradio.
Features
- Fast CPU inference: ~6x faster than real-time on modern CPUs
- Low latency: ~200ms to first audio chunk
- Streaming output: audio plays as it generates
- Voice cloning β Use custom voice samples (MP3, WAV, FLAC, etc.)
- Pre-computed embeddings β Voices work without voice cloning auth on HF Spaces
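To make the speed figure concrete, here is a rough back-of-the-envelope sketch. The helper name `estimated_generation_seconds` is hypothetical, and the default speedup is just the approximate ~6x figure quoted above, not a measured result for any particular CPU.

```python
def estimated_generation_seconds(audio_seconds: float, speedup: float = 6.0) -> float:
    """Estimate wall-clock synthesis time for a clip of the given length.

    `speedup` is how many times faster than real-time the model runs
    (assumed ~6x here, taken from the feature list above).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# 30 seconds of speech at roughly 6x real-time takes about 5 seconds to generate.
print(f"{estimated_generation_seconds(30.0):.1f} s")
```

Because output is streamed, playback can begin after the first chunk rather than after the whole estimate has elapsed.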
Quick Start
pip install -r requirements.txt
python app.py
Open http://127.0.0.1:7860 in your browser.
Adding Custom Voices
- Drop audio files (MP3, WAV, etc.) into the voices/ directory
- Restart the app
- Embeddings are created automatically on first boot (requires HF auth locally)
- Once created, embeddings are saved to embeddings/ and work without auth
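Before restarting, it can help to see which voices still need an embedding. The following is a minimal sketch, not the logic used in app.py: the function name `voices_missing_embeddings` and the suffix list are hypothetical, and the mapping from `my_voice.mp3` to `my_voice.safetensors` is assumed from the directory layout shown in this README.

```python
from pathlib import Path

# Assumed subset of accepted audio formats; the app may support more.
AUDIO_SUFFIXES = {".mp3", ".wav", ".flac"}


def voices_missing_embeddings(voices_dir, embeddings_dir):
    """Return audio file names in voices_dir with no matching .safetensors file."""
    voices_dir, embeddings_dir = Path(voices_dir), Path(embeddings_dir)
    if not voices_dir.is_dir():
        return []
    missing = []
    for audio in sorted(voices_dir.iterdir()):
        if audio.suffix.lower() not in AUDIO_SUFFIXES:
            continue  # skip non-audio files such as .gitkeep
        # Assumption: an embedding shares the audio file's stem.
        if not (embeddings_dir / f"{audio.stem}.safetensors").exists():
            missing.append(audio.name)
    return missing


print(voices_missing_embeddings("voices", "embeddings"))
```

Any names it prints are voices whose embeddings would have to be created on the next boot, which is the step that requires HF auth.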
Structure
Pocket-TTS/
├── app.py
├── requirements.txt
├── voices/              # Your custom voice audio files
│   └── my_voice.mp3
└── embeddings/          # Auto-generated (commit these for HF Spaces)
    └── my_voice.safetensors
HuggingFace Spaces Deployment
Option 1: Pre-commit embeddings (no auth needed on Space)
- Run the app locally first (with HF auth) to generate embeddings
- Commit both voices/ and embeddings/ directories
- The Space will use pre-computed embeddings
Option 2: Auto-create embeddings on Space (requires valid token)
- Accept terms at https://huggingface.co/kyutai/pocket-tts
- Add HF_TOKEN secret in Space settings (must be a valid token)
- Embeddings are created automatically on first boot
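The choice between the two options can be sketched as a small startup check. This is an illustrative assumption about how such a check might look, not code from app.py: `embedding_source` and its return labels are hypothetical, and a non-empty `HF_TOKEN` does not prove the token is valid or that the model terms were accepted.

```python
import os


def embedding_source(has_precomputed: bool, env=os.environ) -> str:
    """Pick how a voice embedding would be obtained at startup.

    Returns one of three hypothetical labels:
      "precomputed"        - Option 1: an embedding file is already committed
      "create-with-token"  - Option 2: HF_TOKEN is set, so creation can be attempted
      "unavailable"        - neither an embedding nor a token is present
    """
    if has_precomputed:
        return "precomputed"
    if env.get("HF_TOKEN", "").strip():
        return "create-with-token"
    return "unavailable"


print(embedding_source(has_precomputed=False))
```

On a Space, secrets added in the settings are exposed to the app as environment variables, which is why the sketch reads `HF_TOKEN` from the environment.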
Model Info
- Model: kyutai/pocket-tts
- Parameters: 100M
- Language: English only
- Sample rate: 24kHz
License
See the kyutai/pocket-tts model card for licensing information.