---
title: Pocket-TTS 100M
emoji: 🔊
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: true
license: apache-2.0
short_description: High quality, efficient voice cloning. Just 100M parameters.
---

# Pocket-TTS

A lightweight text-to-speech application built with [kyutai/pocket-tts](https://huggingface.co/kyutai/pocket-tts) and Gradio.

## Features

- **Fast CPU inference**: ~6x faster than real-time on modern CPUs
- **Low latency**: ~200ms to first audio chunk
- **Streaming output**: Audio plays as it generates
- **Voice cloning**: Use custom voice samples (MP3, WAV, FLAC, etc.)
- **Pre-computed embeddings**: Voices work without voice cloning auth on HF Spaces

## Quick Start

```bash
pip install -r requirements.txt
python app.py
```

Open http://127.0.0.1:7860 in your browser.

## Adding Custom Voices

1. Drop audio files (MP3, WAV, etc.) into the `voices/` directory
2. Restart the app
3. Embeddings are created automatically on first boot (requires HF auth locally)
4. Once created, embeddings are saved to `embeddings/` and work without auth

### Structure

```
Pocket-TTS/
├── app.py
├── requirements.txt
├── voices/              # Your custom voice audio files
│   └── my_voice.mp3
└── embeddings/          # Auto-generated (commit these for HF Spaces)
    └── my_voice.safetensors
```

## HuggingFace Spaces Deployment

**Option 1: Pre-commit embeddings (no auth needed on Space)**

1. Run the app locally first (with HF auth) to generate embeddings
2. Commit both `voices/` and `embeddings/` directories
3. The Space will use pre-computed embeddings

**Option 2: Auto-create embeddings on Space (requires valid token)**

1. Accept terms at https://huggingface.co/kyutai/pocket-tts
2. Add `HF_TOKEN` secret in Space settings (must be a valid token)
3. Embeddings are created automatically on first boot

## Model Info

- **Model**: [kyutai/pocket-tts](https://huggingface.co/kyutai/pocket-tts)
- **Parameters**: 100M
- **Language**: English only
- **Sample rate**: 24kHz

## License

See the [kyutai/pocket-tts](https://huggingface.co/kyutai/pocket-tts) model card for licensing information.
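
## Example: Finding Voices Without Embeddings

Before committing for a Space without auth (Option 1), it can help to check that every file in `voices/` already has a matching file in `embeddings/`. The sketch below assumes only the layout shown under Structure, where `voices/my_voice.mp3` corresponds to `embeddings/my_voice.safetensors`. The helper name `voices_missing_embeddings` and the extension set are illustrative assumptions and are not taken from `app.py`.

```python
from pathlib import Path

# Audio extensions mentioned in this README; the real app may accept more (assumption).
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac"}


def voices_missing_embeddings(voices_dir: Path, embeddings_dir: Path) -> list[str]:
    """Return names of voices that have an audio file but no saved embedding.

    Hypothetical helper: matches files by stem, e.g. my_voice.mp3 <-> my_voice.safetensors.
    """
    if not voices_dir.is_dir():
        return []
    voice_names = {
        p.stem
        for p in voices_dir.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    }
    embedded = set()
    if embeddings_dir.is_dir():
        embedded = {p.stem for p in embeddings_dir.glob("*.safetensors")}
    return sorted(voice_names - embedded)


if __name__ == "__main__":
    # Run from the repository root; prints voices that still need an embedding.
    for name in voices_missing_embeddings(Path("voices"), Path("embeddings")):
        print(f"missing embedding for voice: {name}")
```

If the script prints nothing, every voice has a pre-computed embedding and both directories can be committed as described in Option 1.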