---
title: Pocket-TTS 100M
emoji: π
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: true
license: apache-2.0
short_description: High quality, efficient voice cloning. Just 100M parameters.
---
# Pocket-TTS
A lightweight text-to-speech application built with [kyutai/pocket-tts](https://huggingface.co/kyutai/pocket-tts) and Gradio.
## Features
- **Fast CPU inference**: ~6x faster than real-time on modern CPUs
- **Low latency**: ~200 ms to first audio chunk
- **Streaming output**: audio plays as it generates
- **Voice cloning**: use custom voice samples (MP3, WAV, FLAC, etc.)
- **Pre-computed embeddings**: voices work on HF Spaces without voice-cloning auth
## Quick Start
```bash
pip install -r requirements.txt
python app.py
```
Open http://127.0.0.1:7860 in your browser.
## Adding Custom Voices
1. Drop audio files (MP3, WAV, etc.) into the `voices/` directory
2. Restart the app
3. Embeddings are created automatically on first boot (requires HF auth locally)
4. Once created, embeddings are saved to `embeddings/` and work without auth
### Structure
```
Pocket-TTS/
├── app.py
├── requirements.txt
├── voices/ # Your custom voice audio files
│   └── my_voice.mp3
└── embeddings/ # Auto-generated (commit these for HF Spaces)
    └── my_voice.safetensors
```
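The layout above implies a one-to-one mapping between a file in `voices/` and a `.safetensors` file of the same stem in `embeddings/`. The sketch below shows one way to find voices that still need an embedding. It is an illustration only, not the code in `app.py`: the function name `voices_missing_embeddings` and the extension set are assumptions, and only the directory names and the `.safetensors` suffix come from this README.

```python
from pathlib import Path

# Assumed extension set; this README only names MP3, WAV, and FLAC explicitly.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac"}


def voices_missing_embeddings(voices_dir, embeddings_dir):
    """Return voice files that have no matching <stem>.safetensors yet.

    Hypothetical helper: the real app may detect missing embeddings differently.
    """
    voices_dir = Path(voices_dir)
    embeddings_dir = Path(embeddings_dir)
    if not voices_dir.is_dir():
        return []
    missing = []
    for audio in sorted(voices_dir.iterdir()):
        # Skip subdirectories and anything that is not a known audio type.
        if not audio.is_file() or audio.suffix.lower() not in AUDIO_EXTENSIONS:
            continue
        if not (embeddings_dir / f"{audio.stem}.safetensors").exists():
            missing.append(audio)
    return missing


if __name__ == "__main__":
    for path in voices_missing_embeddings("voices", "embeddings"):
        print(f"needs embedding: {path.name}")
```

Running this from the project root before committing is a quick way to confirm that every voice has an embedding to ship with the Space.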
## HuggingFace Spaces Deployment
**Option 1: Pre-commit embeddings (no auth needed on Space)**
1. Run the app locally first (with HF auth) to generate embeddings
2. Commit both `voices/` and `embeddings/` directories
3. The Space will use pre-computed embeddings
**Option 2: Auto-create embeddings on Space (requires valid token)**
1. Accept terms at https://huggingface.co/kyutai/pocket-tts
2. Add `HF_TOKEN` secret in Space settings (must be a valid token)
3. Embeddings are created automatically on first boot
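The two options differ only in whether embeddings must be created when the Space boots. A minimal sketch of that decision follows, assuming a hypothetical helper `startup_plan` and made-up return labels; the actual logic in `app.py` may be organised differently. Only the `HF_TOKEN` variable name comes from the steps above.

```python
import os


def startup_plan(missing_voices, env=None):
    """Decide what to do at boot, given the voices that lack embeddings.

    Hypothetical helper. Returns one of:
      "use_precomputed"   - nothing is missing (Option 1)
      "create_embeddings" - something is missing and a token is set (Option 2)
      "skip_missing"      - something is missing and no token is set
    """
    env = os.environ if env is None else env
    if not missing_voices:
        return "use_precomputed"
    # A non-empty token is only a precondition; it may still be invalid or
    # lack access if the model terms have not been accepted.
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return "create_embeddings"
    return "skip_missing"


if __name__ == "__main__":
    print(startup_plan(["my_voice.mp3"]))
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.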
## Model Info
- **Model**: [kyutai/pocket-tts](https://huggingface.co/kyutai/pocket-tts)
- **Parameters**: 100M
- **Language**: English only
- **Sample rate**: 24kHz
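The sample rate is what connects a raw sample count to playback time, and so to the "faster than real-time" figure under Features. The sketch below shows the arithmetic; the function names are hypothetical and nothing here calls the model.

```python
SAMPLE_RATE_HZ = 24_000  # from the Model Info list above


def audio_seconds(num_samples, sample_rate=SAMPLE_RATE_HZ):
    """Playback duration of a mono clip with the given number of samples."""
    return num_samples / sample_rate


def speedup_over_real_time(num_samples, wall_seconds, sample_rate=SAMPLE_RATE_HZ):
    """How many seconds of audio were produced per second of compute."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds(num_samples, sample_rate) / wall_seconds


if __name__ == "__main__":
    # 240,000 samples at 24 kHz are 10 s of audio; generated in 2 s is 5x.
    print(speedup_over_real_time(240_000, 2.0))  # prints 5.0
```

To measure this on your own machine, time a generation call and pass the number of output samples together with the elapsed wall-clock seconds.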
## License
See the [kyutai/pocket-tts](https://huggingface.co/kyutai/pocket-tts) model card for licensing information.