metadata
title: Pocket-TTS 100M
emoji: 🔊
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: true
license: apache-2.0
short_description: High quality, efficient voice cloning. Just 100M parameters.

Pocket-TTS

A lightweight text-to-speech application built with kyutai/pocket-tts and Gradio.

Features

  • Fast CPU inference: ~6x faster than real-time on modern CPUs
  • Low latency: ~200 ms to first audio chunk
  • Streaming output: audio plays as it is generated
  • Voice cloning: use custom voice samples (MP3, WAV, FLAC, etc.)
  • Pre-computed embeddings: voices work without voice cloning auth on HF Spaces
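
To make the speed figure concrete, here is a rough back-of-the-envelope sketch. It assumes the "~6x faster than real-time" figure holds for your CPU, which will vary by hardware; the function name is hypothetical and not part of this app.

```python
def estimated_synthesis_seconds(audio_seconds: float, speedup: float = 6.0) -> float:
    """Estimate wall-clock generation time for a clip of the given length.

    `speedup` is the real-time factor quoted in the feature list (~6x);
    treat it as an approximation, not a guarantee.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# A 30-second clip at ~6x real-time takes roughly 5 seconds to generate.
print(estimated_synthesis_seconds(30.0))
```

Because output is streamed, playback can begin after the first chunk rather than after the whole clip has been generated.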

Quick Start

pip install -r requirements.txt
python app.py

Open http://127.0.0.1:7860 in your browser.

Adding Custom Voices

  1. Drop audio files (MP3, WAV, etc.) into the voices/ directory
  2. Restart the app
  3. Embeddings are created automatically on first boot (requires HF auth locally)
  4. Once created, embeddings are saved to embeddings/ and work without auth
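
The steps above imply a one-to-one mapping between files in voices/ and files in embeddings/. The following is a minimal sketch of how one might list voices that still need an embedding. It is not taken from app.py: the helper name and the set of extensions are assumptions, and the .safetensors naming follows the layout shown under Structure.

```python
from pathlib import Path

# Assumed set of audio extensions; the README names MP3, WAV, and FLAC explicitly.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac"}


def voices_missing_embeddings(voices_dir: Path, embeddings_dir: Path) -> list[str]:
    """Return voice names that have an audio file but no matching .safetensors file."""
    if not voices_dir.is_dir():
        return []
    missing = []
    for audio_path in sorted(voices_dir.iterdir()):
        # Skip subdirectories and anything that is not a recognized audio file.
        if not audio_path.is_file() or audio_path.suffix.lower() not in AUDIO_EXTENSIONS:
            continue
        embedding_path = embeddings_dir / f"{audio_path.stem}.safetensors"
        if not embedding_path.exists():
            missing.append(audio_path.stem)
    return missing


# Hypothetical usage from the repository root.
print(voices_missing_embeddings(Path("voices"), Path("embeddings")))
```

An empty list means every voice already has a saved embedding, so the app should start without needing HF auth.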

Structure

Pocket-TTS/
├── app.py
├── requirements.txt
├── voices/           # Your custom voice audio files
│   └── my_voice.mp3
└── embeddings/       # Auto-generated (commit these for HF Spaces)
    └── my_voice.safetensors

Hugging Face Spaces Deployment

Option 1: Pre-commit embeddings (no auth needed on Space)

  1. Run the app locally first (with HF auth) to generate embeddings
  2. Commit both voices/ and embeddings/ directories
  3. The Space will use pre-computed embeddings

Option 2: Auto-create embeddings on Space (requires valid token)

  1. Accept terms at https://huggingface.co/kyutai/pocket-tts
  2. Add HF_TOKEN secret in Space settings (must be a valid token)
  3. Embeddings are created automatically on first boot
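
The two options can be summarized as a simple startup decision. The sketch below is an illustration only, not the logic in app.py: the function name and return labels are hypothetical, and it only checks that HF_TOKEN is set, not that the token is valid or that the model terms have been accepted.

```python
import os
from collections.abc import Mapping
from pathlib import Path
from typing import Optional


def embedding_strategy(embeddings_dir: Path, env: Optional[Mapping[str, str]] = None) -> str:
    """Return 'precomputed', 'auto-create', or 'unavailable' for a given setup."""
    env = os.environ if env is None else env
    has_embeddings = embeddings_dir.is_dir() and any(embeddings_dir.glob("*.safetensors"))
    if has_embeddings:
        return "precomputed"  # Option 1: committed embeddings, no auth needed
    if env.get("HF_TOKEN", "").strip():
        return "auto-create"  # Option 2: a token is present (validity not checked here)
    return "unavailable"      # Neither embeddings nor a token were found


print(embedding_strategy(Path("embeddings")))
```

This simplified check treats any committed embedding as sufficient. A Space with some voices committed and others not would still need a token for the missing ones.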

Model Info

  • Model: kyutai/pocket-tts
  • Parameters: 100M
  • Language: English only
  • Sample rate: 24 kHz
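
As a small worked example of what the sample rate means in practice, the following sketch converts between sample counts and durations. The helper names are hypothetical, and it assumes a single-channel stream at the rate listed above.

```python
SAMPLE_RATE_HZ = 24_000  # from the Model Info list above


def duration_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE_HZ) -> float:
    """Length in seconds of a mono clip with the given number of samples."""
    return num_samples / sample_rate


def samples_for_duration(seconds: float, sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """Number of samples needed to hold a mono clip of the given length."""
    return round(seconds * sample_rate)


print(duration_seconds(48_000))   # two seconds of audio
print(samples_for_duration(0.2))  # samples in a 200 ms chunk
```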

License

See the kyutai/pocket-tts model card for licensing information.