---
title: Voice Latency Lab
emoji: 🎙️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Voice chat app with STT, Qwen 1.5B, and Kokoro.
---

# Voice Latency Lab

This is the Hugging Face Docker Space copy of the app.

It is intentionally separate from the desktop version and tuned for a **free CPU-first Space**:

- STT: `faster-whisper` on CPU
- LLM: `Qwen/Qwen2.5-1.5B-Instruct` through local `transformers`
- TTS: Kokoro on CPU

There is now an optional STT comparison path for:

- `nvidia/parakeet-tdt-0.6b-v3` through NeMo on CPU

## Important constraints

- This is a **cold-starting Space** on free hardware.
- It is expected to be **slower** than the local GPU version.
- The local model is configured to **hide reasoning output** and answer briefly for voice use.
- The first load is intentionally front-loaded at startup so the reply watchdog does not fire during model warmup.

## Runtime profile

The default Space profile lives in:

```bash
.env.hf-free.example
```

At startup, `start.sh` copies that file to `.env` if no `.env` already exists.

To compare Whisper with Parakeet v3, change:

```bash
VOICE_LAB_STT_BACKEND=parakeet-tdt-v3
```

and keep the default Whisper values if you want to switch back quickly.

You can also test Parakeet without Silero gating:

```bash
VOICE_LAB_PARAKEET_BYPASS_SILERO_VAD=true
```

That keeps the app turn-taking logic but switches the speech gate to a simpler RMS-based path for the Parakeet backend only.

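
For readers unfamiliar with RMS gating, here is a minimal sketch of what such a gate could look like. It is not the app's actual implementation: the `RmsGate` class, the `frame_rms` helper, and the 20 ms frame size are hypothetical names and values chosen for illustration. The threshold (`0.02`) and hold time (`180` ms) simply reuse the example values of the Parakeet barge-in knobs from this README, and the assumption that the gate opens only after the level stays above the threshold for that long is a guess at the intended behavior.

```python
import math
from dataclasses import dataclass, field
from typing import Sequence


def frame_rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one frame of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


@dataclass
class RmsGate:
    """Hypothetical energy gate: opens after sustained loud audio.

    All names and defaults are illustrative assumptions, not the app's code.
    """

    threshold: float = 0.02   # RMS level treated as "speech" (assumed meaning)
    start_ms: float = 180.0   # how long the level must stay above threshold
    frame_ms: float = 20.0    # duration of each frame passed to update()
    _loud_ms: float = field(default=0.0, init=False, repr=False)

    def update(self, samples: Sequence[float]) -> bool:
        """Feed one frame; return True once speech has lasted start_ms."""
        if frame_rms(samples) >= self.threshold:
            self._loud_ms += self.frame_ms
        else:
            # Any quiet frame resets the run, so short clicks do not trigger.
            self._loud_ms = 0.0
        return self._loud_ms >= self.start_ms


if __name__ == "__main__":
    gate = RmsGate()
    loud_frame = [0.1] * 320  # 20 ms at 16 kHz, constant amplitude
    for index in range(1, 10):
        print(index, gate.update(loud_frame))
```

With 20 ms frames, this sketch opens on the ninth consecutive loud frame (9 × 20 ms = 180 ms) and closes again on the first quiet frame. A gate like this is cheaper than a neural VAD but reacts to any loud sound, which is why a threshold and a minimum duration are both useful to tune.
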

Use the separate Parakeet barge-in knobs if assistant interruption becomes too sensitive:

```bash
VOICE_LAB_PARAKEET_BARGE_IN_RMS_THRESHOLD=0.02
VOICE_LAB_PARAKEET_BARGE_IN_START_MS=180
```

## What this Space is for

- waking on visit
- testing a remote phone-friendly voice loop
- keeping the desktop-only stack separate

## What this Space is not for

- matching the latency of the local GPU setup
- expecting Parakeet v3 on free hardware to be fast or cheap
- running the desktop-only `my-agent-cli` backend

## Later upgrade path

If you switch this Space to paid GPU hardware later, this copy is the place to test:

- different STT backends
- Parakeet variants
- a different local LLM profile

without disturbing the original local app.