---
title: Voice Latency Lab
emoji: 🎙️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Voice chat app with STT, Qwen 1.5B, and Kokoro.
---
# Voice Latency Lab
This is the Hugging Face Docker Space copy of the app.
It is intentionally separate from the desktop version and tuned for a **free CPU-first Space**:
- STT: `faster-whisper` on CPU
- LLM: `Qwen/Qwen2.5-1.5B-Instruct` through local `transformers`
- TTS: Kokoro on CPU
An optional STT backend is also available for comparison:
- `nvidia/parakeet-tdt-0.6b-v3` through NeMo on CPU
## Important constraints
- This is a **cold-starting Space** on free hardware.
- It is expected to be **slower** than the local GPU version.
- The local model is configured to **hide reasoning output** and answer briefly for voice use.
- Model loading is intentionally front-loaded at startup so the reply watchdog does not fire during model warmup.
## Runtime profile
The default Space profile lives in:
```bash
.env.hf-free.example
```
At startup, `start.sh` copies that file to `.env` if no `.env` already exists.
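The copy-if-missing step can be sketched as below. This is a minimal illustration, not the actual contents of `start.sh`; the function name `ensure_env_file` and its optional arguments are hypothetical.

```bash
# Hypothetical sketch of the copy-if-missing step; the real start.sh may differ.
ensure_env_file() {
  profile="${1:-.env.hf-free.example}"  # default Space profile
  target="${2:-.env}"                   # file the app is assumed to read

  # Leave an existing .env untouched so manual overrides survive restarts.
  if [ -f "$target" ]; then
    return 0
  fi

  # Otherwise seed .env from the default profile.
  cp "$profile" "$target"
}
```

Because an existing `.env` is never overwritten, any edits you make to it persist until you delete the file.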
To compare Whisper with Parakeet v3, set the STT backend in `.env`:
```bash
VOICE_LAB_STT_BACKEND=parakeet-tdt-v3
```
Leave the default Whisper values in place so you can switch back quickly.
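If you switch backends often, a small helper can make the edit repeatable. The sketch below is not part of the app; `set_env_var` is a hypothetical function that replaces an existing `KEY=` line in an env file or appends one if the key is missing. It assumes values contain no `|`, `&`, or backslash characters.

```bash
# Hypothetical helper: set KEY=VALUE in an env file (default: .env).
# Assumes the value contains no '|', '&', or backslash characters.
set_env_var() {
  key="$1"
  value="$2"
  file="${3:-.env}"

  if grep -q "^${key}=" "$file" 2>/dev/null; then
    # Rewrite the existing assignment via a temp file, because
    # in-place sed (-i) behaves differently on GNU and BSD systems.
    tmp_file="$(mktemp)"
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp_file"
    cat "$tmp_file" > "$file"
    rm -f "$tmp_file"
  else
    # Key not present yet (or file missing): append a new assignment.
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}
```

For example, `set_env_var VOICE_LAB_STT_BACKEND parakeet-tdt-v3` switches the backend. Copying `.env` to a backup first (say `.env.whisper.bak`, a hypothetical name) gives you a one-step way back to the Whisper defaults.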
You can also test Parakeet without Silero gating:
```bash
VOICE_LAB_PARAKEET_BYPASS_SILERO_VAD=true
```
This keeps the app's turn-taking logic but switches the speech gate to a simpler RMS-based path, for the Parakeet backend only.
If assistant interruption (barge-in) becomes too sensitive, adjust the separate Parakeet barge-in settings:
```bash
VOICE_LAB_PARAKEET_BARGE_IN_RMS_THRESHOLD=0.02
VOICE_LAB_PARAKEET_BARGE_IN_START_MS=180
```
## What this Space is for
- waking on demand when someone visits
- testing a remote phone-friendly voice loop
- keeping the desktop-only stack separate
## What this Space is not for
- matching the latency of the local GPU setup
- running Parakeet v3 quickly or cheaply on free hardware
- running the desktop-only `my-agent-cli` backend
## Later upgrade path
If you switch this Space to paid GPU hardware later, this copy is the place to test the following without disturbing the original local app:
- different STT backends
- Parakeet variants
- a different local LLM profile