| title: Voice Latency Lab | |
| emoji: 🎙️ | |
| colorFrom: blue | |
| colorTo: indigo | |
| sdk: docker | |
| app_port: 7860 | |
| pinned: false | |
| license: mit | |
| short_description: Voice chat app with STT, Qwen 1.5B, and Kokoro. | |
| # Voice Latency Lab | |
| This is the Hugging Face Docker Space copy of the app. | |
| It is intentionally separate from the desktop version and tuned for a **free CPU-first Space**: | |
| - STT: `faster-whisper` on CPU | |
| - LLM: `Qwen/Qwen2.5-1.5B-Instruct` through local `transformers` | |
| - TTS: Kokoro on CPU | |
| An optional second STT backend is available for comparison: | |
| - `nvidia/parakeet-tdt-0.6b-v3` through NeMo on CPU | |
| ## Important constraints | |
| - This is a **cold-starting Space** on free hardware. | |
| - It is expected to be **slower** than the local GPU version. | |
| - The local model is configured to **hide reasoning output** and answer briefly for voice use. | |
| - Model loading is intentionally front-loaded at startup so the reply watchdog does not fire during model warmup. | |
| ## Runtime profile | |
| The default Space profile lives in: | |
| ```bash | |
| .env.hf-free.example | |
| ``` | |
| At startup, `start.sh` copies that file to `.env` if no `.env` already exists. | |
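The copy-if-missing behaviour described above can be sketched as follows. This is a minimal illustration, not the actual contents of `start.sh`; the function name `seed_env_file` and its directory argument are hypothetical.

```shell
# Hypothetical sketch of the copy-if-missing step; the real start.sh may differ.
# Copies .env.hf-free.example to .env inside the given directory (default: the
# current directory), but only when no .env exists yet.
seed_env_file() {
  dir=${1:-.}
  if [ ! -f "$dir/.env" ] && [ -f "$dir/.env.hf-free.example" ]; then
    cp "$dir/.env.hf-free.example" "$dir/.env"
  fi
}
```

Because an existing `.env` is never overwritten, local edits to it survive a restart under this sketch.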
| To compare Whisper with Parakeet v3, set: | |
| ```bash | |
| VOICE_LAB_STT_BACKEND=parakeet-tdt-v3 | |
| ``` | |
| Keep the default Whisper values in place so you can switch back quickly. | |
| You can also test Parakeet without Silero gating: | |
| ```bash | |
| VOICE_LAB_PARAKEET_BYPASS_SILERO_VAD=true | |
| ``` | |
| That keeps the app turn-taking logic but switches the speech gate to a simpler RMS-based path for the Parakeet backend only. | |
| Use the separate Parakeet barge-in knobs if assistant interruption becomes too sensitive: | |
| ```bash | |
| VOICE_LAB_PARAKEET_BARGE_IN_RMS_THRESHOLD=0.02 | |
| VOICE_LAB_PARAKEET_BARGE_IN_START_MS=180 | |
| ``` | |
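To build intuition for how the two knobs above might interact, here is a toy model. It assumes that a barge-in fires once the per-frame RMS has stayed at or above the threshold for a continuous run of `START_MS` milliseconds; that semantics, the 20 ms frame length, and the function name `barge_in_frame` are all assumptions for illustration, not the app's confirmed logic.

```shell
# Hypothetical illustration only: the app's real gate logic is not shown here.
# Reads one per-frame RMS value per line on stdin and prints the 1-based index
# of the first frame at which a barge-in would trigger, or "none".
barge_in_frame() {
  threshold=${VOICE_LAB_PARAKEET_BARGE_IN_RMS_THRESHOLD:-0.02}
  start_ms=${VOICE_LAB_PARAKEET_BARGE_IN_START_MS:-180}
  frame_ms=${1:-20}   # assumed frame length in milliseconds
  awk -v thr="$threshold" -v need_ms="$start_ms" -v frame_ms="$frame_ms" '
    {
      # Extend the loud run, or reset it as soon as one frame falls below thr.
      if ($1 + 0 >= thr) run_ms += frame_ms; else run_ms = 0
      if (run_ms >= need_ms) { print NR; found = 1; exit }
    }
    END { if (!found) print "none" }
  '
}
```

Under this model, raising either value makes interruption less sensitive: a higher threshold ignores quieter sounds, and a longer start window ignores short bursts.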
| ## What this Space is for | |
| - waking from sleep when someone visits | |
| - testing a remote phone-friendly voice loop | |
| - keeping the desktop-only stack separate | |
| ## What this Space is not for | |
| - matching the latency of the local GPU setup | |
| - running Parakeet v3 quickly or cheaply on free hardware | |
| - running the desktop-only `my-agent-cli` backend | |
| ## Later upgrade path | |
| If you switch this Space to paid GPU hardware later, this copy is the place to test: | |
| - different STT backends | |
| - Parakeet variants | |
| - a different local LLM profile | |
| without disturbing the original local app. | |