title: Voice Latency Lab
emoji: 🎙️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Voice chat app with STT, Qwen 1.5B, and Kokoro.

Voice Latency Lab

This is the Hugging Face Docker Space copy of the app.

It is intentionally separate from the desktop version and tuned for a free CPU-first Space:

  • STT: faster-whisper on CPU
  • LLM: Qwen/Qwen2.5-1.5B-Instruct through local transformers
  • TTS: Kokoro on CPU

An optional STT backend is also available for comparison:

  • nvidia/parakeet-tdt-0.6b-v3 through NeMo on CPU

Important constraints

  • This is a cold-starting Space on free hardware.
  • It is expected to be slower than the local GPU version.
  • The local model is configured to hide reasoning output and answer briefly for voice use.
  • Models are loaded at startup rather than on the first request, so the reply watchdog does not fire during model warmup.
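
To illustrate why loading at startup matters for the watchdog, here is a minimal sketch. It is not the app's code: `SlowModel`, `timed_reply`, and the timing values are hypothetical stand-ins, and the "watchdog" is reduced to a simple elapsed-time check instead of an asynchronous timer.

```python
import time


class SlowModel:
    """Hypothetical stand-in for a model with a one-time load cost."""

    def __init__(self, load_s: float):
        self.load_s = load_s
        self.loaded = False

    def warm_up(self) -> None:
        # Pay the load cost once; later calls return immediately.
        if not self.loaded:
            time.sleep(self.load_s)
            self.loaded = True

    def reply(self, text: str) -> str:
        # Falls back to a lazy load if startup did not warm the model.
        self.warm_up()
        return f"echo: {text}"


def timed_reply(model: SlowModel, text: str, timeout_s: float):
    """Return (answer, timed_out), where timed_out mimics a reply watchdog."""
    start = time.monotonic()
    answer = model.reply(text)
    timed_out = (time.monotonic() - start) > timeout_s
    return answer, timed_out


# Cold model: the first reply pays the load cost and exceeds the timeout.
cold = SlowModel(load_s=0.2)
_, cold_timed_out = timed_reply(cold, "hi", timeout_s=0.1)

# Warm model: the load cost was paid at startup, so the reply is fast.
warm = SlowModel(load_s=0.2)
warm.warm_up()
_, warm_timed_out = timed_reply(warm, "hi", timeout_s=0.1)
```

In this sketch the cold model trips the timeout on its first reply and the warmed model does not, which is the behavior that loading at startup is meant to secure.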

Runtime profile

The default Space profile lives in:

.env.hf-free.example

At startup, start.sh copies that file to .env if no .env already exists.
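
The copy-if-missing behavior can be sketched as follows. The real logic lives in start.sh, which is a shell script and is not reproduced here; this Python version only illustrates the rule, and the function name `ensure_env_file` is hypothetical.

```python
import shutil
from pathlib import Path


def ensure_env_file(app_dir: Path, template: str = ".env.hf-free.example") -> bool:
    """Copy the default profile to .env unless .env already exists.

    Returns True if a copy was made, False if an existing .env was kept.
    """
    env_path = app_dir / ".env"
    if env_path.exists():
        # An existing .env always wins, so local overrides survive restarts.
        return False
    shutil.copyfile(app_dir / template, env_path)
    return True
```

Because an existing .env is never overwritten, any values you change there are kept across restarts as long as the file itself persists.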

To compare Whisper with Parakeet v3, set:

VOICE_LAB_STT_BACKEND=parakeet-tdt-v3

Keep the default Whisper values in place so you can switch back quickly.

You can also test Parakeet without Silero VAD gating:

VOICE_LAB_PARAKEET_BYPASS_SILERO_VAD=true

This keeps the app's turn-taking logic but switches the speech gate to a simpler RMS-based path, for the Parakeet backend only. If assistant interruption becomes too sensitive, adjust the separate Parakeet barge-in settings:

VOICE_LAB_PARAKEET_BARGE_IN_RMS_THRESHOLD=0.02
VOICE_LAB_PARAKEET_BARGE_IN_START_MS=180

What this Space is for

  • waking on visit
  • testing a remote phone-friendly voice loop
  • keeping the desktop-only stack separate

What this Space is not for

  • matching the latency of the local GPU setup
  • running Parakeet v3 quickly or cheaply on free hardware
  • running the desktop-only my-agent-cli backend

Later upgrade path

If you switch this Space to paid GPU hardware later, this copy is the place to test:

  • different STT backends
  • Parakeet variants
  • a different local LLM profile

without disturbing the original local app.