title: Voice Latency Lab
emoji: 🎙️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Voice chat app with STT, Qwen 1.5B, and Kokoro.

Voice Latency Lab

This is the Hugging Face Docker Space copy of the app.

It is intentionally separate from the desktop version and tuned for a free CPU-first Space:

STT: faster-whisper on CPU
LLM: Qwen/Qwen2.5-1.5B-Instruct through local transformers
TTS: Kokoro on CPU

An optional STT comparison backend is also available:

nvidia/parakeet-tdt-0.6b-v3 through NeMo on CPU

Important constraints

This is a cold-starting Space on free hardware.
It is expected to be slower than the local GPU version.
The local model is configured to hide reasoning output and answer briefly for voice use.
Model loading is deliberately done up front at startup, so the reply watchdog does not fire while the models are still warming up.
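
The startup ordering described above can be sketched roughly as follows. This is an illustration only, not the app's actual code: `warm_up`, `ReplyWatchdog`, and the loader callables are hypothetical names, and the real watchdog may work differently. The idea is that every model is loaded before any reply timer can be armed.

```python
import threading


def warm_up(loaders):
    """Run each (hypothetical) loader once, in order, before serving requests."""
    loaded = {}
    for name, load in loaders.items():
        loaded[name] = load()  # blocks until this model is ready
    return loaded


class ReplyWatchdog:
    """Calls on_timeout if a reply is not finished within timeout_s seconds."""

    def __init__(self, timeout_s, on_timeout):
        self.timeout_s = timeout_s
        self.on_timeout = on_timeout
        self._timer = None

    def start(self):
        # Arm (or re-arm) the timer at the beginning of a reply.
        self.cancel()
        self._timer = threading.Timer(self.timeout_s, self.on_timeout)
        self._timer.daemon = True
        self._timer.start()

    def cancel(self):
        # Disarm the timer once the reply has completed.
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
```

Because `warm_up` returns only after every loader has finished, a watchdog started afterwards measures reply time alone and never includes model load time.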

Runtime profile

The default Space profile lives in:

.env.hf-free.example

At startup, start.sh copies that file to .env if no .env already exists.
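
The copy-if-missing rule can be expressed as the short sketch below. The real logic lives in start.sh, which is not reproduced here; `ensure_env_file` is a hypothetical Python helper that only mirrors the behavior described above.

```python
import shutil
from pathlib import Path


def ensure_env_file(app_dir: Path,
                    template_name: str = ".env.hf-free.example",
                    target_name: str = ".env") -> bool:
    """Copy the template to .env only when .env is absent.

    Returns True if a copy was made, False if an existing .env was kept.
    """
    template = app_dir / template_name
    target = app_dir / target_name
    if target.exists():
        # An existing .env is never overwritten.
        return False
    shutil.copyfile(template, target)
    return True
```

The practical consequence is that a .env already present in the Space takes precedence over the default profile.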

To compare Whisper with Parakeet v3, set:

VOICE_LAB_STT_BACKEND=parakeet-tdt-v3

Leave the default Whisper settings in place so you can switch back quickly.
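
As an illustration of how a backend switch like this is commonly read, here is a minimal sketch. Only the variable name and the value `parakeet-tdt-v3` come from this README; the default value `faster-whisper`, the set of accepted names, and the function `resolve_stt_backend` are assumptions made for the example.

```python
import os

# Only "parakeet-tdt-v3" is documented in this README.
# "faster-whisper" is an assumed name for the default backend value.
KNOWN_STT_BACKENDS = {"faster-whisper", "parakeet-tdt-v3"}


def resolve_stt_backend(env=None, default="faster-whisper"):
    """Return the STT backend name selected by VOICE_LAB_STT_BACKEND."""
    env = os.environ if env is None else env
    value = env.get("VOICE_LAB_STT_BACKEND", default).strip().lower()
    if value not in KNOWN_STT_BACKENDS:
        # Fail early rather than silently falling back to another backend.
        raise ValueError(f"unknown STT backend: {value!r}")
    return value
```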

You can also test Parakeet without Silero gating:

VOICE_LAB_PARAKEET_BYPASS_SILERO_VAD=true

This keeps the app's turn-taking logic but switches the speech gate to a simpler RMS-based path for the Parakeet backend only. If assistant interruption becomes too sensitive, adjust the separate Parakeet barge-in settings:

VOICE_LAB_PARAKEET_BARGE_IN_RMS_THRESHOLD=0.02
VOICE_LAB_PARAKEET_BARGE_IN_START_MS=180
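
To make the two knobs concrete, here is a minimal sketch of an RMS-based gate. It assumes the threshold applies to float samples in the range [-1.0, 1.0] and that START_MS is the length of continuous loud audio required before an interruption is triggered. Both interpretations are assumptions, and `RmsBargeInGate` is a hypothetical class, not the app's implementation.

```python
import math
from dataclasses import dataclass


def rms(frame):
    """Root-mean-square level of one frame of float samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


@dataclass
class RmsBargeInGate:
    threshold: float = 0.02  # assumed meaning of ..._BARGE_IN_RMS_THRESHOLD
    start_ms: float = 180.0  # assumed meaning of ..._BARGE_IN_START_MS
    loud_ms: float = 0.0     # length of the current uninterrupted loud run

    def push(self, frame, frame_ms):
        """Feed one frame; return True once loud audio has lasted start_ms."""
        if rms(frame) >= self.threshold:
            self.loud_ms += frame_ms
        else:
            self.loud_ms = 0.0  # any quiet frame resets the run
        return self.loud_ms >= self.start_ms
```

Under these assumptions, raising either value makes interruption less sensitive: a higher threshold ignores quieter sounds, and a longer start window ignores shorter bursts.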

What this Space is for

waking on visit
testing a remote phone-friendly voice loop
keeping the desktop-only stack separate

What this Space is not for

matching the latency of the local GPU setup
expecting Parakeet v3 on free hardware to be fast or cheap
running the desktop-only my-agent-cli backend

Later upgrade path

If you switch this Space to paid GPU hardware later, this copy is the place to test:

different STT backends
Parakeet variants
a different local LLM profile

without disturbing the original local app.