voice-agent / README.md

RalphThings

Deploy Hugging Face Space

a8bcb70 20 days ago

	---
	title: Voice Latency Lab
	emoji: 🎙️
	colorFrom: blue
	colorTo: indigo
	sdk: docker
	app_port: 7860
	pinned: false
	license: mit
	short_description: Voice chat app with STT, Qwen 1.5B, and Kokoro.
	---

	# Voice Latency Lab

	This is the Hugging Face Docker Space copy of the app.

	It is intentionally separate from the desktop version and tuned for a free CPU-first Space:

	- STT: `faster-whisper` on CPU
	- LLM: `Qwen/Qwen2.5-1.5B-Instruct` through local `transformers`
	- TTS: Kokoro on CPU

	An optional STT comparison path is also available:

	- `nvidia/parakeet-tdt-0.6b-v3` through NeMo on CPU

	## Important constraints

	- This is a cold-starting Space on free hardware.
	- It is expected to be slower than the local GPU version.
	- The local model is configured to hide reasoning output and answer briefly for voice use.
	- Models are loaded at startup rather than on the first request, so the reply watchdog does not fire during model warmup.
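
The last constraint above can be pictured with a minimal Python sketch. It assumes the app loads its models through plain callables and only starts timing replies once all of them are ready. The names `warm_up` and `ReplyWatchdog`, the stand-in loaders, and the 20-second timeout are hypothetical and not taken from the app's source.

```python
import time
from typing import Callable, Dict, Optional


def warm_up(loaders: Dict[str, Callable[[], object]]) -> Dict[str, object]:
    """Eagerly call every loader so no model is loaded lazily on the first reply."""
    models = {}
    for name, load in loaders.items():
        models[name] = load()
    return models


class ReplyWatchdog:
    """Hypothetical watchdog: flags a reply that takes longer than timeout_s."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._deadline: Optional[float] = None  # None means "not armed"

    def arm(self) -> None:
        # Called when a user turn ends and a reply is expected.
        self._deadline = time.monotonic() + self.timeout_s

    def disarm(self) -> None:
        # Called when the reply has been delivered.
        self._deadline = None

    def expired(self) -> bool:
        return self._deadline is not None and time.monotonic() > self._deadline


# Stand-in loaders; a real app would load its STT, LLM and TTS models here.
models = warm_up({
    "stt": lambda: "stt-model",
    "llm": lambda: "llm-model",
    "tts": lambda: "tts-model",
})

# The watchdog is created only after warmup, so load time never counts against it.
watchdog = ReplyWatchdog(timeout_s=20.0)  # timeout value is an assumption
```

Because the watchdog is only armed per turn, after `warm_up` has returned, slow cold-start loading cannot be mistaken for a stalled reply.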

	## Runtime profile

	The default Space profile lives in:

	```bash
	.env.hf-free.example
	```

	At startup, `start.sh` copies that file to `.env` if no `.env` already exists.
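
`start.sh` itself is not reproduced in this README. The copy-if-missing behaviour described above is equivalent to the following Python sketch; `ensure_env` is a hypothetical helper written for illustration, not code from the app.

```python
import shutil
from pathlib import Path


def ensure_env(app_dir: Path, profile_name: str = ".env.hf-free.example") -> bool:
    """Copy the default profile to .env unless .env already exists.

    Returns True when a copy was made, False when an existing .env was kept.
    """
    env_path = app_dir / ".env"
    if env_path.exists():
        return False  # never overwrite an existing .env
    shutil.copyfile(app_dir / profile_name, env_path)
    return True
```

The practical consequence is that an existing `.env` always wins: edits to `.env.hf-free.example` only take effect when no `.env` is present at startup.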

	To compare Whisper with Parakeet v3, set:

	```bash
	VOICE_LAB_STT_BACKEND=parakeet-tdt-v3
	```

	Leave the default Whisper values in place so you can switch back quickly.
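
If you script the switch instead of editing by hand, a sketch like the one below changes a single key and leaves every other line, including the Whisper settings, untouched. `set_env_value` is a hypothetical helper, and it assumes simple `KEY=value` lines with no quoting or `export` prefixes.

```python
def set_env_value(env_text: str, key: str, value: str) -> str:
    """Return env_text with `key` set to `value`, leaving every other line untouched."""
    lines = env_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Match only active assignments, not commented-out ones.
        if "=" in line and line.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        # Key was absent: append it instead of silently doing nothing.
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"
```

Switching back is the same call with the original backend value, which is why the other Whisper settings are best left where they are.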

	You can also test Parakeet without Silero gating:

	```bash
	VOICE_LAB_PARAKEET_BYPASS_SILERO_VAD=true
	```

	This keeps the app's turn-taking logic but switches the speech gate to a simpler RMS-based path, for the Parakeet backend only.
	If assistant interruption (barge-in) becomes too sensitive, adjust the separate Parakeet barge-in settings:

	```bash
	VOICE_LAB_PARAKEET_BARGE_IN_RMS_THRESHOLD=0.02
	VOICE_LAB_PARAKEET_BARGE_IN_START_MS=180
	```
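
The app's actual gate is not shown in this README. As a rough illustration of how an RMS threshold and a start window can work together, the Python sketch below assumes that barge-in triggers once consecutive frames at or above the threshold add up to the start window. That interpretation, the function names, and the float sample format are assumptions.

```python
import math
from typing import Iterable, Sequence


def frame_rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one frame of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def barge_in_detected(
    frames: Iterable[Sequence[float]],
    frame_ms: float,
    rms_threshold: float = 0.02,  # mirrors VOICE_LAB_PARAKEET_BARGE_IN_RMS_THRESHOLD
    start_ms: float = 180.0,      # mirrors VOICE_LAB_PARAKEET_BARGE_IN_START_MS
) -> bool:
    """Return True once consecutive loud frames span at least start_ms."""
    loud_ms = 0.0
    for frame in frames:
        if frame_rms(frame) >= rms_threshold:
            loud_ms += frame_ms
            if loud_ms >= start_ms:
                return True
        else:
            loud_ms = 0.0  # a quiet frame resets the run
    return False
```

Under this reading, raising either value makes interruption less sensitive: a higher threshold ignores quieter sounds, and a longer start window ignores shorter bursts.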

	## What this Space is for

	- waking on visit
	- testing a remote phone-friendly voice loop
	- keeping the desktop-only stack separate

	## What this Space is not for

	- matching the latency of the local GPU setup
	- fast or cheap Parakeet v3 inference on free hardware
	- running the desktop-only `my-agent-cli` backend

	## Later upgrade path

	If you switch this Space to paid GPU hardware later, this copy is the place to test:

	- different STT backends
	- Parakeet variants
	- a different local LLM profile

	without disturbing the original local app.