Spaces:
Running
Running
Docker Deployment - Multi-Threaded Voice Agent
Overview
This Docker container runs the complete multi-threaded telco voice agent stack:
- LangGraph Server (
langgraph dev) on port 2024 - Pipecat Pipeline (FastAPI + WebRTC) on port 7860
- React UI served at
http://localhost:7860
Quick Start
Build the Image
# From project root
docker build -t voice-agent-multi-thread .
Run the Container
docker run -p 7860:7860 \
-e RIVA_API_KEY=your_nvidia_api_key \
-e NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409 \
-e NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d \
voice-agent-multi-thread
Access the UI
Open your browser to: http://localhost:7860
What Happens Inside the Container
The start.sh script orchestrates two processes:
1. LangGraph Server (Port 2024)
cd /app/examples/voice_agent_multi_thread/agents
uv run langgraph dev --no-browser --host 0.0.0.0 --port 2024
This runs the multi-threaded telco agent with:
- Main thread for long operations
- Secondary thread for interim queries
- Store-based coordination
2. Pipecat Pipeline (Port 7860)
cd /app/examples/voice_agent_multi_thread
uv run pipeline.py
This runs the voice pipeline with:
- WebRTC transport
- RIVA ASR (speech-to-text)
- LangGraphLLMService (multi-threaded routing)
- RIVA TTS (text-to-speech)
- React UI
Environment Variables
Required
# NVIDIA API Key for RIVA services
RIVA_API_KEY=nvapi-xxxxx
Optional
# LangGraph Configuration
LANGGRAPH_HOST=0.0.0.0
LANGGRAPH_PORT=2024
LANGGRAPH_ASSISTANT=telco-agent
# User Configuration
USER_EMAIL=user@example.com
# ASR Configuration
NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409
RIVA_ASR_LANGUAGE=en-US
RIVA_ASR_MODEL=parakeet-1.1b-en-US-asr-streaming-silero-vad-asr-bls-ensemble
# TTS Configuration
NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d
RIVA_TTS_VOICE_ID=Magpie-ZeroShot.Female-1
RIVA_TTS_MODEL=magpie_tts_ensemble-Magpie-ZeroShot
RIVA_TTS_LANGUAGE=en-US
# Zero-shot audio prompt (optional)
ZERO_SHOT_AUDIO_PROMPT_URL=https://github.com/your-repo/audio-prompt.wav
# Multi-threading (default: true)
ENABLE_MULTI_THREADING=true
# Debug
LANGGRAPH_DEBUG_STREAM=false
Docker Compose
Create docker-compose.yml:
version: '3.8'
services:
voice-agent:
build: .
ports:
- "7860:7860"
environment:
- RIVA_API_KEY=${RIVA_API_KEY}
- NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409
- NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d
- USER_EMAIL=user@example.com
- LANGGRAPH_ASSISTANT=telco-agent
- ENABLE_MULTI_THREADING=true
volumes:
# Optional: mount .env file
- ./examples/voice_agent_multi_thread/.env:/app/examples/voice_agent_multi_thread/.env:ro
# Optional: persist audio recordings
- ./audio_dumps:/app/examples/voice_agent_multi_thread/audio_dumps
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:7860/get_prompt"]
interval: 30s
timeout: 10s
retries: 3
start_period: 60s
Run with:
docker-compose up
Using .env File
Create .env in examples/voice_agent_multi_thread/:
# NVIDIA API Keys
RIVA_API_KEY=nvapi-xxxxx
# LangGraph
LANGGRAPH_ASSISTANT=telco-agent
LANGGRAPH_BASE_URL=http://127.0.0.1:2024
# User
USER_EMAIL=test@example.com
# ASR
NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409
# TTS
NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d
RIVA_TTS_VOICE_ID=Magpie-ZeroShot.Female-1
The start.sh script automatically loads this file.
Ports
| Service | Internal Port | External Port | Purpose |
|---|---|---|---|
| LangGraph Server | 2024 | - | Agent runtime (internal only) |
| Pipecat Pipeline | 7860 | 7860 | WebRTC + HTTP API |
| React UI | - | 7860 | Served by pipeline |
Note: Only port 7860 is exposed externally. LangGraph runs internally on 2024.
Healthcheck
The container includes a healthcheck that verifies the pipeline is responding:
curl -f http://localhost:7860/get_prompt
Check health status:
docker ps
# Look for "(healthy)" in STATUS column
Logs
View all logs:
docker logs -f <container-id>
You'll see both:
- LangGraph server startup and agent logs
- Pipeline startup and WebRTC connection logs
Testing Multi-Threading
- Open UI: http://localhost:7860
- Select Agent: Choose "Telco Agent"
- Test Long Operation:
- Say: "Close my contract"
- Confirm: "Yes"
- Operation starts (50 seconds)
- Test Secondary Thread:
- While waiting, say: "What's the status?"
- Agent responds with progress
- Say: "How much data do I have left?"
- Agent answers while main operation continues
Troubleshooting
Container won't start
# Check logs
docker logs <container-id>
# Common issues:
# 1. Missing RIVA_API_KEY
# 2. Port 7860 already in use
# 3. Insufficient memory
LangGraph not starting
# Check if agents directory exists
docker exec <container-id> ls -la /app/examples/voice_agent_multi_thread/agents
# Check langgraph.json
docker exec <container-id> cat /app/examples/voice_agent_multi_thread/agents/langgraph.json
Pipeline not responding
# Check pipeline logs
docker logs <container-id> 2>&1 | grep pipeline
# Check if port is accessible
curl http://localhost:7860/get_prompt
Multi-threading not working
# Verify env var
docker exec <container-id> env | grep MULTI_THREADING
# Check LangGraph server
docker exec <container-id> curl http://localhost:2024/assistants
Development Mode
To develop inside the container:
# Run with shell
docker run -it -p 7860:7860 \
-v $(pwd)/examples/voice_agent_multi_thread:/app/examples/voice_agent_multi_thread \
voice-agent-multi-thread /bin/bash
# Inside container:
cd /app/examples/voice_agent_multi_thread
# Start services manually
cd agents && uv run langgraph dev &
cd .. && uv run pipeline.py
Building for Production
Multi-stage optimization
The Dockerfile uses a multi-stage build:
- ui-builder: Compiles React UI
- python base: Installs Python dependencies
- Final image: ~2GB (UI + Python + agents)
Reducing image size
# Use slim Python base (already done)
FROM python:3.12-slim
# Clean up build artifacts (already done)
RUN apt-get clean && rm -rf /var/lib/apt/lists/*
# Use uv for faster installs (already done)
RUN pip install uv
Security Considerations
- Non-root user: Container runs as UID 1000
- No secrets in image: Use environment variables or mount secrets
- Read-only filesystem: UI dist is built at image time
- Health checks: Automatic restart on failure
Performance
- Startup time: ~30-60 seconds
- Memory: ~2GB recommended
- CPU: 2 cores minimum
- Storage: ~3GB for image + runtime
Related Files
Dockerfile- Container definitionstart.sh- Startup orchestrationagents/langgraph.json- Agent configurationpipeline.py- Pipecat pipelinelanggraph_llm_service.py- Multi-threaded LLM service
Support
For issues:
- Check logs:
docker logs <container-id> - Verify environment variables
- Test components individually (LangGraph, Pipeline)
- Review
PIPECAT_MULTI_THREADING.mdfor architecture details