# Docker Deployment - Multi-Threaded Voice Agent

## Overview

This Docker container runs the complete multi-threaded telco voice agent stack:
- **LangGraph Server** (`langgraph dev`) on port 2024
- **Pipecat Pipeline** (FastAPI + WebRTC) on port 7860
- **React UI** served at `http://localhost:7860`

## Quick Start

### Build the Image

```bash
# From project root
docker build -t voice-agent-multi-thread .
```

### Run the Container

```bash
docker run -p 7860:7860 \
  -e RIVA_API_KEY=your_nvidia_api_key \
  -e NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409 \
  -e NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d \
  voice-agent-multi-thread
```

### Access the UI

Open your browser to: **http://localhost:7860**

## What Happens Inside the Container

The `start.sh` script orchestrates two processes:

### 1. LangGraph Server (Port 2024)
```bash
cd /app/examples/voice_agent_multi_thread/agents
uv run langgraph dev --no-browser --host 0.0.0.0 --port 2024
```

This runs the multi-threaded telco agent with:
- Main thread for long operations
- Secondary thread for interim queries
- Store-based coordination

### 2. Pipecat Pipeline (Port 7860)
```bash
cd /app/examples/voice_agent_multi_thread
uv run pipeline.py
```

This runs the voice pipeline with:
- WebRTC transport
- RIVA ASR (speech-to-text)
- LangGraphLLMService (multi-threaded routing)
- RIVA TTS (text-to-speech)
- React UI
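The orchestration described above can be sketched as a minimal `start.sh`. This is a hypothetical reconstruction of the technique, not the actual script shipped in the repo, which may differ:

```shell
#!/bin/sh
# Hypothetical sketch of start.sh's orchestration -- the real script may differ.

# Start the LangGraph server in the background.
cd /app/examples/voice_agent_multi_thread/agents
uv run langgraph dev --no-browser --host 0.0.0.0 --port 2024 &
LANGGRAPH_PID=$!

# Give the agent runtime a moment to come up before the pipeline connects.
sleep 5

# Run the pipeline in the foreground so the container exits if it dies.
cd /app/examples/voice_agent_multi_thread
uv run pipeline.py

# When the pipeline exits, stop the LangGraph server too.
kill "$LANGGRAPH_PID" 2>/dev/null
```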

## Environment Variables

### Required

```bash
# NVIDIA API Key for RIVA services
RIVA_API_KEY=nvapi-xxxxx
```

### Optional

```bash
# LangGraph Configuration
LANGGRAPH_HOST=0.0.0.0
LANGGRAPH_PORT=2024
LANGGRAPH_ASSISTANT=telco-agent

# User Configuration
USER_EMAIL=user@example.com

# ASR Configuration
NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409
RIVA_ASR_LANGUAGE=en-US
RIVA_ASR_MODEL=parakeet-1.1b-en-US-asr-streaming-silero-vad-asr-bls-ensemble

# TTS Configuration
NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d
RIVA_TTS_VOICE_ID=Magpie-ZeroShot.Female-1
RIVA_TTS_MODEL=magpie_tts_ensemble-Magpie-ZeroShot
RIVA_TTS_LANGUAGE=en-US

# Zero-shot audio prompt (optional)
ZERO_SHOT_AUDIO_PROMPT_URL=https://github.com/your-repo/audio-prompt.wav

# Multi-threading (default: true)
ENABLE_MULTI_THREADING=true

# Debug
LANGGRAPH_DEBUG_STREAM=false
```

## Docker Compose

Create `docker-compose.yml`:

```yaml
version: '3.8'

services:
  voice-agent:
    build: .
    ports:
      - "7860:7860"
    environment:
      - RIVA_API_KEY=${RIVA_API_KEY}
      - NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409
      - NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d
      - USER_EMAIL=user@example.com
      - LANGGRAPH_ASSISTANT=telco-agent
      - ENABLE_MULTI_THREADING=true
    volumes:
      # Optional: mount .env file
      - ./examples/voice_agent_multi_thread/.env:/app/examples/voice_agent_multi_thread/.env:ro
      # Optional: persist audio recordings
      - ./audio_dumps:/app/examples/voice_agent_multi_thread/audio_dumps
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7860/get_prompt"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
```

Run with:
```bash
docker-compose up        # Compose v1
# or, with the Compose v2 plugin:
docker compose up
```

## Using .env File

Create `.env` in `examples/voice_agent_multi_thread/`:

```bash
# NVIDIA API Keys
RIVA_API_KEY=nvapi-xxxxx

# LangGraph
LANGGRAPH_ASSISTANT=telco-agent
LANGGRAPH_BASE_URL=http://127.0.0.1:2024

# User
USER_EMAIL=test@example.com

# ASR
NVIDIA_ASR_FUNCTION_ID=52b117d2-6c15-4cfa-a905-a67013bee409

# TTS
NVIDIA_TTS_FUNCTION_ID=4e813649-d5e4-4020-b2be-2b918396d19d
RIVA_TTS_VOICE_ID=Magpie-ZeroShot.Female-1
```

The `start.sh` script automatically loads this file.
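One common way a startup script loads such a file is `set -a`, which auto-exports every variable assigned while it is active. This is a sketch of the technique with illustrative values, not necessarily how `start.sh` implements it:

```shell
# Write a sample .env for demonstration (illustrative values).
cat > /tmp/demo.env <<'EOF'
LANGGRAPH_ASSISTANT=telco-agent
USER_EMAIL=test@example.com
EOF

# set -a marks every subsequently assigned variable for export,
# so sourcing the file publishes its entries to child processes.
set -a
. /tmp/demo.env
set +a

echo "$LANGGRAPH_ASSISTANT"
```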

## Ports

| Service | Internal Port | External Port | Purpose |
|---------|---------------|---------------|---------|
| LangGraph Server | 2024 | - | Agent runtime (internal only) |
| Pipecat Pipeline | 7860 | 7860 | WebRTC + HTTP API |
| React UI | - | 7860 | Served by pipeline |

**Note**: Only port 7860 is exposed externally. LangGraph runs internally on 2024.

## Healthcheck

The container includes a healthcheck that verifies the pipeline is responding:

```bash
curl -f http://localhost:7860/get_prompt
```

Check health status:
```bash
docker ps
# Look for "(healthy)" in STATUS column
```
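For the health state alone, `docker inspect` can format it directly:

```shell
# Prints "healthy", "unhealthy", or "starting" for the given container.
docker inspect --format '{{.State.Health.Status}}' <container-id>
```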

## Logs

View all logs:
```bash
docker logs -f <container-id>
```

You'll see both:
- LangGraph server startup and agent logs
- Pipeline startup and WebRTC connection logs

## Testing Multi-Threading

1. **Open UI**: http://localhost:7860
2. **Select Agent**: Choose "Telco Agent"
3. **Test Long Operation**:
   - Say: *"Close my contract"*
   - Confirm: *"Yes"*
   - Operation starts (50 seconds)
4. **Test Secondary Thread**:
   - While waiting, say: *"What's the status?"*
   - Agent responds with progress
   - Say: *"How much data do I have left?"*
   - Agent answers while main operation continues

## Troubleshooting

### Container won't start
```bash
# Check logs
docker logs <container-id>

# Common issues:
# 1. Missing RIVA_API_KEY
# 2. Port 7860 already in use
# 3. Insufficient memory
```

### LangGraph not starting
```bash
# Check if agents directory exists
docker exec <container-id> ls -la /app/examples/voice_agent_multi_thread/agents

# Check langgraph.json
docker exec <container-id> cat /app/examples/voice_agent_multi_thread/agents/langgraph.json
```

### Pipeline not responding
```bash
# Check pipeline logs
docker logs <container-id> 2>&1 | grep pipeline

# Check if port is accessible
curl http://localhost:7860/get_prompt
```

### Multi-threading not working
```bash
# Verify env var
docker exec <container-id> env | grep MULTI_THREADING

# Check LangGraph server
docker exec <container-id> curl http://localhost:2024/assistants
```

## Development Mode

To develop inside the container:

```bash
# Run with shell
docker run -it -p 7860:7860 \
  -v "$(pwd)/examples/voice_agent_multi_thread:/app/examples/voice_agent_multi_thread" \
  voice-agent-multi-thread /bin/bash

# Inside container:
cd /app/examples/voice_agent_multi_thread

# Start services manually
cd agents && uv run langgraph dev &
cd .. && uv run pipeline.py
```

## Building for Production

### Multi-stage optimization
The Dockerfile uses a multi-stage build:
1. **ui-builder**: Compiles React UI
2. **python base**: Installs Python dependencies
3. **Final image**: ~2GB (UI + Python + agents)

### Reducing image size
```dockerfile
# Use slim Python base (already done)
FROM python:3.12-slim

# Clean up build artifacts (already done)
RUN apt-get clean && rm -rf /var/lib/apt/lists/*

# Use uv for faster installs (already done)
RUN pip install uv
```

## Security Considerations

1. **Non-root user**: Container runs as UID 1000
2. **No secrets in image**: Use environment variables or mount secrets
3. **Read-only filesystem**: the UI bundle is built at image-build time, so it can be served read-only at runtime
4. **Health checks**: Automatic restart on failure

## Performance

- **Startup time**: ~30-60 seconds
- **Memory**: ~2GB recommended
- **CPU**: 2 cores minimum
- **Storage**: ~3GB for image + runtime
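These figures can be enforced at run time with Docker's resource flags; the limits below match the recommendations above:

```shell
# Cap the container at 2 CPUs and 2 GB of RAM, matching the
# recommended footprint above.
docker run -p 7860:7860 \
  --cpus=2 --memory=2g \
  -e RIVA_API_KEY=your_nvidia_api_key \
  voice-agent-multi-thread
```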

## Related Files

- `Dockerfile` - Container definition
- `start.sh` - Startup orchestration
- `agents/langgraph.json` - Agent configuration
- `pipeline.py` - Pipecat pipeline
- `langgraph_llm_service.py` - Multi-threaded LLM service

## Support

For issues:
1. Check logs: `docker logs <container-id>`
2. Verify environment variables
3. Test components individually (LangGraph, Pipeline)
4. Review `PIPECAT_MULTI_THREADING.md` for architecture details