Abeshith commited on
Commit
1813edc
·
0 Parent(s):

Voice BOT RAG Initial Commit

Browse files
This view is limited to 50 files because it contains too many changes.   See raw diff
Files changed (50) hide show
  1. .gitignore +2 -0
  2. README.md +389 -0
  3. START_SYSTEM.ps1 +64 -0
  4. backend/__init__.py +1 -0
  5. backend/__pycache__/__init__.cpython-311.pyc +0 -0
  6. backend/__pycache__/config.cpython-311.pyc +0 -0
  7. backend/__pycache__/main.cpython-311.pyc +0 -0
  8. backend/__pycache__/voice_bot_controller.cpython-311.pyc +0 -0
  9. backend/config.py +68 -0
  10. backend/main.py +241 -0
  11. backend/voice_bot_controller.py +134 -0
  12. data/latency_results.json +142 -0
  13. data/load_sample_data.py +201 -0
  14. data/sessions.db +0 -0
  15. data/test_latency.json +22 -0
  16. frontend/__pycache__/streamlit_app.cpython-311.pyc +0 -0
  17. frontend/streamlit_app.py +739 -0
  18. orchestration/__init__.py +1 -0
  19. orchestration/__pycache__/__init__.cpython-311.pyc +0 -0
  20. orchestration/__pycache__/langgraph_workflow.cpython-311.pyc +0 -0
  21. orchestration/__pycache__/latency_tracker.cpython-311.pyc +0 -0
  22. orchestration/__pycache__/state.cpython-311.pyc +0 -0
  23. orchestration/langgraph_workflow.py +119 -0
  24. orchestration/latency_tracker.py +123 -0
  25. orchestration/nodes/__init__.py +1 -0
  26. orchestration/nodes/__pycache__/__init__.cpython-311.pyc +0 -0
  27. orchestration/nodes/__pycache__/context_builder.cpython-311.pyc +0 -0
  28. orchestration/nodes/__pycache__/entity_extraction.cpython-311.pyc +0 -0
  29. orchestration/nodes/__pycache__/intent_detection.cpython-311.pyc +0 -0
  30. orchestration/nodes/__pycache__/memory_persistence.cpython-311.pyc +0 -0
  31. orchestration/nodes/__pycache__/response_generation.cpython-311.pyc +0 -0
  32. orchestration/nodes/__pycache__/retrieval_router.cpython-311.pyc +0 -0
  33. orchestration/nodes/__pycache__/sentiment_analysis.cpython-311.pyc +0 -0
  34. orchestration/nodes/__pycache__/sentiment_hybrid.cpython-311.pyc +0 -0
  35. orchestration/nodes/__pycache__/tts_generation.cpython-311.pyc +0 -0
  36. orchestration/nodes/__pycache__/validation.cpython-311.pyc +0 -0
  37. orchestration/nodes/context_builder.py +60 -0
  38. orchestration/nodes/entity_extraction.py +58 -0
  39. orchestration/nodes/intent_detection.py +61 -0
  40. orchestration/nodes/memory_persistence.py +45 -0
  41. orchestration/nodes/response_generation.py +93 -0
  42. orchestration/nodes/retrieval_router.py +51 -0
  43. orchestration/nodes/sentiment_analysis.py +49 -0
  44. orchestration/nodes/sentiment_hybrid.py +133 -0
  45. orchestration/nodes/tts_generation.py +72 -0
  46. orchestration/nodes/validation.py +61 -0
  47. orchestration/state.py +67 -0
  48. rag/__init__.py +1 -0
  49. rag/__pycache__/__init__.cpython-311.pyc +0 -0
  50. rag/__pycache__/cache_manager.cpython-311.pyc +0 -0
.gitignore ADDED
@@ -0,0 +1,2 @@
 
 
 
1
+ venv/
2
+ .env
README.md ADDED
@@ -0,0 +1,389 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Voice RAG Bot - AI Customer Support System
2
+
3
+ **Status**: ✅ **FULLY FUNCTIONAL** | Latest Update: May 30, 2026
4
+
5
+ ## 📋 Quick Overview
6
+
7
+ Voice RAG Bot is an intelligent AI customer support system that:
8
+ - 🎤 **Accepts voice input** via microphone or audio file upload
9
+ - 🧠 **Processes with LLM** (Groq) for intent detection and response generation
10
+ - 📚 **Retrieves relevant context** from knowledge base and customer history using vector search
11
+ - 😊 **Analyzes sentiment** to provide empathetic, sentiment-aware responses
12
+ - 🔊 **Generates speech output** via text-to-speech
13
+ - 📊 **Orchestrates 9-node workflow** using LangGraph
14
+
15
+ **Tech Stack**: Faster Whisper (STT) → LangGraph (9 nodes) → Groq LLM → Qdrant (Vector DB) → gTTS (TTS)
16
+
17
+ ---
18
+
19
+ ## 🚀 Quick Start (3 Steps)
20
+
21
+ ### Step 1: Prerequisites
22
+ - Docker Desktop running (for Qdrant)
23
+ - Python 3.11+
24
+ - Git (optional)
25
+
26
+ ### Step 2: Start Qdrant (Vector Database)
27
+ ```bash
28
+ docker run -p 6333:6333 qdrant/qdrant:latest
29
+ ```
30
+ Leave this running in background. ✅ System will auto-create collections.
31
+
32
+ ### Step 3: Start Voice RAG Bot
33
+ ```bash
34
+ cd d:\Voice RAG Bot\voice-rag-bot
35
+
36
+ # Activate virtual environment
37
+ .\venv\Scripts\Activate.ps1
38
+
39
+ # Run startup script (starts backend + Streamlit)
40
+ .\START_SYSTEM.ps1
41
+ ```
42
+
43
+ **Or start services manually:**
44
+
45
+ Terminal 1 (Backend):
46
+ ```bash
47
+ .\venv\Scripts\Activate.ps1
48
+ python backend/main.py
49
+ # Runs on http://localhost:8000
50
+ ```
51
+
52
+ Terminal 2 (Frontend):
53
+ ```bash
54
+ .\venv\Scripts\Activate.ps1
55
+ streamlit run frontend/streamlit_app.py
56
+ # Opens http://localhost:8501
57
+ ```
58
+
59
+ ---
60
+
61
+ ## 📖 Usage Guide
62
+
63
+ ### Via Streamlit Frontend (Recommended)
64
+
65
+ 1. **Open Browser**: http://localhost:8501
66
+ 2. **Enter Customer ID**: Unique identifier for customer (enables history tracking)
67
+ 3. **Choose Input Method**:
68
+ - **Option A**: Click 🎤 **Record** → Speak your message → **Process Audio**
69
+ - **Option B**: Upload audio file (MP3/WAV)
70
+ - **Option C**: Type message directly in text area
71
+ 4. **View Results** (automatically displayed):
72
+ - 📝 Generated Response
73
+ - 🎯 Detected Intent (+ confidence)
74
+ - 😊 Sentiment Analysis (+ confidence)
75
+ - 🏷️ Extracted Entities
76
+ - 📚 Knowledge Base context (if relevant)
77
+ - 📜 Customer History (if relevant)
78
+ - 🔊 Audio playback of response
79
+
80
+ ### Via REST API (For Integration)
81
+
82
+ **Process Audio:**
83
+ ```bash
84
+ curl -X POST "http://localhost:8000/process-audio?customer_id=CUST_001" \
85
+ -F "file=@voice_message.wav"
86
+ ```
87
+
88
+ **Process Text:**
89
+ ```bash
90
+ curl -X POST "http://localhost:8000/process-text" \
91
+ -d "user_input=I want to return my laptop&customer_id=CUST_001"
92
+ ```
93
+
94
+ **Health Check:**
95
+ ```bash
96
+ curl http://localhost:8000/health
97
+ ```
98
+
99
+ ---
100
+
101
+ ## 📊 System Architecture
102
+
103
+ ```
104
+ Input Layer
105
+ ├─ 🎤 Audio Input (Streamlit st.audio_input)
106
+ └─ 📝 Text Input (Streamlit text area)
107
+
108
+ Speech-to-Text
109
+ └─ Faster Whisper (base model, CPU inference)
110
+
111
+ Orchestration Layer (LangGraph - 9 Nodes)
112
+ 1. sentiment_analysis (DistilBERT)
113
+ 2. entity_extraction (BERT-base-NER)
114
+ 3. intent_detection (Groq LLM)
115
+ 4. retrieval_router (Qdrant search)
116
+ 5. context_builder (Format prompt)
117
+ 6. response_generation (Groq LLM)
118
+ 7. validation (Hallucination checks)
119
+ 8. memory_persistence (Qdrant upsert)
120
+ 9. tts_generation (gTTS)
121
+
122
+ Output Layer
123
+ ├─ 📝 Text Response
124
+ ├─ 😊 Sentiment-aware Tone
125
+ ├─ 🔊 Audio File (MP3)
126
+ └─ 🎯 Intent Classification
127
+ ```
128
+
129
+ ---
130
+
131
+ ## 🔧 Configuration
132
+
133
+ **Environment Variables** (`.env`):
134
+ ```
135
+ GROQ_API_KEY=your_groq_api_key_here
136
+ QDRANT_URL=http://localhost:6333
137
+ BACKEND_URL=http://localhost:8000
138
+ VECTOR_DIMENSION=1024
139
+ EMBEDDING_MODEL=BAAI/bge-m3
140
+ GROQ_MODEL=openai/gpt-oss-20b
141
+ KB_COLLECTION_NAME=knowledge_base
142
+ HISTORY_COLLECTION_NAME=customer_history
143
+ WHISPER_MODEL=base
144
+ ```
145
+
146
+ ---
147
+
148
+ ## 📝 Sample Data
149
+
150
+ Load sample data (4 KB documents + 4 customer history records):
151
+ ```bash
152
+ .\venv\Scripts\Activate.ps1
153
+ python data/load_sample_data.py
154
+ ```
155
+
156
+ **Included Data:**
157
+ - KB Documents: Return Policy, Shipping Info, Warranty Info, Account Management
158
+ - Customer History: 4 interactions (complaints, refunds, inquiries)
159
+
160
+ ---
161
+
162
+ ## 🧪 Testing
163
+
164
+ ### Quick Verification
165
+ ```bash
166
+ # Test complete pipeline (end-to-end)
167
+ .\venv\Scripts\Activate.ps1
168
+ python tests/test_full_integration.py
169
+ ```
170
+
171
+ **Expected Output**: ✅ FULL INTEGRATION TEST PASSED
172
+
173
+ ### Component Status
174
+ - ✅ All 9 nodes connected and working
175
+ - ✅ FastAPI endpoints operational
176
+ - ✅ Qdrant vector search functional
177
+ - ✅ LLM integration responding
178
+ - ✅ Audio processing working
179
+ - ✅ Sample data loadable
180
+
181
+ ---
182
+
183
+ ## 🎯 Intent Types Supported
184
+
185
+ | Intent | Example | Response |
186
+ |--------|---------|----------|
187
+ | `refund_request` | "I want to return this" | Empathetic, processing info |
188
+ | `order_status` | "Where's my order?" | Tracking info |
189
+ | `product_inquiry` | "Tell me about...?" | Product details |
190
+ | `billing_issue` | "My charge was wrong" | Empathetic, billing process |
191
+ | `warranty_claim` | "Product broke" | Warranty eligibility info |
192
+ | `account_management` | "Change my password" | Account instructions |
193
+ | `general_support` | "How do I...?" | General assistance |
194
+ | `complaint` | "This is unacceptable" | Empathetic, resolution steps |
195
+ | `other` | Misc questions | General help |
196
+
197
+ ---
198
+
199
+ ## 📊 Response Quality Factors
200
+
201
+ 1. **Sentiment Detection**: POSITIVE/NEGATIVE/NEUTRAL classification
202
+ 2. **Confidence Scores**: 0-1 for both intent and sentiment
203
+ 3. **Context Retrieval**: Up to 3 KB documents + customer history
204
+ 4. **Tone Matching**: Empathetic for negative, professional for neutral, friendly for positive
205
+ 5. **Hallucination Prevention**: Validation layer checks for accuracy
206
+
207
+ ---
208
+
209
+ ## 🐛 Troubleshooting
210
+
211
+ ### Issue: "Backend Not Connected"
212
+ **Solution**: Ensure FastAPI backend is running
213
+ ```bash
214
+ python backend/main.py
215
+ ```
216
+
217
+ ### Issue: "Qdrant Connection Error"
218
+ **Solution**: Start Qdrant Docker container
219
+ ```bash
220
+ docker run -p 6333:6333 qdrant/qdrant:latest
221
+ ```
222
+
223
+ ### Issue: "Groq API Error"
224
+ **Solution**: Check GROQ_API_KEY in `.env` file
225
+ ```bash
226
+ # Verify key is set
227
+ echo $env:GROQ_API_KEY
228
+ ```
229
+
230
+ ### Issue: "Audio Processing Timeout"
231
+ **Solution**: Processing may take 30-60 seconds for audio
232
+ - First run downloads models (Whisper, BGE-M3, DistilBERT)
233
+ - Subsequent runs are faster
234
+ - Ensure sufficient disk space (~5GB)
235
+
236
+ ### Issue: "Module Not Found"
237
+ **Solution**: Reinstall dependencies
238
+ ```bash
239
+ .\venv\Scripts\Activate.ps1
240
+ pip install -r requirements.txt
241
+ ```
242
+
243
+ ---
244
+
245
+ ## 📁 Project Structure
246
+
247
+ ```
248
+ d:\Voice RAG Bot\voice-rag-bot\
249
+ ├── backend/
250
+ │ ├── main.py FastAPI server
251
+ │ └── config.py Configuration
252
+ ├── frontend/
253
+ │ └── streamlit_app.py Web UI
254
+ ├── orchestration/
255
+ │ ├── langgraph_workflow.py 9-node workflow
256
+ │ ├── state.py State management
257
+ │ └── nodes/ Individual nodes
258
+ │ ├── sentiment_analysis.py
259
+ │ ├── entity_extraction.py
260
+ │ ├── intent_detection.py
261
+ │ ├── retrieval_router.py
262
+ │ ├── context_builder.py
263
+ │ ├── response_generation.py
264
+ │ ├── validation.py
265
+ │ ├── memory_persistence.py
266
+ │ └── tts_generation.py
267
+ ├── rag/
268
+ │ ├── qdrant_manager.py Vector DB client
269
+ │ └── embedding_manager.py BGE-M3 embeddings
270
+ ├── data/
271
+ │ ├── load_sample_data.py Sample data loader
272
+ │ └── audio_output/ Generated audio files
273
+ ├── tests/
274
+ │ └── test_full_integration.py End-to-end test
275
+ ├── .env Configuration
276
+ ├── requirements.txt Dependencies
277
+ ├── START_SYSTEM.ps1 Quick start script
278
+ └── venv/ Python environment
279
+ ```
280
+
281
+ ---
282
+
283
+ ## 🔄 Workflow Execution (Behind the Scenes)
284
+
285
+ 1. **sentiment_analysis**: Input → DistilBERT → POSITIVE/NEGATIVE/NEUTRAL
286
+ 2. **entity_extraction**: Input → BERT-NER → Extract names, locations, etc.
287
+ 3. **intent_detection**: Input → Groq LLM → 9-intent classification
288
+ 4. **retrieval_router**: Intent → Qdrant search → 3 KB docs + customer history
289
+ 5. **context_builder**: Format contexts → Unified prompt
290
+ 6. **response_generation**: Prompt → Groq LLM → Response text
291
+ 7. **validation**: Check hallucinations → Retry if needed
292
+ 8. **memory_persistence**: Embed response → Upsert to Qdrant
293
+ 9. **tts_generation**: Response text → gTTS → MP3 audio file
294
+
295
+ ---
296
+
297
+ ## 📊 Performance Metrics (Approximate)
298
+
299
+ | Component | Time | Notes |
300
+ |-----------|------|-------|
301
+ | STT (Audio → Text) | 5-15s | Depends on audio length |
302
+ | Sentiment Analysis | 0.5s | DistilBERT inference |
303
+ | Entity Extraction | 0.5s | BERT-NER inference |
304
+ | Intent Detection | 1-2s | Groq API call |
305
+ | KB Search | 0.2s | Qdrant vector search |
306
+ | Response Generation | 2-5s | Groq streaming |
307
+ | Validation | 0.5s | Local checks |
308
+ | TTS Generation | 2-5s | gTTS processing |
309
+ | **Total End-to-End** | **12-35s** | First run slower (model loading) |
310
+
311
+ ---
312
+
313
+ ## 💡 Tips & Tricks
314
+
315
+ ### Faster Processing
316
+ - Use text input instead of audio (skips STT)
317
+ - System caches models after first run
318
+ - Keep audio messages under 30 seconds
319
+
320
+ ### Better Responses
321
+ - Use clear, grammatically correct input
322
+ - Provide context ("purchased last week" vs "bought before")
323
+ - Specify what you need (return, refund, replacement)
324
+
325
+ ### Debugging
326
+ - Check `backend/main.py` logs for errors
327
+ - View Qdrant collections: http://localhost:6333/api/swagger/index.html
328
+ - Monitor Streamlit server in terminal for issues
329
+
330
+ ---
331
+
332
+ ## 🚀 Next Steps
333
+
334
+ 1. **Load Sample Data**: `python data/load_sample_data.py`
335
+ 2. **Test with Demo Scenarios**: Use Streamlit to test various intents
336
+ 3. **Customize KB Documents**: Add your own documents to Qdrant
337
+ 4. **Fine-tune Prompts**: Edit prompts in `prompts/` directory
338
+ 5. **Production Deployment**: Add authentication, rate limiting, monitoring
339
+
340
+ ---
341
+
342
+ ## 📞 Support & References
343
+
344
+ **Documentation Files:**
345
+ - `data/DATA_REQUIREMENTS.md` - Data schema documentation
346
+ - `.env` - Environment configuration
347
+
348
+ **API Endpoints:**
349
+ - `POST /process-audio` - Audio input endpoint
350
+ - `POST /process-text` - Text input endpoint
351
+ - `GET /health` - Health check
352
+
353
+ **Backend Logs:**
354
+ - Location: Console output when running `python backend/main.py`
355
+ - Check for errors, model loading, API calls
356
+
357
+ ---
358
+
359
+ ## 📝 License & Attribution
360
+
361
+ **Components**:
362
+ - **Groq LLM**: Free tier, gpt-oss-20b model
363
+ - **Faster Whisper**: OpenAI (MIT License)
364
+ - **LangGraph**: LangChain (Open Source)
365
+ - **Qdrant**: Open source vector database
366
+ - **BGE-M3**: BAAI embeddings model
367
+ - **DistilBERT**: Hugging Face transformers
368
+ - **gTTS**: Google Text-to-Speech
369
+
370
+ ---
371
+
372
+ ## ✅ Verification Checklist
373
+
374
+ Before considering system "ready for production":
375
+
376
+ - [ ] Backend running on http://localhost:8000
377
+ - [ ] Qdrant running on http://localhost:6333
378
+ - [ ] Streamlit frontend accessible at http://localhost:8501
379
+ - [ ] Sample data loaded (`python data/load_sample_data.py`)
380
+ - [ ] Integration test passing (`python tests/test_full_integration.py`)
381
+ - [ ] Audio input working (record or upload)
382
+ - [ ] All 9 nodes executing (check logs)
383
+ - [ ] Response generation working
384
+ - [ ] Audio playback working
385
+ - [ ] History tracking working (multiple messages same customer)
386
+
387
+ ---
388
+
389
+ **Built with ❤️ | Last Updated: May 30, 2026**
START_SYSTEM.ps1 ADDED
@@ -0,0 +1,64 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Voice RAG Bot - System Startup Script
2
+ # Starts FastAPI backend and Streamlit frontend
3
+
4
+ Write-Host "=================================="
5
+ Write-Host "Voice RAG Bot - System Startup"
6
+ Write-Host "=================================="
7
+ Write-Host ""
8
+
9
+ # Check if venv exists
10
+ if (-not (Test-Path "venv\Scripts\Activate.ps1")) {
11
+ Write-Host "ERROR: Virtual environment not found!"
12
+ Write-Host "Please run: python -m venv venv"
13
+ exit 1
14
+ }
15
+
16
+ # Activate venv
17
+ Write-Host "[1/3] Activating virtual environment..."
18
+ & .\venv\Scripts\Activate.ps1
19
+
20
+ # Check if Qdrant is running
21
+ Write-Host "[2/3] Checking Qdrant connection..."
22
+ try {
23
+ $response = Invoke-WebRequest -Uri "http://localhost:6333/health" -UseBasicParsing -TimeoutSec 2
24
+ if ($response.StatusCode -eq 200) {
25
+ Write-Host "✅ Qdrant is running on localhost:6333"
26
+ }
27
+ } catch {
28
+ Write-Host "⚠️ WARNING: Cannot connect to Qdrant on localhost:6333"
29
+ Write-Host " Make sure Docker is running and Qdrant container is active"
30
+ Write-Host " Run: docker run -p 6333:6333 qdrant/qdrant:latest"
31
+ }
32
+
33
+ Write-Host ""
34
+ Write-Host "[3/3] Starting services..."
35
+ Write-Host ""
36
+
37
+ # Start backend in a separate process
38
+ Write-Host "Starting FastAPI backend on http://localhost:8000"
39
+ $backendProcess = Start-Process -NoNewWindow -FilePath "python" -ArgumentList "backend/main.py" -PassThru
40
+ Start-Sleep -Seconds 3
41
+
42
+ # Start Streamlit
43
+ Write-Host "Starting Streamlit frontend on http://localhost:8501"
44
+ Write-Host ""
45
+ Write-Host "=================================="
46
+ Write-Host "Services started successfully!"
47
+ Write-Host "=================================="
48
+ Write-Host ""
49
+ Write-Host "Frontend URL: http://localhost:8501"
50
+ Write-Host "Backend API: http://localhost:8000"
51
+ Write-Host ""
52
+ Write-Host "Backend PID: $($backendProcess.Id)"
53
+ Write-Host ""
54
+ Write-Host "To stop the backend, run: Stop-Process -Id $($backendProcess.Id)"
55
+ Write-Host ""
56
+
57
+ # Start Streamlit
58
+ python -m streamlit run frontend/streamlit_app.py
59
+
60
+ # Cleanup
61
+ Write-Host ""
62
+ Write-Host "Stopping backend..."
63
+ Stop-Process -Id $backendProcess.Id -Force
64
+ Write-Host "Shutdown complete."
backend/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Voice RAG Bot Backend Package"""
backend/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (202 Bytes). View file
 
backend/__pycache__/config.cpython-311.pyc ADDED
Binary file (2.91 kB). View file
 
backend/__pycache__/main.cpython-311.pyc ADDED
Binary file (19.2 kB). View file
 
backend/__pycache__/voice_bot_controller.cpython-311.pyc ADDED
Binary file (7.9 kB). View file
 
backend/config.py ADDED
@@ -0,0 +1,68 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Central Configuration Management using Pydantic Settings
3
+ Loads environment variables from .env file
4
+ """
5
+
6
+ from pydantic_settings import BaseSettings
7
+ from typing import Optional
8
+ from pathlib import Path
9
+
10
+
11
+ class Settings(BaseSettings):
12
+ """Application configuration loaded from environment variables"""
13
+
14
+ # Groq LLM Configuration
15
+ groq_api_key: str
16
+ groq_model: str = "llama-3.3-70b-versatile"
17
+ groq_temperature: float = 0.7
18
+ groq_max_tokens: int = 1024
19
+
20
+ # Qdrant Vector Database Configuration
21
+ qdrant_url: str = "http://localhost:6333"
22
+ qdrant_api_key: Optional[str] = None # Optional for local Docker setup
23
+
24
+ # Embedding Model Configuration
25
+ embedding_model: str = "BAAI/bge-m3"
26
+ embedding_batch_size: int = 32
27
+
28
+ # Collection Names
29
+ kb_collection_name: str = "knowledge_base"
30
+ history_collection_name: str = "customer_history"
31
+
32
+ # Vector Dimensions (BGE-M3 uses 1024 dimensions)
33
+ vector_dimension: int = 1024
34
+
35
+ # Model Configuration for NLP Tasks
36
+ sentiment_model: str = "distilbert-base-uncased-finetuned-sst-2-english"
37
+
38
+ # Application Configuration
39
+ app_name: str = "Voice RAG Bot"
40
+ app_version: str = "1.0.0"
41
+ debug_mode: bool = False
42
+
43
+ # Conversation Memory
44
+ max_conversation_history: int = 10
45
+ summary_interval: int = 5 # Generate summary every 5 turns
46
+
47
+ # Audio Configuration
48
+ sample_rate: int = 16000 # 16kHz for Whisper
49
+ audio_format: str = "wav"
50
+
51
+ class Config:
52
+ """Pydantic config for reading from .env file"""
53
+ env_file = str(Path(__file__).parent.parent / ".env")
54
+ case_sensitive = False
55
+ extra = "ignore" # Ignore unknown fields from .env
56
+
57
+ def __repr__(self) -> str:
58
+ """String representation (hides API keys)"""
59
+ return (
60
+ f"Settings("
61
+ f"groq_model={self.groq_model}, "
62
+ f"qdrant_url={self.qdrant_url}, "
63
+ f"embedding_model={self.embedding_model})"
64
+ )
65
+
66
+
67
+ # Global settings instance
68
+ settings = Settings()
backend/main.py ADDED
@@ -0,0 +1,241 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ FastAPI Backend for Voice RAG Bot
3
+ Handles audio input, STT conversion, workflow orchestration, and response generation
4
+ """
5
+
6
+ import logging
7
+ import asyncio
8
+ import sys
9
+ from pathlib import Path
10
+ from typing import Optional
11
+ from io import BytesIO
12
+
13
+ # Add project root to path for imports
14
+ sys.path.insert(0, str(Path(__file__).parent.parent))
15
+
16
+ from fastapi import FastAPI, UploadFile, File, HTTPException
17
+ from fastapi.middleware.cors import CORSMiddleware
18
+ from pydantic import BaseModel
19
+ import uvicorn
20
+
21
+ # Import configuration
22
+ from backend.config import settings
23
+
24
+ # Import workflow
25
+ from orchestration.langgraph_workflow import run_workflow
26
+ from orchestration.latency_tracker import get_tracker, reset_tracker
27
+
28
+ # Import STT (Faster Whisper)
29
+ from faster_whisper import WhisperModel
30
+
31
+ # Configure logging
32
+ logging.basicConfig(level=logging.INFO)
33
+ logger = logging.getLogger(__name__)
34
+
35
+ # ============================================================================
36
+ # MODELS
37
+ # ============================================================================
38
+
39
+ class ProcessAudioResponse(BaseModel):
40
+ """Response model for audio processing"""
41
+ response_text: str
42
+ audio_path: Optional[str]
43
+ intent: dict
44
+ sentiment: dict
45
+ entities: Optional[dict]
46
+ kb_context: str
47
+ history_context: str
48
+
49
+
50
+ class HealthResponse(BaseModel):
51
+ """Health check response"""
52
+ status: str
53
+ llm_model: str
54
+ qdrant_url: str
55
+ whisper_model: str
56
+
57
+
58
+ # ============================================================================
59
+ # FASTAPI APP INITIALIZATION
60
+ # ============================================================================
61
+
62
+ app = FastAPI(
63
+ title="Voice RAG Bot Backend",
64
+ description="AI-powered customer service bot with RAG and voice interface",
65
+ version="1.0.0"
66
+ )
67
+
68
+ # Add CORS middleware for frontend communication
69
+ app.add_middleware(
70
+ CORSMiddleware,
71
+ allow_origins=["*"],
72
+ allow_credentials=True,
73
+ allow_methods=["*"],
74
+ allow_headers=["*"],
75
+ )
76
+
77
+ # ============================================================================
78
+ # GLOBAL STATE
79
+ # ============================================================================
80
+
81
+ whisper_model = WhisperModel("base", device="cpu", compute_type="int8")
82
+
83
+ def extract_audio_content(audio_bytes: bytes) -> str:
84
+ try:
85
+ audio_file = BytesIO(audio_bytes)
86
+ segments, _ = whisper_model.transcribe(audio_file, language="en")
87
+ transcribed_text = " ".join([segment.text for segment in segments])
88
+
89
+ if not transcribed_text.strip():
90
+ return "No speech detected"
91
+
92
+ tracker = get_tracker()
93
+ tracker.start("whisper_stt")
94
+ tracker.end("whisper_stt")
95
+ return transcribed_text
96
+
97
+ except Exception as e:
98
+ logger.error(f"STT Error: {str(e)}")
99
+ raise HTTPException(status_code=400, detail=f"STT failed: {str(e)}")
100
+
101
+
102
+ async def run_workflow_async(user_input: str, customer_id: str) -> dict:
103
+ try:
104
+ return await run_workflow(user_input, customer_id)
105
+ except Exception as e:
106
+ logger.error(f"Workflow Error: {str(e)}")
107
+ raise HTTPException(status_code=500, detail=f"Workflow failed: {str(e)}")
108
+
109
+
110
+ @app.get("/health", response_model=HealthResponse)
111
+ async def health_check():
112
+ return {
113
+ "status": "healthy",
114
+ "llm_model": settings.groq_model,
115
+ "qdrant_url": settings.qdrant_url,
116
+ "whisper_model": "base"
117
+ }
118
+
119
+
120
+ @app.post("/process-audio", response_model=ProcessAudioResponse)
121
+ async def process_audio(
122
+ file: UploadFile = File(...),
123
+ customer_id: str = "DEFAULT_CUSTOMER"
124
+ ):
125
+ try:
126
+ reset_tracker()
127
+ tracker = get_tracker()
128
+ tracker.start_total()
129
+
130
+ audio_bytes = await file.read()
131
+ user_input = extract_audio_content(audio_bytes)
132
+ final_state = await run_workflow_async(user_input, customer_id)
133
+
134
+ response = ProcessAudioResponse(
135
+ response_text=final_state.get("response", ""),
136
+ audio_path=final_state.get("final_audio_path"),
137
+ intent=final_state.get("intent", {}),
138
+ sentiment=final_state.get("sentiment", {}),
139
+ entities=final_state.get("entities"),
140
+ kb_context=final_state.get("kb_context", ""),
141
+ history_context=final_state.get("history_context", "")
142
+ )
143
+
144
+ return response
145
+
146
+ except HTTPException:
147
+ raise
148
+ except Exception as e:
149
+ logger.error(f"Unexpected error: {str(e)}", exc_info=True)
150
+ raise HTTPException(status_code=500, detail=f"Processing failed: {str(e)}")
151
+
152
+
153
+ @app.post("/process-text")
154
+ async def process_text(
155
+ user_input: str,
156
+ customer_id: str = "DEFAULT_CUSTOMER"
157
+ ):
158
+ try:
159
+ final_state = await run_workflow_async(user_input, customer_id)
160
+
161
+ return ProcessAudioResponse(
162
+ response_text=final_state.get("response", ""),
163
+ audio_path=final_state.get("final_audio_path"),
164
+ intent=final_state.get("intent", {}),
165
+ sentiment=final_state.get("sentiment", {}),
166
+ entities=final_state.get("entities"),
167
+ kb_context=final_state.get("kb_context", ""),
168
+ history_context=final_state.get("history_context", "")
169
+ )
170
+ except Exception as e:
171
+ logger.error(f"Error: {str(e)}", exc_info=True)
172
+ raise HTTPException(status_code=500, detail=f"Processing failed: {str(e)}")
173
+
174
+
175
+ @app.get("/")
176
+ async def root():
177
+ return {
178
+ "name": "Voice RAG Bot Backend",
179
+ "version": "1.0.0",
180
+ "endpoints": {
181
+ "health": "GET /health",
182
+ "process_audio": "POST /process-audio (requires audio file)",
183
+ "process_text": "POST /process-text (requires text input)",
184
+ "voice_bot_start": "POST /voice-bot/start",
185
+ "voice_bot_message": "POST /voice-bot/message",
186
+ "voice_bot_end": "POST /voice-bot/end",
187
+ "docs": "GET /docs (Swagger UI)"
188
+ }
189
+ }
190
+
191
+
192
+ from backend.voice_bot_controller import get_voice_bot_controller
193
+
194
+ @app.post("/voice-bot/start")
195
+ async def voice_bot_start(customer_id: str = "CUST_DEFAULT"):
196
+ try:
197
+ controller = get_voice_bot_controller()
198
+ return await controller.start_session(customer_id)
199
+ except Exception as e:
200
+ raise HTTPException(status_code=500, detail=str(e))
201
+
202
+ @app.post("/voice-bot/message")
203
+ async def voice_bot_message(user_message: str):
204
+ try:
205
+ controller = get_voice_bot_controller()
206
+ return await controller.process_user_message(user_message)
207
+ except Exception as e:
208
+ raise HTTPException(status_code=500, detail=str(e))
209
+
210
+ @app.post("/voice-bot/end")
211
+ async def voice_bot_end():
212
+ try:
213
+ controller = get_voice_bot_controller()
214
+ return await controller.end_session()
215
+ except Exception as e:
216
+ raise HTTPException(status_code=500, detail=str(e))
217
+
218
+ @app.get("/voice-bot/history")
219
+ async def voice_bot_history():
220
+ try:
221
+ controller = get_voice_bot_controller()
222
+ return {"history": controller.get_session_history()}
223
+ except Exception as e:
224
+ raise HTTPException(status_code=500, detail=str(e))
225
+
226
+ @app.on_event("startup")
227
+ async def startup_event():
228
+ logger.info(f"Backend started - Config: {settings.groq_model}")
229
+
230
+ @app.on_event("shutdown")
231
+ async def shutdown_event():
232
+ logger.info("Backend shutdown")
233
+
234
+ if __name__ == "__main__":
235
+ logger.info("Starting FastAPI server...")
236
+ uvicorn.run(
237
+ app,
238
+ host="0.0.0.0",
239
+ port=8000,
240
+ log_level="info"
241
+ )
backend/voice_bot_controller.py ADDED
@@ -0,0 +1,134 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Voice Bot Controller - Session management for conversations"""
2
+
3
+ from typing import Dict, Any
4
+ from datetime import datetime
5
+ import asyncio
6
+ from rag.session_manager import get_session_manager
7
+ from rag.cache_manager import get_cache_manager
8
+ from rag.tts_generator import get_tts_generator
9
+ from orchestration.langgraph_workflow import run_workflow
10
+
11
+
12
+ class VoiceBotController:
13
+ def __init__(self):
14
+ self.session_mgr = get_session_manager()
15
+ self.cache_mgr = get_cache_manager()
16
+ self.tts_gen = get_tts_generator()
17
+ self.current_session = None
18
+ self.customer_id = None
19
+ self.conversation_history = []
20
+
21
+ async def start_session(self, customer_id: str) -> Dict[str, Any]:
22
+ self.customer_id = customer_id
23
+ self.current_session = self.session_mgr.create_session(customer_id)
24
+ self.conversation_history = []
25
+
26
+ greeting = "Hello! How can I help you today?"
27
+ audio_path = self.tts_gen.generate_greeting(customer_id)
28
+
29
+ return {
30
+ "session_id": self.current_session,
31
+ "greeting": greeting,
32
+ "audio_path": audio_path,
33
+ "status": "listening"
34
+ }
35
+
36
+ async def process_user_message(self, user_message: str) -> Dict[str, Any]:
37
+ if not self.current_session:
38
+ return {"error": "No active session"}
39
+
40
+ self.session_mgr.add_message(self.current_session, "user", user_message)
41
+
42
+ cached_response = self.cache_mgr.get(self.customer_id, user_message)
43
+ if cached_response:
44
+ response_text = cached_response.get("response_text", "")
45
+ intent = cached_response.get("intent", {}).get("intent", "")
46
+ sentiment = cached_response.get("sentiment", {}).get("label", "")
47
+ else:
48
+ try:
49
+ result = await run_workflow(user_message, self.customer_id)
50
+ response_text = result.get("response", "")
51
+ intent = result.get("intent", {}).get("intent", "")
52
+ sentiment = result.get("sentiment", {}).get("label", "")
53
+ self.cache_mgr.set(self.customer_id, user_message, result)
54
+ except Exception as e:
55
+ response_text = f"Error processing request: {str(e)}"
56
+ intent = "error"
57
+ sentiment = "NEGATIVE"
58
+
59
+ self.session_mgr.add_message(self.current_session, "assistant", response_text, intent=intent, sentiment=sentiment)
60
+
61
+ follow_up = self._generate_follow_up(intent, sentiment)
62
+ should_continue = self._should_continue(intent, sentiment)
63
+ audio_path = self.tts_gen.generate_audio(response_text, self.customer_id, self.current_session)
64
+
65
+ return {
66
+ "response": response_text,
67
+ "intent": intent,
68
+ "sentiment": sentiment,
69
+ "follow_up": follow_up,
70
+ "audio_path": audio_path,
71
+ "status": "listening" if should_continue else "done",
72
+ "session_id": self.current_session
73
+ }
74
+
75
+ def _generate_follow_up(self, intent: str, sentiment: str) -> str:
76
+ """Generate context-aware follow-up question"""
77
+ follow_ups = {
78
+ "refund_request": "Would you like assistance with starting a return?",
79
+ "product_inquiry": "Do you need more details about this product?",
80
+ "billing_issue": "Can I help you further with your billing concern?",
81
+ "warranty_claim": "Would you like to proceed with the warranty claim?",
82
+ "order_status": "Is there anything else about your order?",
83
+ "complaint": "How can I make this right for you?",
84
+ "general_support": "Is there anything else I can help you with?"
85
+ }
86
+
87
+ # Choose follow-up based on intent
88
+ if intent in follow_ups:
89
+ return follow_ups[intent]
90
+
91
+ # Default follow-ups based on sentiment
92
+ if sentiment == "NEGATIVE":
93
+ return "I apologize for the inconvenience. Is there anything else I can help resolve?"
94
+ elif sentiment == "POSITIVE":
95
+ return "Great! Is there anything else I can help you with today?"
96
+ else:
97
+ return "Is there anything else I can help you with?"
98
+
99
+ def _should_continue(self, intent: str, sentiment: str) -> bool:
100
+ """Determine if conversation should continue"""
101
+ # Continue unless user explicitly ends or issue resolved
102
+ end_indicators = ["goodbye", "thanks", "that's it", "no thanks"]
103
+
104
+ # For now, always continue unless error
105
+ return intent != "error"
106
+
107
+ async def end_session(self) -> Dict[str, Any]:
108
+ if self.current_session:
109
+ self.session_mgr.close_session(self.current_session)
110
+ history = self.session_mgr.get_session_history(self.current_session)
111
+ return {
112
+ "session_id": self.current_session,
113
+ "status": "closed",
114
+ "message_count": len(history),
115
+ "farewell": "Thank you for contacting us. Goodbye!"
116
+ }
117
+ return {"error": "No active session"}
118
+
119
+ def get_session_history(self) -> list:
120
+ if not self.current_session:
121
+ return []
122
+ return self.session_mgr.get_session_history(self.current_session)
123
+
124
+
125
+ # Global controller instance
126
+ _voice_bot_controller = None
127
+
128
+
129
+ def get_voice_bot_controller() -> VoiceBotController:
130
+ """Get or create global voice bot controller"""
131
+ global _voice_bot_controller
132
+ if _voice_bot_controller is None:
133
+ _voice_bot_controller = VoiceBotController()
134
+ return _voice_bot_controller
data/latency_results.json ADDED
@@ -0,0 +1,142 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ [
2
+ {
3
+ "timestamp": "2026-06-02T20:29:02.891174",
4
+ "total_time_ms": 38587.39,
5
+ "modules": {
6
+ "sentiment_analysis": 0.0,
7
+ "entity_extraction": 1627.82,
8
+ "intent_detection": 881.28,
9
+ "retrieval_router": 10918.91,
10
+ "context_builder": 0.0,
11
+ "response_generation": 1045.21,
12
+ "validation": 1.73,
13
+ "memory_persistence": 743.59,
14
+ "tts_generation": 23313.93,
15
+ "workflow_orchestration": 38587.39
16
+ },
17
+ "breakdown_percent": {
18
+ "sentiment_analysis": 0.0,
19
+ "entity_extraction": 2.1,
20
+ "intent_detection": 1.1,
21
+ "retrieval_router": 14.2,
22
+ "context_builder": 0.0,
23
+ "response_generation": 1.4,
24
+ "validation": 0.0,
25
+ "memory_persistence": 1.0,
26
+ "tts_generation": 30.2,
27
+ "workflow_orchestration": 50.0
28
+ }
29
+ },
30
+ {
31
+ "timestamp": "2026-06-02T20:32:27.270235",
32
+ "total_time_ms": 18292.14,
33
+ "modules": {
34
+ "sentiment_analysis": 0.0,
35
+ "entity_extraction": 1851.01,
36
+ "intent_detection": 879.92,
37
+ "retrieval_router": 11678.87,
38
+ "context_builder": 0.0,
39
+ "response_generation": 935.09,
40
+ "validation": 0.0,
41
+ "memory_persistence": 518.29,
42
+ "tts_generation": 2400.3,
43
+ "workflow_orchestration": 18292.14
44
+ },
45
+ "breakdown_percent": {
46
+ "sentiment_analysis": 0.0,
47
+ "entity_extraction": 5.1,
48
+ "intent_detection": 2.4,
49
+ "retrieval_router": 31.9,
50
+ "context_builder": 0.0,
51
+ "response_generation": 2.6,
52
+ "validation": 0.0,
53
+ "memory_persistence": 1.4,
54
+ "tts_generation": 6.6,
55
+ "workflow_orchestration": 50.0
56
+ }
57
+ },
58
+ {
59
+ "timestamp": "2026-06-02T20:33:09.830661",
60
+ "total_time_ms": 6769.27,
61
+ "modules": {
62
+ "sentiment_analysis": 0.0,
63
+ "entity_extraction": 489.84,
64
+ "intent_detection": 670.14,
65
+ "retrieval_router": 2088.21,
66
+ "context_builder": 0.0,
67
+ "response_generation": 850.0,
68
+ "validation": 0.0,
69
+ "memory_persistence": 602.15,
70
+ "tts_generation": 2051.77,
71
+ "workflow_orchestration": 6769.27
72
+ },
73
+ "breakdown_percent": {
74
+ "sentiment_analysis": 0.0,
75
+ "entity_extraction": 3.6,
76
+ "intent_detection": 5.0,
77
+ "retrieval_router": 15.4,
78
+ "context_builder": 0.0,
79
+ "response_generation": 6.3,
80
+ "validation": 0.0,
81
+ "memory_persistence": 4.5,
82
+ "tts_generation": 15.2,
83
+ "workflow_orchestration": 50.1
84
+ }
85
+ },
86
+ {
87
+ "timestamp": "2026-06-02T20:48:50.209913",
88
+ "total_time_ms": 7611.41,
89
+ "modules": {
90
+ "sentiment_analysis": 0.0,
91
+ "entity_extraction": 521.71,
92
+ "intent_detection": 869.67,
93
+ "retrieval_router": 1815.81,
94
+ "context_builder": 0.0,
95
+ "response_generation": 881.13,
96
+ "validation": 0.62,
97
+ "memory_persistence": 569.26,
98
+ "tts_generation": 2904.53,
99
+ "workflow_orchestration": 7611.41
100
+ },
101
+ "breakdown_percent": {
102
+ "sentiment_analysis": 0.0,
103
+ "entity_extraction": 3.4,
104
+ "intent_detection": 5.7,
105
+ "retrieval_router": 12.0,
106
+ "context_builder": 0.0,
107
+ "response_generation": 5.8,
108
+ "validation": 0.0,
109
+ "memory_persistence": 3.8,
110
+ "tts_generation": 19.1,
111
+ "workflow_orchestration": 50.2
112
+ }
113
+ },
114
+ {
115
+ "timestamp": "2026-06-02T20:50:03.048163",
116
+ "total_time_ms": 3904.49,
117
+ "modules": {
118
+ "sentiment_analysis": 0.0,
119
+ "entity_extraction": 451.61,
120
+ "intent_detection": 712.09,
121
+ "retrieval_router": 296.21,
122
+ "context_builder": 0.0,
123
+ "response_generation": 682.71,
124
+ "validation": 0.0,
125
+ "memory_persistence": 456.91,
126
+ "tts_generation": 1295.47,
127
+ "workflow_orchestration": 3904.49
128
+ },
129
+ "breakdown_percent": {
130
+ "sentiment_analysis": 0.0,
131
+ "entity_extraction": 5.8,
132
+ "intent_detection": 9.1,
133
+ "retrieval_router": 3.8,
134
+ "context_builder": 0.0,
135
+ "response_generation": 8.8,
136
+ "validation": 0.0,
137
+ "memory_persistence": 5.9,
138
+ "tts_generation": 16.6,
139
+ "workflow_orchestration": 50.1
140
+ }
141
+ }
142
+ ]
data/load_sample_data.py ADDED
@@ -0,0 +1,201 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Step 13: Load Sample Data into Qdrant
3
+ Creates sample documents for knowledge base and customer history
4
+ """
5
+ import sys
6
+ from pathlib import Path
7
+
8
+ # Add project root to path
9
+ project_root = Path(__file__).parent.parent
10
+ sys.path.insert(0, str(project_root))
11
+
12
+ import asyncio
13
+ import json
14
+ import logging
15
+ from datetime import datetime
16
+
17
+ logging.basicConfig(level=logging.INFO)
18
+ logger = logging.getLogger(__name__)
19
+
20
+ print("\n" + "="*80)
21
+ print("📚 LOADING SAMPLE DATA INTO QDRANT")
22
+ print("="*80)
23
+
24
+ # ============================================================================
25
+ # SAMPLE DATA
26
+ # ============================================================================
27
+
28
+ # Knowledge Base Documents (Company Policies, FAQs)
29
+ KB_DOCUMENTS = [
30
+ {
31
+ "id": "kb_001",
32
+ "title": "Return Policy",
33
+ "content": """
34
+ Return Policy: Customers can return unopened products within 30 days of purchase
35
+ for a full refund. Items must be in original condition with all packaging and accessories.
36
+ Refunds are processed within 5-7 business days. Shipping costs are non-refundable unless
37
+ the return is due to our error. For defective items, we offer replacements immediately.
38
+ """
39
+ },
40
+ {
41
+ "id": "kb_002",
42
+ "title": "Shipping Information",
43
+ "content": """
44
+ Shipping Options: We offer standard shipping (5-7 days), express shipping (2-3 days),
45
+ and overnight shipping. Standard shipping is free for orders over $50. Tracking
46
+ information is provided via email. All orders are insured. We ship to most countries
47
+ worldwide. International orders may have customs delays.
48
+ """
49
+ },
50
+ {
51
+ "id": "kb_003",
52
+ "title": "Product Warranty",
53
+ "content": """
54
+ Warranty Coverage: All electronics come with a 1-year manufacturer's warranty covering
55
+ defects in materials and workmanship. Warranty does not cover physical damage, water damage,
56
+ or normal wear. Warranty service is available through our support team or authorized
57
+ service centers. Extended warranty options are available for 2 or 3 years.
58
+ """
59
+ },
60
+ {
61
+ "id": "kb_004",
62
+ "title": "Account Management",
63
+ "content": """
64
+ Account Features: Create an account to track orders, save preferences, and manage
65
+ payment methods. Password requirements: minimum 8 characters with upper/lowercase,
66
+ numbers, and symbols. Two-factor authentication available for security. Account
67
+ information can be updated anytime in settings. Contact support to delete account.
68
+ """
69
+ }
70
+ ]
71
+
72
+ # Customer History Records (Previous Interactions)
73
+ CUSTOMER_HISTORY = [
74
+ {
75
+ "customer_id": "CUST_001",
76
+ "interaction_type": "complaint",
77
+ "text": "Customer complained about slow shipping on previous order. Resolution: expedited reshipment provided."
78
+ },
79
+ {
80
+ "customer_id": "CUST_001",
81
+ "interaction_type": "purchase",
82
+ "text": "Purchased laptop model XPS-15 on 2025-11-20. Status: delivered. Customer satisfied."
83
+ },
84
+ {
85
+ "customer_id": "CUST_002",
86
+ "interaction_type": "inquiry",
87
+ "text": "Asked about warranty coverage for defective phone. Explained 1-year coverage policy. Customer satisfied."
88
+ },
89
+ {
90
+ "customer_id": "CUST_002",
91
+ "interaction_type": "refund_request",
92
+ "text": "Requested refund for unopened tablet within 30-day window. Refund approved and processed."
93
+ }
94
+ ]
95
+
96
+ # ============================================================================
97
+ # LOAD DATA
98
+ # ============================================================================
99
+
100
+ async def load_sample_data():
101
+ """Load sample data into Qdrant"""
102
+
103
+ try:
104
+ from rag.qdrant_manager import qdrant_manager
105
+ from rag.embedding_manager import embedding_manager
106
+
107
+ print("\n[1] Initializing managers...")
108
+ print(f" ✅ Qdrant Manager: {qdrant_manager}")
109
+ print(f" ✅ Embedding Manager: {embedding_manager}")
110
+
111
+ # ========== LOAD KNOWLEDGE BASE ==========
112
+ print("\n[2] Loading Knowledge Base documents...")
113
+ print(f" Documents to load: {len(KB_DOCUMENTS)}")
114
+
115
+ for doc in KB_DOCUMENTS:
116
+ try:
117
+ # Create document object for Qdrant
118
+ text = f"Title: {doc['title']}\n\n{doc['content']}"
119
+ logger.info(f"Adding KB doc: {doc['id']}")
120
+
121
+ # Add to knowledge base using qdrant_manager
122
+ qdrant_manager.add_to_kb(
123
+ documents=[{
124
+ "id": doc['id'],
125
+ "text": text,
126
+ "title": doc['title']
127
+ }]
128
+ )
129
+ print(f" ✅ {doc['title']} (ID: {doc['id']})")
130
+
131
+ except Exception as e:
132
+ print(f" ❌ Error loading {doc['id']}: {str(e)}")
133
+
134
+ print(" ✅ Knowledge Base loaded")
135
+
136
+ # ========== LOAD CUSTOMER HISTORY ==========
137
+ print("\n[3] Loading Customer History...")
138
+ print(f" Records to load: {len(CUSTOMER_HISTORY)}")
139
+
140
+ for record in CUSTOMER_HISTORY:
141
+ try:
142
+ logger.info(f"Adding history for {record['customer_id']}")
143
+
144
+ # Add to customer history using qdrant_manager
145
+ qdrant_manager.add_to_history(
146
+ customer_id=record['customer_id'],
147
+ text=record['text'],
148
+ interaction_type=record['interaction_type']
149
+ )
150
+ print(f" ✅ {record['customer_id']}: {record['interaction_type']} ({len(record['text'])} chars)")
151
+
152
+ except Exception as e:
153
+ print(f" ❌ Error loading history for {record['customer_id']}: {str(e)}")
154
+
155
+ print(" ✅ Customer History loaded")
156
+
157
+ # ========== VERIFY DATA ==========
158
+ print("\n[4] Verifying loaded data...")
159
+
160
+ try:
161
+ kb_info = qdrant_manager.get_collection_info("knowledge_base")
162
+ print(f" ✅ Knowledge Base:")
163
+ print(f" - Name: knowledge_base")
164
+ print(f" - Vector size: {kb_info.get('vector_size', 'N/A')}")
165
+ print(f" - Points count: {kb_info.get('points_count', 'N/A')}")
166
+
167
+ hist_info = qdrant_manager.get_collection_info("customer_history")
168
+ print(f" ✅ Customer History:")
169
+ print(f" - Name: customer_history")
170
+ print(f" - Vector size: {hist_info.get('vector_size', 'N/A')}")
171
+ print(f" - Points count: {hist_info.get('points_count', 'N/A')}")
172
+ except Exception as e:
173
+ print(f" ⚠️ Could not verify: {str(e)}")
174
+
175
+ print("\n" + "="*80)
176
+ print("✅ SAMPLE DATA LOADED SUCCESSFULLY")
177
+ print("="*80)
178
+ print("\n📊 Summary:")
179
+ print(f" • Knowledge Base: {len(KB_DOCUMENTS)} documents loaded")
180
+ print(f" • Customer History: {len(CUSTOMER_HISTORY)} records loaded")
181
+ print("\n🎯 Data is now available for:")
182
+ print(" • KB Search (retrieval_router node)")
183
+ print(" • Customer History Context (conditional retrieval)")
184
+ print(" • Personalized responses based on customer history")
185
+ print("="*80 + "\n")
186
+
187
+ return True
188
+
189
+ except Exception as e:
190
+ print(f"\n❌ ERROR: {str(e)}")
191
+ import traceback
192
+ traceback.print_exc()
193
+ return False
194
+
195
+
196
+ if __name__ == "__main__":
197
+ print("\n🚀 Starting sample data loader...\n")
198
+
199
+ success = asyncio.run(load_sample_data())
200
+
201
+ sys.exit(0 if success else 1)
data/sessions.db ADDED
Binary file (32.8 kB). View file
 
data/test_latency.json ADDED
@@ -0,0 +1,22 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ [
2
+ {
3
+ "timestamp": "2026-06-02T20:03:43.290341",
4
+ "total_time_ms": 51.15,
5
+ "modules": {
6
+ "test_module": 51.15
7
+ },
8
+ "breakdown_percent": {
9
+ "test_module": 100.0
10
+ }
11
+ },
12
+ {
13
+ "timestamp": "2026-06-02T20:04:07.232259",
14
+ "total_time_ms": 50.66,
15
+ "modules": {
16
+ "test_module": 50.66
17
+ },
18
+ "breakdown_percent": {
19
+ "test_module": 100.0
20
+ }
21
+ }
22
+ ]
frontend/__pycache__/streamlit_app.cpython-311.pyc ADDED
Binary file (32.2 kB). View file
 
frontend/streamlit_app.py ADDED
@@ -0,0 +1,739 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Streamlit Frontend - Voice RAG Bot
3
+ Interactive UI for audio input, processing, and response playback
4
+ """
5
+
6
+ import streamlit as st
7
+ import requests
8
+ import json
9
+ import os
10
+ import time
11
+ import base64
12
+ from pathlib import Path
13
+ from datetime import datetime
14
+ from typing import Optional, Dict, Any
15
+
16
+ # Page configuration
17
+ st.set_page_config(
18
+ page_title="Voice RAG Bot",
19
+ page_icon="🤖",
20
+ layout="wide",
21
+ initial_sidebar_state="expanded"
22
+ )
23
+
24
+ # Styling
25
+ st.markdown("""
26
+ <style>
27
+ .main {
28
+ padding: 0rem 1rem;
29
+ }
30
+ .stTabs [data-baseweb="tab-list"] button [data-testid="stMarkdownContainer"] p {
31
+ font-size: 1.1rem;
32
+ font-weight: 500;
33
+ }
34
+ .success-box {
35
+ padding: 1rem;
36
+ border-radius: 0.5rem;
37
+ background-color: #d4edda;
38
+ border: 1px solid #c3e6cb;
39
+ color: #155724;
40
+ }
41
+ .error-box {
42
+ padding: 1rem;
43
+ border-radius: 0.5rem;
44
+ background-color: #f8d7da;
45
+ border: 1px solid #f5c6cb;
46
+ color: #721c24;
47
+ }
48
+ .info-box {
49
+ padding: 1rem;
50
+ border-radius: 0.5rem;
51
+ background-color: #d1ecf1;
52
+ border: 1px solid #bee5eb;
53
+ color: #0c5460;
54
+ }
55
+ </style>
56
+ """, unsafe_allow_html=True)
57
+
58
+ # ============================================================================
59
+ # CONFIGURATION
60
+ # ============================================================================
61
+ BACKEND_URL = os.getenv("BACKEND_URL", "http://localhost:8000")
62
+ DATA_DIR = Path("data/audio_output")
63
+ DATA_DIR.mkdir(parents=True, exist_ok=True)
64
+
65
+ # Session state initialization
66
+ if "customer_id" not in st.session_state:
67
+ st.session_state.customer_id = "CUST_001"
68
+ if "processing" not in st.session_state:
69
+ st.session_state.processing = False
70
+ if "last_response" not in st.session_state:
71
+ st.session_state.last_response = None
72
+ if "history" not in st.session_state:
73
+ st.session_state.history = []
74
+ if "voice_bot_mode" not in st.session_state:
75
+ st.session_state.voice_bot_mode = False
76
+ if "voice_bot_session" not in st.session_state:
77
+ st.session_state.voice_bot_session = None
78
+ if "voice_bot_active" not in st.session_state:
79
+ st.session_state.voice_bot_active = False
80
+ if "voice_bot_messages" not in st.session_state:
81
+ st.session_state.voice_bot_messages = []
82
+ if "pending_audio" not in st.session_state:
83
+ st.session_state.pending_audio = None
84
+ if "processing_audio" not in st.session_state:
85
+ st.session_state.processing_audio = False
86
+ if "last_processed_audio_id" not in st.session_state:
87
+ st.session_state.last_processed_audio_id = None
88
+
89
+ # ============================================================================
90
+ # UTILITY FUNCTIONS
91
+ # ============================================================================
92
+
93
+ def check_backend_health() -> bool:
94
+ """Check if FastAPI backend is running"""
95
+ try:
96
+ response = requests.get(f"{BACKEND_URL}/health", timeout=5)
97
+ return response.status_code == 200
98
+ except requests.exceptions.ConnectionError:
99
+ return False
100
+ except requests.exceptions.Timeout:
101
+ return False
102
+ except Exception as e:
103
+ return False
104
+
105
+
106
+ def process_audio_file(audio_bytes: bytes, customer_id: str) -> Optional[Dict[str, Any]]:
107
+ """Send audio to backend for processing"""
108
+ try:
109
+ from io import BytesIO
110
+
111
+ # Send audio bytes directly as file to backend
112
+ with st.spinner("Processing audio... (may take 30-60 seconds)"):
113
+ # Create file-like object from bytes
114
+ audio_file = BytesIO(audio_bytes)
115
+ audio_file.name = f"audio_{datetime.now().strftime('%Y%m%d_%H%M%S')}.wav"
116
+
117
+ files = {"file": (audio_file.name, audio_file, "audio/wav")}
118
+ response = requests.post(
119
+ f"{BACKEND_URL}/process-audio",
120
+ files=files,
121
+ params={"customer_id": customer_id},
122
+ timeout=120
123
+ )
124
+
125
+ if response.status_code == 200:
126
+ result = response.json()
127
+ return result
128
+ else:
129
+ st.error(f"Backend error: {response.status_code}")
130
+ st.error(response.text)
131
+ return None
132
+
133
+ except requests.exceptions.Timeout:
134
+ st.error("Request timeout. Processing took too long.")
135
+ return None
136
+ except Exception as e:
137
+ st.error(f"Error processing audio: {str(e)}")
138
+ import traceback
139
+ st.error(traceback.format_exc())
140
+ return None
141
+
142
+
143
+ def process_text_input(user_input: str, customer_id: str) -> Optional[Dict[str, Any]]:
144
+ """Send text to backend for processing"""
145
+ try:
146
+ with st.spinner("Processing text... (may take 20-30 seconds)"):
147
+ response = requests.post(
148
+ f"{BACKEND_URL}/process-text",
149
+ params={
150
+ "user_input": user_input,
151
+ "customer_id": customer_id
152
+ },
153
+ timeout=120
154
+ )
155
+
156
+ if response.status_code == 200:
157
+ return response.json()
158
+ else:
159
+ st.error(f"Backend error: {response.status_code}")
160
+ st.error(response.text)
161
+ return None
162
+
163
+ except requests.exceptions.Timeout:
164
+ st.error("Request timeout. Processing took too long.")
165
+ return None
166
+ except Exception as e:
167
+ st.error(f"Error processing text: {str(e)}")
168
+ return None
169
+
170
+
171
+
172
+
173
+ def voice_bot_start(customer_id: str) -> Optional[Dict[str, Any]]:
174
+ """Start voice bot session"""
175
+ try:
176
+ response = requests.post(
177
+ f"{BACKEND_URL}/voice-bot/start",
178
+ params={"customer_id": customer_id},
179
+ timeout=60
180
+ )
181
+
182
+ if response.status_code == 200:
183
+ return response.json()
184
+ else:
185
+ st.error(f"Error starting voice bot: {response.status_code}")
186
+ return None
187
+ except Exception as e:
188
+ st.error(f"Error starting voice bot: {str(e)}")
189
+ return None
190
+
191
+
192
+ def voice_bot_process_message(user_message: str) -> Optional[Dict[str, Any]]:
193
+ """Process message in voice bot session"""
194
+ try:
195
+ response = requests.post(
196
+ f"{BACKEND_URL}/voice-bot/message",
197
+ params={"user_message": user_message},
198
+ timeout=120
199
+ )
200
+
201
+ if response.status_code == 200:
202
+ return response.json()
203
+ else:
204
+ st.error(f"Backend error {response.status_code}: {response.text}")
205
+ return None
206
+ except Exception as e:
207
+ st.error(f"Backend connection error: {str(e)}")
208
+ return None
209
+
210
+
211
+ def voice_bot_end() -> Optional[Dict[str, Any]]:
212
+ """End voice bot session"""
213
+ try:
214
+ response = requests.post(
215
+ f"{BACKEND_URL}/voice-bot/end",
216
+ timeout=10
217
+ )
218
+
219
+ if response.status_code == 200:
220
+ return response.json()
221
+ else:
222
+ return None
223
+ except Exception as e:
224
+ return None
225
+
226
+
227
+ def display_response_results(response: Dict[str, Any]):
228
+ """Display formatted response from backend"""
229
+
230
+ # Display latency metrics first if available
231
+ latency_metrics = response.get("latency_metrics")
232
+ if latency_metrics:
233
+ st.markdown("### ⏱️ Performance Metrics")
234
+
235
+ total_time = latency_metrics.get("total_time_ms", 0)
236
+ modules = latency_metrics.get("modules", {})
237
+ breakdown = latency_metrics.get("breakdown_percent", {})
238
+
239
+ # Display total time prominently
240
+ col1, col2, col3 = st.columns(3)
241
+ with col1:
242
+ st.metric("Total Processing Time", f"{total_time:.0f} ms", f"{total_time/1000:.2f}s")
243
+ with col2:
244
+ fastest = min(modules.items(), key=lambda x: x[1]) if modules else ("N/A", 0)
245
+ st.metric("Fastest Module", fastest[0].replace("_", " ").title(), f"{fastest[1]:.0f} ms")
246
+ with col3:
247
+ slowest = max(modules.items(), key=lambda x: x[1]) if modules else ("N/A", 0)
248
+ st.metric("Slowest Module", slowest[0].replace("_", " ").title(), f"{slowest[1]:.0f} ms")
249
+
250
+ # Module breakdown with progress bars
251
+ with st.expander("📊 Detailed Module Breakdown", expanded=True):
252
+ st.markdown("#### Time per Module")
253
+
254
+ # Sort modules by time
255
+ sorted_modules = sorted(modules.items(), key=lambda x: x[1], reverse=True)
256
+
257
+ for module_name, time_ms in sorted_modules:
258
+ percent = breakdown.get(module_name, 0)
259
+ display_name = module_name.replace("_", " ").title()
260
+
261
+ col1, col2, col3 = st.columns([3, 1, 1])
262
+ with col1:
263
+ st.write(f"**{display_name}**")
264
+ with col2:
265
+ st.write(f"{time_ms:.2f} ms")
266
+ with col3:
267
+ st.write(f"{percent:.1f}%")
268
+
269
+ # Progress bar
270
+ st.progress(percent / 100)
271
+
272
+ st.markdown("---")
273
+
274
+ # Create tabs for different result sections
275
+ tabs = st.tabs([
276
+ "📝 Response",
277
+ "🎯 Intent",
278
+ "😊 Sentiment",
279
+ "🏷️ Entities",
280
+ "📚 Knowledge Base",
281
+ "📜 History",
282
+ "🔊 Audio"
283
+ ])
284
+
285
+ # Tab 1: Main Response
286
+ with tabs[0]:
287
+ st.markdown("### Generated Response")
288
+ st.info(response.get("response_text", "No response generated"))
289
+
290
+ # Save to history
291
+ st.session_state.history.append({
292
+ "timestamp": datetime.now().isoformat(),
293
+ "customer_id": st.session_state.customer_id,
294
+ "response": response.get("response_text", ""),
295
+ "intent": response.get("intent", {}).get("intent", ""),
296
+ "sentiment": response.get("sentiment", {}).get("label", "")
297
+ })
298
+
299
+ # Tab 2: Intent Detection
300
+ with tabs[1]:
301
+ intent_data = response.get("intent", {})
302
+ col1, col2 = st.columns(2)
303
+
304
+ with col1:
305
+ st.metric("Detected Intent", intent_data.get("intent", "N/A"))
306
+ with col2:
307
+ confidence = intent_data.get("confidence", 0)
308
+ st.metric("Confidence", f"{confidence:.1%}")
309
+
310
+ # Intent explanation
311
+ intent_types = {
312
+ "refund_request": "Customer wants to return/refund a product",
313
+ "order_status": "Customer inquiring about order tracking",
314
+ "product_inquiry": "Customer asking product details",
315
+ "billing_issue": "Customer has billing/payment problems",
316
+ "warranty_claim": "Customer filing warranty claim",
317
+ "account_management": "Account settings/updates",
318
+ "general_support": "General support request",
319
+ "complaint": "Customer complaint",
320
+ "other": "Other inquiry"
321
+ }
322
+
323
+ intent = intent_data.get("intent", "")
324
+ if intent in intent_types:
325
+ st.write(f"**Category**: {intent_types[intent]}")
326
+
327
+ # Tab 3: Sentiment Analysis
328
+ with tabs[2]:
329
+ sentiment_data = response.get("sentiment", {})
330
+ label = sentiment_data.get("label", "NEUTRAL")
331
+ score = sentiment_data.get("score", 0)
332
+
333
+ # Color-coded sentiment display
334
+ if label == "POSITIVE":
335
+ color = "🟢"
336
+ tone = "Positive"
337
+ elif label == "NEGATIVE":
338
+ color = "🔴"
339
+ tone = "Negative"
340
+ else:
341
+ color = "🟡"
342
+ tone = "Neutral"
343
+
344
+ col1, col2 = st.columns(2)
345
+ with col1:
346
+ st.metric("Sentiment", f"{color} {tone}")
347
+ with col2:
348
+ st.metric("Confidence", f"{score:.1%}")
349
+
350
+ st.write(f"**Interpretation**: Response was generated with {tone.lower()}-{tone.lower()} tone")
351
+
352
+ # Tab 4: Entities
353
+ with tabs[3]:
354
+ entities = response.get("entities", {})
355
+ if entities:
356
+ for entity_type, values in entities.items():
357
+ if values:
358
+ st.write(f"**{entity_type.upper()}**")
359
+ for entity in values:
360
+ st.write(f" • {entity}")
361
+ else:
362
+ st.info("No entities extracted from input")
363
+
364
+ # Tab 5: Knowledge Base Context
365
+ with tabs[4]:
366
+ kb_context = response.get("kb_context", "")
367
+ if kb_context and isinstance(kb_context, str) and kb_context.strip() != "No relevant policies found.":
368
+ st.write("**Retrieved Documents:**")
369
+ st.write(kb_context)
370
+ else:
371
+ st.info("No KB documents retrieved")
372
+
373
+ # Tab 6: Customer History
374
+ with tabs[5]:
375
+ history_context = response.get("history_context", "")
376
+ if history_context and isinstance(history_context, str) and history_context.strip() != "No customer history available.":
377
+ st.write("**Customer History:**")
378
+ st.write(history_context)
379
+ else:
380
+ st.info("No customer history found")
381
+
382
+ # Tab 7: Audio Output
383
+ with tabs[6]:
384
+ audio_path = response.get("audio_path", "")
385
+ if audio_path and audio_path.strip():
386
+ try:
387
+ # Normalize path
388
+ audio_file_path = Path(audio_path.replace("\\", "/"))
389
+ if not audio_file_path.is_absolute():
390
+ project_root = Path(__file__).parent.parent
391
+ audio_file_path = project_root / audio_file_path
392
+
393
+ if audio_file_path.exists():
394
+ st.write(f"**Audio file**: {audio_path}")
395
+ with open(audio_file_path, "rb") as audio_file:
396
+ st.audio(audio_file, format="audio/mp3")
397
+ else:
398
+ st.warning(f"Audio file not found: {audio_file_path}")
399
+ except Exception as e:
400
+ st.error(f"Could not load audio file: {str(e)}")
401
+ else:
402
+ st.warning("No audio file generated")
403
+
404
+
405
+ # ============================================================================
406
+ # MAIN UI LAYOUT
407
+ # ============================================================================
408
+
409
+ # Header
410
+ st.title("🤖 Voice RAG Bot")
411
+ st.markdown("AI Customer Support with Voice Recognition and Retrieval-Augmented Generation")
412
+
413
+ # Sidebar
414
+ with st.sidebar:
415
+ st.header("⚙️ Configuration")
416
+
417
+ # Backend status with refresh
418
+ col1, col2 = st.columns([3, 1])
419
+ with col1:
420
+ st.write("**Backend Status**")
421
+ with col2:
422
+ if st.button("🔄", help="Refresh status", key="refresh_health"):
423
+ st.rerun()
424
+
425
+ backend_healthy = check_backend_health()
426
+ if backend_healthy:
427
+ st.success("✅ Backend Connected")
428
+ st.caption(f"URL: {BACKEND_URL}")
429
+ else:
430
+ st.error("❌ Backend Not Connected")
431
+ st.error(f"Cannot reach {BACKEND_URL}")
432
+ st.info("**To fix:**")
433
+ st.code("python -m uvicorn backend.main:app --reload --port 8000", language="bash")
434
+ st.info("**Or use startup script:**")
435
+ st.code(".\\START_SYSTEM.ps1", language="bash")
436
+
437
+ # Customer ID input
438
+ st.subheader("Customer Information")
439
+ customer_id = st.text_input(
440
+ "Customer ID",
441
+ value=st.session_state.customer_id,
442
+ help="Unique identifier for customer (used for history)"
443
+ )
444
+ st.session_state.customer_id = customer_id
445
+
446
+ st.divider()
447
+
448
+ # Model information
449
+ st.subheader("System Components")
450
+ st.write("**LLM**: Groq (gpt-oss-20b)")
451
+ st.write("**STT**: Faster Whisper (base)")
452
+ st.write("**Vector DB**: Qdrant")
453
+ st.write("**Embeddings**: BGE-M3 (1024-dim)")
454
+ st.write("**Sentiment**: DistilBERT")
455
+ st.write("**NER**: BERT-base-NER")
456
+
457
+ # Main content
458
+ st.divider()
459
+
460
+ # Voice Bot Mode Toggle
461
+ col1, col2, col3 = st.columns([1, 3, 1])
462
+ with col1:
463
+ voice_bot_enabled = st.toggle("🤖 Voice Bot Mode", value=st.session_state.voice_bot_mode, key="voice_bot_toggle")
464
+ st.session_state.voice_bot_mode = voice_bot_enabled
465
+
466
+ if voice_bot_enabled:
467
+ # Voice Bot Interface
468
+ st.markdown("### 🎙️ Voice Bot Assistant")
469
+
470
+ if not st.session_state.voice_bot_active:
471
+ # Start button
472
+ col1, col2, col3 = st.columns([1, 2, 1])
473
+ with col2:
474
+ if st.button("🎙️ Start Conversation", use_container_width=True, key="start_voice_bot"):
475
+ with st.spinner("Starting voice bot..."):
476
+ result = voice_bot_start(st.session_state.customer_id)
477
+ if result:
478
+ st.session_state.voice_bot_session = result.get("session_id")
479
+ st.session_state.voice_bot_active = True
480
+ greeting_audio = result.get("audio_path", "")
481
+ st.session_state.voice_bot_messages = [
482
+ {
483
+ "role": "assistant",
484
+ "content": result.get("greeting"),
485
+ "audio_path": greeting_audio
486
+ }
487
+ ]
488
+ st.rerun()
489
+
490
+ else:
491
+ # Conversation display
492
+ st.markdown("#### Conversation")
493
+
494
+ # Display conversation history
495
+ for msg in st.session_state.voice_bot_messages:
496
+ if msg["role"] == "assistant":
497
+ with st.chat_message("assistant", avatar="🤖"):
498
+ st.write(msg["content"])
499
+ # Play audio if available
500
+ audio_path = msg.get("audio_path", "")
501
+ if audio_path and audio_path.strip():
502
+ try:
503
+ # Normalize path and check in project root
504
+ audio_file_path = Path(audio_path.replace("\\", "/"))
505
+ if not audio_file_path.is_absolute():
506
+ project_root = Path(__file__).parent.parent
507
+ audio_file_path = project_root / audio_file_path
508
+
509
+ if audio_file_path.exists():
510
+ with open(audio_file_path, "rb") as audio_file:
511
+ audio_bytes = audio_file.read()
512
+ audio_b64 = base64.b64encode(audio_bytes).decode()
513
+ st.markdown(f"""
514
+ <audio autoplay controls style="width: 100%;">
515
+ <source src="data:audio/mpeg;base64,{audio_b64}" type="audio/mpeg">
516
+ </audio>
517
+ """, unsafe_allow_html=True)
518
+ else:
519
+ st.caption(f"⚠️ Audio file not found: {audio_file_path}")
520
+ except Exception as e:
521
+ st.caption(f"⚠️ Error loading audio: {str(e)}")
522
+ else:
523
+ with st.chat_message("user", avatar="👤"):
524
+ st.write(msg["content"])
525
+
526
+ # Voice conversation section
527
+ st.markdown("---")
528
+ st.markdown("#### 🎤 Record your message:")
529
+
530
+ # Voice input - Store audio in session state
531
+ audio_bytes = st.audio_input(
532
+ "Record your message",
533
+ label_visibility="collapsed",
534
+ key="voice_bot_audio_input"
535
+ )
536
+
537
+ # If new audio recorded, store it with unique ID
538
+ if audio_bytes:
539
+ audio_id = id(audio_bytes)
540
+ if audio_id != st.session_state.last_processed_audio_id:
541
+ st.session_state.pending_audio = audio_bytes
542
+ st.session_state.last_processed_audio_id = audio_id
543
+ st.session_state.processing_audio = True
544
+
545
+ # Process pending audio (happens on next render after audio is saved)
546
+ if st.session_state.pending_audio and st.session_state.processing_audio:
547
+ # Immediately mark as processing to prevent duplicate processing
548
+ st.session_state.processing_audio = False
549
+
550
+ st.info("🎤 Processing audio...")
551
+ try:
552
+ from io import BytesIO
553
+ from faster_whisper import WhisperModel
554
+
555
+ # Convert UploadedFile to bytes if needed
556
+ audio_data = st.session_state.pending_audio
557
+ if hasattr(audio_data, 'read'):
558
+ audio_data = audio_data.read()
559
+
560
+ st.info("Loading Whisper model...")
561
+ @st.cache_resource
562
+ def load_whisper():
563
+ return WhisperModel("base", device="cpu", compute_type="int8")
564
+
565
+ whisper = load_whisper()
566
+ st.success("✅ Whisper model loaded")
567
+
568
+ st.info("Transcribing audio...")
569
+ audio_file = BytesIO(audio_data)
570
+ segments, info = whisper.transcribe(audio_file, language="en")
571
+ transcribed_text = " ".join([segment.text for segment in segments])
572
+
573
+ if transcribed_text.strip():
574
+ st.success(f"✅ Transcribed: {transcribed_text}")
575
+
576
+ # Add user message
577
+ st.session_state.voice_bot_messages.append({
578
+ "role": "user",
579
+ "content": f"🎤 {transcribed_text}"
580
+ })
581
+
582
+ st.info("🤖 Sending to bot...")
583
+ result = voice_bot_process_message(transcribed_text)
584
+
585
+ if result:
586
+ response = result.get("response", "")
587
+ audio_path = result.get("audio_path", "")
588
+
589
+ if response:
590
+ st.success("✅ Bot responded")
591
+ # Add ONLY ONE bot response
592
+ st.session_state.voice_bot_messages.append({
593
+ "role": "assistant",
594
+ "content": response,
595
+ "audio_path": audio_path
596
+ })
597
+
598
+ # Clear pending audio immediately
599
+ st.session_state.pending_audio = None
600
+ st.session_state.processing_audio = False
601
+ else:
602
+ st.error("❌ Bot response is empty")
603
+ st.session_state.pending_audio = None
604
+ st.session_state.processing_audio = False
605
+ else:
606
+ st.error("❌ Backend returned None")
607
+ st.session_state.pending_audio = None
608
+ st.session_state.processing_audio = False
609
+ else:
610
+ st.warning("⚠️ No speech detected in audio")
611
+ st.session_state.pending_audio = None
612
+ st.session_state.processing_audio = False
613
+
614
+ except Exception as e:
615
+ st.error(f"❌ Error: {str(e)}")
616
+ st.session_state.pending_audio = None
617
+ st.session_state.processing_audio = False
618
+ import traceback
619
+ st.write(traceback.format_exc())
620
+
621
+ # End conversation button
622
+ st.markdown("---")
623
+ if st.button("🛑 End Conversation", use_container_width=True, key="end_voice_bot"):
624
+ with st.spinner("Ending session..."):
625
+ result = voice_bot_end()
626
+ st.session_state.voice_bot_active = False
627
+ st.session_state.voice_bot_messages = []
628
+ st.success("✅ Session ended. Thank you!")
629
+ st.rerun()
630
+
631
+ else:
632
+ # Regular Input Tabs
633
+ st.markdown("### 💬 Manual Input Mode")
634
+
635
+ # Tabs for input methods
636
+ input_tab1, input_tab2 = st.tabs(["🎤 Audio Input", "📝 Text Input"])
637
+
638
+ with input_tab1:
639
+ st.subheader("Upload or Record Audio")
640
+
641
+ col1, col2 = st.columns(2)
642
+
643
+ with col1:
644
+ st.write("**Option 1: Record Audio**")
645
+ audio_data = st.audio_input(
646
+ "Record your message",
647
+ label_visibility="collapsed",
648
+ key="audio_input"
649
+ )
650
+
651
+ if audio_data:
652
+ st.success("Audio recorded successfully!")
653
+ if st.button("🔄 Process Audio", key="process_audio_btn"):
654
+ response = process_audio_file(audio_data.getvalue(), st.session_state.customer_id)
655
+ if response:
656
+ st.session_state.last_response = response
657
+ st.success("✅ Processing complete!")
658
+ st.rerun()
659
+
660
+ with col2:
661
+ st.write("**Option 2: Upload Audio File**")
662
+ uploaded_file = st.file_uploader(
663
+ "Upload an MP3 or WAV file",
664
+ type=["mp3", "wav"],
665
+ label_visibility="collapsed"
666
+ )
667
+
668
+ if uploaded_file:
669
+ st.success(f"File uploaded: {uploaded_file.name}")
670
+ if st.button("🔄 Process Uploaded Audio", key="process_uploaded_btn"):
671
+ response = process_audio_file(uploaded_file.getvalue(), st.session_state.customer_id)
672
+ if response:
673
+ st.session_state.last_response = response
674
+ st.success("✅ Processing complete!")
675
+ st.rerun()
676
+
677
+ with input_tab2:
678
+ st.subheader("Enter Text Directly")
679
+
680
+ # Text area for input
681
+ user_input = st.text_area(
682
+ "Enter your message",
683
+ placeholder="E.g., 'I want to return my defective laptop purchased last week'",
684
+ height=100,
685
+ label_visibility="collapsed"
686
+ )
687
+
688
+ if user_input:
689
+ col1, col2, col3 = st.columns([1, 1, 2])
690
+
691
+ with col1:
692
+ if st.button("🚀 Process Text", use_container_width=True):
693
+ response = process_text_input(user_input, st.session_state.customer_id)
694
+ if response:
695
+ st.session_state.last_response = response
696
+ st.success("✅ Processing complete!")
697
+ st.rerun()
698
+
699
+ with col2:
700
+ if st.button("🔄 Clear", use_container_width=True):
701
+ st.rerun()
702
+
703
+ with col3:
704
+ st.caption("ℹ️ Processing may take 20-30 seconds")
705
+
706
+ # Display last response if available
707
+ st.divider()
708
+
709
+ if st.session_state.last_response:
710
+ st.subheader("📊 Latest Results")
711
+ display_response_results(st.session_state.last_response)
712
+
713
+ # Conversation history
714
+ st.divider()
715
+
716
+ with st.expander("📜 Conversation History"):
717
+ if st.session_state.history:
718
+ for i, record in enumerate(st.session_state.history, 1):
719
+ with st.container(border=True):
720
+ col1, col2, col3, col4 = st.columns(4)
721
+ with col1:
722
+ st.caption(f"Time: {record['timestamp'][:16]}")
723
+ with col2:
724
+ st.caption(f"Customer: {record['customer_id']}")
725
+ with col3:
726
+ st.caption(f"Intent: {record['intent']}")
727
+ with col4:
728
+ st.caption(f"Sentiment: {record['sentiment']}")
729
+ st.write(record['response'][:150] + "..." if len(record['response']) > 150 else record['response'])
730
+ else:
731
+ st.info("No conversation history yet")
732
+
733
+ # Footer
734
+ st.divider()
735
+ st.markdown("""
736
+ ---
737
+ **Voice RAG Bot** | Powered by Groq LLM, Qdrant Vector DB, and LangGraph Orchestration
738
+ For technical support, refer to the backend logs at `backend/main.py`
739
+ """)
orchestration/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Voice RAG Bot Orchestration Package"""
orchestration/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (214 Bytes). View file
 
orchestration/__pycache__/langgraph_workflow.cpython-311.pyc ADDED
Binary file (7.47 kB). View file
 
orchestration/__pycache__/latency_tracker.cpython-311.pyc ADDED
Binary file (6.98 kB). View file
 
orchestration/__pycache__/state.cpython-311.pyc ADDED
Binary file (2.82 kB). View file
 
orchestration/langgraph_workflow.py ADDED
@@ -0,0 +1,119 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """LangGraph Workflow - 9-node orchestration pipeline"""
2
+
3
+ from langgraph.graph import StateGraph, END, START
4
+ from orchestration.state import ConversationState
5
+ from typing import Any, Dict
6
+ import logging
7
+ from orchestration.latency_tracker import get_tracker, reset_tracker
8
+
9
+ # Import all nodes
10
+ from orchestration.nodes.sentiment_hybrid import sentiment_analysis_hybrid as sentiment_analysis_node
11
+ from orchestration.nodes.entity_extraction import entity_extraction_node
12
+ from orchestration.nodes.intent_detection import intent_detection_node
13
+ from orchestration.nodes.retrieval_router import retrieval_router_node
14
+ from orchestration.nodes.context_builder import context_builder_node
15
+ from orchestration.nodes.response_generation import response_generation_node
16
+ from orchestration.nodes.validation import validation_node
17
+ from orchestration.nodes.memory_persistence import memory_persistence_node
18
+ from orchestration.nodes.tts_generation import tts_generation_node
19
+
20
+ logger = logging.getLogger(__name__)
21
+
22
+ def build_workflow() -> StateGraph:
23
+ workflow = StateGraph(ConversationState)
24
+
25
+ workflow.add_node("sentiment_analysis", sentiment_analysis_node)
26
+ workflow.add_node("entity_extraction", entity_extraction_node)
27
+ workflow.add_node("intent_detection", intent_detection_node)
28
+ workflow.add_node("retrieval_router", retrieval_router_node)
29
+ workflow.add_node("context_builder", context_builder_node)
30
+ workflow.add_node("response_generation", response_generation_node)
31
+ workflow.add_node("validation", validation_node)
32
+ workflow.add_node("memory_persistence", memory_persistence_node)
33
+ workflow.add_node("tts_generation", tts_generation_node)
34
+
35
+ workflow.add_edge(START, "sentiment_analysis")
36
+ workflow.add_edge(START, "entity_extraction")
37
+ workflow.add_edge("sentiment_analysis", "intent_detection")
38
+ workflow.add_edge("entity_extraction", "intent_detection")
39
+ workflow.add_edge("intent_detection", "retrieval_router")
40
+ workflow.add_edge("retrieval_router", "context_builder")
41
+ workflow.add_edge("context_builder", "response_generation")
42
+ workflow.add_edge("response_generation", "validation")
43
+
44
+ def should_regenerate(state: ConversationState) -> str:
45
+ return "memory_persistence" if state.get("validation_passed", False) else "response_generation"
46
+
47
+ workflow.add_conditional_edges("validation", should_regenerate, {"memory_persistence": "memory_persistence", "response_generation": "response_generation"})
48
+ workflow.add_edge("memory_persistence", "tts_generation")
49
+ workflow.add_edge("tts_generation", END)
50
+
51
+ return workflow
52
+
53
+
54
+ # Compile the workflow
55
+ workflow = build_workflow()
56
+ compiled_workflow = workflow.compile()
57
+
58
+
59
+ async def run_workflow(user_input: str, customer_id: str) -> Dict[str, Any]:
60
+ """
61
+ Execute the complete workflow
62
+
63
+ Args:
64
+ user_input: Text from STT (user's speech converted to text)
65
+ customer_id: Unique customer identifier
66
+
67
+ Returns:
68
+ Complete state with response, audio path, and metadata
69
+ """
70
+
71
+ try:
72
+ # Reset and start tracking
73
+ reset_tracker()
74
+ tracker = get_tracker()
75
+ tracker.start_total()
76
+ tracker.start("workflow_orchestration")
77
+
78
+ # Initialize state
79
+ initial_state: ConversationState = {
80
+ "user_input": user_input,
81
+ "customer_id": customer_id,
82
+ "intent": {"intent": "unknown", "confidence": 0.0},
83
+ "sentiment": {"label": "NEUTRAL", "score": 0.5},
84
+ "entities": None,
85
+ "conversation_summary": "",
86
+ "kb_context": "",
87
+ "history_context": "",
88
+ "response": "",
89
+ "validation_passed": False,
90
+ "final_audio_path": None
91
+ }
92
+
93
+ # Run workflow
94
+ final_state = await compiled_workflow.ainvoke(initial_state)
95
+
96
+ tracker.end("workflow_orchestration")
97
+
98
+ # Save and print results
99
+ latency_results = tracker.save_to_file()
100
+ tracker.print_summary()
101
+
102
+ # Convert to regular dict and add latency info
103
+ result_dict = dict(final_state)
104
+ result_dict["latency_metrics"] = latency_results
105
+
106
+ logger.info(f"Total workflow time: {latency_results['total_time_ms']} ms")
107
+
108
+ return result_dict
109
+
110
+ except Exception as e:
111
+ logger.error(f"Workflow execution failed: {str(e)}")
112
+ logger.error(f"Error type: {type(e).__name__}")
113
+ import traceback
114
+ logger.error(traceback.format_exc())
115
+ raise
116
+
117
+
118
+ def get_workflow_graph():
119
+ return compiled_workflow.get_graph().draw_mermaid()
orchestration/latency_tracker.py ADDED
@@ -0,0 +1,123 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Latency Tracker - Track execution time for each module
3
+ """
4
+
5
+ import time
6
+ import json
7
+ from pathlib import Path
8
+ from typing import Dict, Any
9
+ from datetime import datetime
10
+
11
+
12
+ class LatencyTracker:
13
+ """Track latency for each processing module"""
14
+
15
+ def __init__(self):
16
+ self.timings: Dict[str, float] = {}
17
+ self.start_times: Dict[str, float] = {}
18
+ self.total_start = None
19
+
20
+ def start_total(self):
21
+ """Start tracking total execution time"""
22
+ self.total_start = time.time()
23
+
24
+ def start(self, module_name: str):
25
+ """Start timing a module"""
26
+ self.start_times[module_name] = time.time()
27
+
28
+ def end(self, module_name: str):
29
+ """End timing a module"""
30
+ if module_name in self.start_times:
31
+ elapsed = time.time() - self.start_times[module_name]
32
+ self.timings[module_name] = round(elapsed * 1000, 2) # Convert to ms
33
+ del self.start_times[module_name]
34
+
35
+ def get_results(self) -> Dict[str, Any]:
36
+ """Get all timing results"""
37
+ total_time = round((time.time() - self.total_start) * 1000, 2) if self.total_start else 0
38
+
39
+ return {
40
+ "timestamp": datetime.now().isoformat(),
41
+ "total_time_ms": total_time,
42
+ "modules": self.timings,
43
+ "breakdown_percent": self._calculate_percentages()
44
+ }
45
+
46
+ def _calculate_percentages(self) -> Dict[str, float]:
47
+ """Calculate percentage of total time for each module"""
48
+ total = sum(self.timings.values())
49
+ if total == 0:
50
+ return {}
51
+ return {
52
+ module: round((time_ms / total) * 100, 1)
53
+ for module, time_ms in self.timings.items()
54
+ }
55
+
56
+ def save_to_file(self, filepath: str = "data/latency_results.json"):
57
+ """Save results to JSON file"""
58
+ results = self.get_results()
59
+ path = Path(filepath)
60
+ path.parent.mkdir(parents=True, exist_ok=True)
61
+
62
+ # Append to existing results
63
+ existing = []
64
+ if path.exists():
65
+ try:
66
+ with open(path, 'r') as f:
67
+ loaded = json.load(f)
68
+ # Handle both list and dict formats for backward compatibility
69
+ if isinstance(loaded, list):
70
+ existing = loaded
71
+ elif isinstance(loaded, dict):
72
+ # If it's a dict, start fresh with a list
73
+ existing = []
74
+ else:
75
+ existing = []
76
+ except:
77
+ existing = []
78
+
79
+ existing.append(results)
80
+
81
+ # Keep only last 100 results
82
+ if len(existing) > 100:
83
+ existing = existing[-100:]
84
+
85
+ with open(path, 'w') as f:
86
+ json.dump(existing, f, indent=2)
87
+
88
+ return results
89
+
90
+ def print_summary(self):
91
+ """Print formatted summary"""
92
+ results = self.get_results()
93
+ print("\n" + "="*60)
94
+ print("LATENCY TRACKING RESULTS")
95
+ print("="*60)
96
+ print(f"Total Time: {results['total_time_ms']} ms")
97
+ print("\nModule Breakdown:")
98
+ print("-"*60)
99
+
100
+ for module, time_ms in results['modules'].items():
101
+ percent = results['breakdown_percent'].get(module, 0)
102
+ bar = "#" * int(percent / 2) # Visual bar
103
+ print(f"{module:25} {time_ms:8.2f} ms {percent:5.1f}% {bar}")
104
+
105
+ print("="*60 + "\n")
106
+
107
+
108
+ # Global tracker instance
109
+ _tracker = None
110
+
111
+
112
+ def get_tracker() -> LatencyTracker:
113
+ """Get or create global tracker instance"""
114
+ global _tracker
115
+ if _tracker is None:
116
+ _tracker = LatencyTracker()
117
+ return _tracker
118
+
119
+
120
+ def reset_tracker():
121
+ """Reset the global tracker"""
122
+ global _tracker
123
+ _tracker = LatencyTracker()
orchestration/nodes/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Voice RAG Bot Orchestration Nodes"""
orchestration/nodes/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (218 Bytes). View file
 
orchestration/nodes/__pycache__/context_builder.cpython-311.pyc ADDED
Binary file (2.12 kB). View file
 
orchestration/nodes/__pycache__/entity_extraction.cpython-311.pyc ADDED
Binary file (2.26 kB). View file
 
orchestration/nodes/__pycache__/intent_detection.cpython-311.pyc ADDED
Binary file (2.48 kB). View file
 
orchestration/nodes/__pycache__/memory_persistence.cpython-311.pyc ADDED
Binary file (1.79 kB). View file
 
orchestration/nodes/__pycache__/response_generation.cpython-311.pyc ADDED
Binary file (4.19 kB). View file
 
orchestration/nodes/__pycache__/retrieval_router.cpython-311.pyc ADDED
Binary file (2.97 kB). View file
 
orchestration/nodes/__pycache__/sentiment_analysis.cpython-311.pyc ADDED
Binary file (1.87 kB). View file
 
orchestration/nodes/__pycache__/sentiment_hybrid.cpython-311.pyc ADDED
Binary file (5.15 kB). View file
 
orchestration/nodes/__pycache__/tts_generation.cpython-311.pyc ADDED
Binary file (4.24 kB). View file
 
orchestration/nodes/__pycache__/validation.cpython-311.pyc ADDED
Binary file (3.03 kB). View file
 
orchestration/nodes/context_builder.py ADDED
@@ -0,0 +1,60 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Context formatter for LLM prompts"""
2
+
3
+ from orchestration.state import ConversationState
4
+ from typing import Dict, Any
5
+ from orchestration.latency_tracker import get_tracker
6
+
7
+
8
+ def context_builder_node(state: ConversationState) -> Dict[str, Any]:
9
+ """
10
+ Build complete context string from all available information
11
+ for LLM to generate response
12
+
13
+ Combines:
14
+ - User input
15
+ - Intent & sentiment
16
+ - KB context
17
+ - Customer history
18
+ - Conversation summary
19
+
20
+ Returns:
21
+ State update (no new fields added, just confirmation)
22
+ """
23
+ tracker = get_tracker()
24
+ tracker.start("context_builder")
25
+
26
+ # Extract components
27
+ sentiment_label = state['sentiment']['label']
28
+ sentiment_score = state['sentiment']['score']
29
+ intent = state['intent']['intent']
30
+ kb_context = state['kb_context']
31
+ history_context = state['history_context']
32
+ conversation_summary = state['conversation_summary']
33
+ entities = state.get('entities', {})
34
+
35
+ # Build prompt context (this will be used by response_generation node)
36
+ # We just validate all components exist, they'll be used by next node
37
+
38
+ # Prepare formatted context for logging/debugging
39
+ formatted_context = f"""
40
+ === UNIFIED CONTEXT ===
41
+ User Intent: {intent}
42
+ User Sentiment: {sentiment_label} (confidence: {sentiment_score:.2f})
43
+
44
+ KB Context:
45
+ {kb_context}
46
+
47
+ Customer History:
48
+ {history_context}
49
+
50
+ Conversation Summary:
51
+ {conversation_summary}
52
+
53
+ Entities Detected:
54
+ {entities}
55
+ """
56
+
57
+ tracker.end("context_builder")
58
+ # Return minimal state update - context is already in state
59
+ # Next node (response_generation) will use these state fields directly
60
+ return {}
orchestration/nodes/entity_extraction.py ADDED
@@ -0,0 +1,58 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Named entity extraction using BERT-NER"""
2
+
3
+ from transformers import pipeline
4
+ from orchestration.state import ConversationState
5
+ from typing import Dict, Any
6
+ from orchestration.latency_tracker import get_tracker
7
+
8
+
9
+ # Global model cache
10
+ _ner_model = None
11
+
12
+
13
+ def get_ner_model():
14
+ """Load NER model once and cache"""
15
+ global _ner_model
16
+ if _ner_model is None:
17
+ _ner_model = pipeline(
18
+ "token-classification",
19
+ model="dslim/bert-base-NER",
20
+ aggregation_strategy="simple"
21
+ )
22
+ return _ner_model
23
+
24
+
25
+ def entity_extraction_node(state: ConversationState) -> Dict[str, Any]:
26
+ """
27
+ Extract named entities from user input
28
+ Uses token classification model to identify entity types
29
+
30
+ Returns:
31
+ state update with entities field
32
+ """
33
+ tracker = get_tracker()
34
+ tracker.start("entity_extraction")
35
+
36
+ try:
37
+ ner_pipeline = get_ner_model()
38
+
39
+ # Extract entities
40
+ entities_raw = ner_pipeline(state['user_input'])
41
+
42
+ # Format entities as dict with types
43
+ entities_dict = {}
44
+ for entity in entities_raw:
45
+ entity_type = entity['entity_group']
46
+ if entity_type not in entities_dict:
47
+ entities_dict[entity_type] = []
48
+ entities_dict[entity_type].append(entity['word'])
49
+
50
+ # Return formatted entities
51
+ tracker.end("entity_extraction")
52
+ if entities_dict:
53
+ return {"entities": entities_dict}
54
+ else:
55
+ return {"entities": None}
56
+ except Exception as e:
57
+ tracker.end("entity_extraction")
58
+ return {"entities": None}
orchestration/nodes/intent_detection.py ADDED
@@ -0,0 +1,61 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Intent classification using Groq LLM"""
2
+
3
+ from langchain_groq import ChatGroq
4
+ from langchain_core.prompts import PromptTemplate
5
+ from orchestration.state import ConversationState
6
+ from typing import Dict, Any
7
+ import json
8
+ from backend.config import settings
9
+ from orchestration.latency_tracker import get_tracker
10
+
11
+
12
+ def intent_detection_node(state: ConversationState) -> Dict[str, Any]:
13
+ """
14
+ Detect user intent using Groq LLM
15
+
16
+ Returns:
17
+ state update with intent field:
18
+ {"intent": {"intent": "...", "confidence": float}}
19
+ """
20
+ tracker = get_tracker()
21
+ tracker.start("intent_detection")
22
+
23
+ # Initialize Groq LLM
24
+ llm = ChatGroq(
25
+ model=settings.groq_model,
26
+ temperature=0.3, # Low temp for consistent intent detection
27
+ groq_api_key=settings.groq_api_key
28
+ )
29
+
30
+ # Prompt template for intent detection
31
+ intent_prompt = PromptTemplate(
32
+ input_variables=["user_input"],
33
+ template="""Analyze the user's input and determine their intent. Respond ONLY with JSON.
34
+
35
+ User Input: {user_input}
36
+
37
+ Possible intents: complaint, refund_request, inquiry, account_issue, escalation, billing, product_question, order_status, other
38
+
39
+ Response format:
40
+ {{
41
+ "intent": "<selected_intent>",
42
+ "confidence": <0.0-1.0>
43
+ }}"""
44
+ )
45
+
46
+ # Generate intent using chain pattern
47
+ chain = intent_prompt | llm
48
+ response = chain.invoke({"user_input": state['user_input']})
49
+
50
+ try:
51
+ # Parse JSON response from LLM
52
+ intent_data = json.loads(response.content.strip())
53
+ except json.JSONDecodeError:
54
+ # Fallback if JSON parsing fails
55
+ intent_data = {
56
+ "intent": "other",
57
+ "confidence": 0.5
58
+ }
59
+
60
+ tracker.end("intent_detection")
61
+ return {"intent": intent_data}
orchestration/nodes/memory_persistence.py ADDED
@@ -0,0 +1,45 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Conversation storage to Qdrant customer history"""
2
+
3
+ from orchestration.state import ConversationState
4
+ from rag.qdrant_manager import qdrant_manager
5
+ from typing import Dict, Any
6
+ from orchestration.latency_tracker import get_tracker
7
+
8
+
9
+ def memory_persistence_node(state: ConversationState) -> Dict[str, Any]:
10
+ """
11
+ Store conversation turn to customer history collection in Qdrant
12
+ Enables historical context retrieval for repeat customers
13
+
14
+ Stores:
15
+ - Customer ID (for filtering)
16
+ - User input + response
17
+ - Intent classification
18
+ - Sentiment
19
+ - Timestamp
20
+
21
+ Returns:
22
+ state update (minimal, side-effect is primary)
23
+ """
24
+ tracker = get_tracker()
25
+ tracker.start("memory_persistence")
26
+
27
+ # Combine user input and response for storage
28
+ conversation_text = f"User: {state['user_input']}\nAssistant: {state['response']}"
29
+
30
+ # Determine interaction type from intent for categorization
31
+ intent = state['intent']['intent']
32
+ interaction_type = intent
33
+
34
+ # Store to Qdrant customer history
35
+ qdrant_manager.add_to_history(
36
+ customer_id=state['customer_id'],
37
+ text=conversation_text,
38
+ interaction_type=interaction_type
39
+ )
40
+
41
+ # Update conversation summary in memory (every 5 turns)
42
+ # For now, just store the current exchange
43
+
44
+ tracker.end("memory_persistence")
45
+ return {}
orchestration/nodes/response_generation.py ADDED
@@ -0,0 +1,93 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Response generation using Groq LLM"""
2
+
3
+ from langchain_groq import ChatGroq
4
+ from langchain_core.prompts import PromptTemplate
5
+ from orchestration.state import ConversationState
6
+ from typing import Dict, Any
7
+ import logging
8
+ from backend.config import settings
9
+ from orchestration.latency_tracker import get_tracker
10
+
11
+ logger = logging.getLogger(__name__)
12
+
13
+
14
+ def response_generation_node(state: ConversationState) -> Dict[str, Any]:
15
+ """
16
+ Generate final response using Groq LLM
17
+ Incorporates KB context, customer history, intent, and sentiment
18
+
19
+ Returns:
20
+ state update with response field
21
+ """
22
+ tracker = get_tracker()
23
+ tracker.start("response_generation")
24
+
25
+ try:
26
+ logger.info("Response Generation: Initializing Groq LLM...")
27
+
28
+ # Initialize Groq LLM
29
+ llm = ChatGroq(
30
+ model=settings.groq_model,
31
+ temperature=settings.groq_temperature,
32
+ max_tokens=settings.groq_max_tokens,
33
+ groq_api_key=settings.groq_api_key
34
+ )
35
+
36
+ # Determine tone based on sentiment
37
+ sentiment_label = state['sentiment']['label']
38
+ tone_instruction = {
39
+ "POSITIVE": "Use a friendly, upbeat tone.",
40
+ "NEGATIVE": "Use an empathetic, understanding tone. Acknowledge frustration.",
41
+ "NEUTRAL": "Use a professional, helpful tone."
42
+ }.get(sentiment_label, "Use a professional tone.")
43
+
44
+ # Build response prompt
45
+ response_prompt = PromptTemplate(
46
+ input_variables=[
47
+ "user_input",
48
+ "intent",
49
+ "kb_context",
50
+ "history_context",
51
+ "tone_instruction"
52
+ ],
53
+ template="""You are a helpful customer service AI assistant.
54
+
55
+ User Intent: {intent}
56
+ {tone_instruction}
57
+
58
+ Knowledge Base Context:
59
+ {kb_context}
60
+
61
+ Customer History:
62
+ {history_context}
63
+
64
+ User Message: {user_input}
65
+
66
+ Provide a helpful, accurate response based on the context above. Keep response concise (2-3 sentences).
67
+ If you don't have relevant information, say so clearly."""
68
+ )
69
+
70
+ # Generate response using chain pattern
71
+ logger.info("🤖 Response Generation: Invoking LLM chain...")
72
+ chain = response_prompt | llm
73
+ response = chain.invoke({
74
+ "user_input": state['user_input'],
75
+ "intent": state['intent']['intent'],
76
+ "kb_context": state['kb_context'],
77
+ "history_context": state['history_context'],
78
+ "tone_instruction": tone_instruction
79
+ })
80
+
81
+ response_text = response.content.strip()
82
+ logger.info(f"✅ Response generated: '{response_text[:80]}...'")
83
+
84
+ tracker.end("response_generation")
85
+ return {"response": response_text}
86
+
87
+ except Exception as e:
88
+ logger.error(f"❌ Response generation failed: {type(e).__name__}: {str(e)}")
89
+ import traceback
90
+ logger.error(traceback.format_exc())
91
+ tracker.end("response_generation")
92
+ # Return fallback response
93
+ return {"response": "I apologize, but I encountered an error processing your request. Please try again."}
orchestration/nodes/retrieval_router.py ADDED
@@ -0,0 +1,51 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Dual RAG routing - knowledge base + customer history"""
2
+
3
+ from orchestration.state import ConversationState
4
+ from rag.qdrant_manager import qdrant_manager
5
+ from typing import Dict, Any
6
+ from orchestration.latency_tracker import get_tracker
7
+
8
+
9
+ def retrieval_router_node(state: ConversationState) -> Dict[str, Any]:
10
+ """
11
+ Dual RAG retrieval strategy:
12
+ 1. Always search knowledge base for relevant policies/docs
13
+ 2. For specific intents (complaint, refund, escalation), also search customer history
14
+
15
+ Returns:
16
+ state update with kb_context and history_context
17
+ """
18
+ tracker = get_tracker()
19
+ tracker.start("retrieval_router")
20
+
21
+ user_input = state['user_input']
22
+ customer_id = state['customer_id']
23
+ intent = state['intent']['intent'] # Intent classification result
24
+
25
+ # Always retrieve from knowledge base
26
+ kb_results = qdrant_manager.search_kb(user_input, limit=3)
27
+ kb_context = "\n".join([
28
+ f"- [{r['source']}] {r['text']} (relevance: {r['score']:.2f})"
29
+ for r in kb_results
30
+ ])
31
+
32
+ # Conditionally retrieve from customer history
33
+ history_context = ""
34
+ history_intents = ["complaint", "refund_request", "escalation", "billing_inquiry", "billing", "negative_sentiment"]
35
+
36
+ if intent in history_intents or state['sentiment']['label'] == "NEGATIVE":
37
+ history_results = qdrant_manager.search_history(
38
+ user_input,
39
+ customer_id,
40
+ limit=3
41
+ )
42
+ history_context = "\n".join([
43
+ f"- [{r['interaction_type']}] {r['text']} (relevance: {r['score']:.2f})"
44
+ for r in history_results
45
+ ])
46
+
47
+ tracker.end("retrieval_router")
48
+ return {
49
+ "kb_context": kb_context if kb_results else "No relevant policies found.",
50
+ "history_context": history_context if history_context else "No customer history available."
51
+ }
orchestration/nodes/sentiment_analysis.py ADDED
@@ -0,0 +1,49 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Sentiment Analysis Node - Emotion detection using DistilBERT
3
+ Analyzes user input sentiment for tone-aware response generation
4
+ """
5
+
6
+ from transformers import pipeline
7
+ from orchestration.state import ConversationState
8
+ from typing import Dict, Any
9
+
10
+
11
+ # Global model cache
12
+ _sentiment_model = None
13
+
14
+
15
+ def get_sentiment_model():
16
+ """Load model once and cache"""
17
+ global _sentiment_model
18
+ if _sentiment_model is None:
19
+ _sentiment_model = pipeline(
20
+ "sentiment-analysis",
21
+ model="distilbert-base-uncased-finetuned-sst-2-english"
22
+ )
23
+ return _sentiment_model
24
+
25
+
26
+ def sentiment_analysis_node(state: ConversationState) -> Dict[str, Any]:
27
+ """
28
+ Analyze sentiment of user input using DistilBERT
29
+
30
+ Returns:
31
+ state update with sentiment field populated:
32
+ {"sentiment": {"label": "POSITIVE|NEGATIVE|NEUTRAL", "score": float}}
33
+ """
34
+ try:
35
+ # Use cached model
36
+ sentiment_pipeline = get_sentiment_model()
37
+
38
+ # Analyze ONLY user input
39
+ result = sentiment_pipeline(state['user_input'])[0]
40
+
41
+ sentiment = {
42
+ "label": result['label'].upper(), # POSITIVE, NEGATIVE, or NEUTRAL
43
+ "score": result['score']
44
+ }
45
+
46
+ return {"sentiment": sentiment}
47
+ except Exception as e:
48
+ # Default to neutral on error
49
+ return {"sentiment": {"label": "NEUTRAL", "score": 0.5}}
orchestration/nodes/sentiment_hybrid.py ADDED
@@ -0,0 +1,133 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Hybrid sentiment classifier - keyword-based + model fallback"""
2
+
3
+ from orchestration.state import ConversationState
4
+ from typing import Dict, Any
5
+ from transformers import pipeline
6
+ from orchestration.latency_tracker import get_tracker
7
+
8
+
9
+ # Global model cache
10
+ _sentiment_model = None
11
+
12
+
13
+ def get_sentiment_model():
14
+ """Load model once and cache"""
15
+ global _sentiment_model
16
+ if _sentiment_model is None:
17
+ _sentiment_model = pipeline(
18
+ "sentiment-analysis",
19
+ model="distilbert-base-uncased-finetuned-sst-2-english"
20
+ )
21
+ return _sentiment_model
22
+
23
+
24
+ def sentiment_analysis_hybrid(state: ConversationState) -> Dict[str, Any]:
25
+ """
26
+ Hybrid sentiment classification:
27
+ 1. Check FAQ keywords → NEUTRAL
28
+ 2. Check explicit sentiment words → POSITIVE/NEGATIVE
29
+ 3. Fall back to DistilBERT model
30
+
31
+ Returns:
32
+ {"sentiment": {"label": "POSITIVE|NEGATIVE|NEUTRAL", "score": 0.95}}
33
+ """
34
+ tracker = get_tracker()
35
+ tracker.start("sentiment_analysis")
36
+
37
+ user_input = state['user_input'].lower()
38
+
39
+ # Step 1: FAQ keywords → Always NEUTRAL (domain-specific)
40
+ faq_keywords = [
41
+ "policy", "return", "warranty", "shipping", "account",
42
+ "details", "information", "how", "what", "when", "where",
43
+ "can i", "do you", "tell me", "help", "need", "about"
44
+ ]
45
+
46
+ if any(kw in user_input for kw in faq_keywords):
47
+ # Still check for strong sentiment words within FAQ
48
+ strong_negative = ["frustrated", "angry", "hate", "terrible", "broken", "worst", "useless"]
49
+ strong_positive = ["thank", "love", "great", "excellent", "amazing", "perfect"]
50
+
51
+ if any(word in user_input for word in strong_negative):
52
+ tracker.end("sentiment_analysis")
53
+ return {
54
+ "sentiment": {
55
+ "label": "NEGATIVE",
56
+ "score": 0.95,
57
+ "reason": "Complaint with sentiment"
58
+ }
59
+ }
60
+ elif any(word in user_input for word in strong_positive):
61
+ tracker.end("sentiment_analysis")
62
+ return {
63
+ "sentiment": {
64
+ "label": "POSITIVE",
65
+ "score": 0.95,
66
+ "reason": "Praise with sentiment"
67
+ }
68
+ }
69
+ else:
70
+ # Pure FAQ question = NEUTRAL
71
+ tracker.end("sentiment_analysis")
72
+ return {
73
+ "sentiment": {
74
+ "label": "NEUTRAL",
75
+ "score": 0.99,
76
+ "reason": "FAQ inquiry"
77
+ }
78
+ }
79
+
80
+ # Step 2: Explicit strong sentiment words
81
+ strong_negative = [
82
+ "frustrated", "angry", "hate", "terrible", "broken",
83
+ "worst", "useless", "disaster", "awful", "horrible",
84
+ "unacceptable", "disgusted", "disappointed"
85
+ ]
86
+ strong_positive = [
87
+ "thank", "love", "great", "excellent", "amazing",
88
+ "perfect", "wonderful", "fantastic", "awesome", "impressed"
89
+ ]
90
+
91
+ if any(word in user_input for word in strong_negative):
92
+ tracker.end("sentiment_analysis")
93
+ return {
94
+ "sentiment": {
95
+ "label": "NEGATIVE",
96
+ "score": 0.95,
97
+ "reason": "Strong negative sentiment"
98
+ }
99
+ }
100
+
101
+ if any(word in user_input for word in strong_positive):
102
+ tracker.end("sentiment_analysis")
103
+ return {
104
+ "sentiment": {
105
+ "label": "POSITIVE",
106
+ "score": 0.95,
107
+ "reason": "Strong positive sentiment"
108
+ }
109
+ }
110
+
111
+ # Step 3: Fall back to DistilBERT model
112
+ try:
113
+ sentiment_pipeline = get_sentiment_model()
114
+ result = sentiment_pipeline(state['user_input'])[0]
115
+
116
+ tracker.end("sentiment_analysis")
117
+ return {
118
+ "sentiment": {
119
+ "label": result['label'].upper(),
120
+ "score": result['score'],
121
+ "reason": "Model inference"
122
+ }
123
+ }
124
+ except Exception as e:
125
+ # Default to neutral on error
126
+ tracker.end("sentiment_analysis")
127
+ return {
128
+ "sentiment": {
129
+ "label": "NEUTRAL",
130
+ "score": 0.5,
131
+ "reason": "Error - defaulting to neutral"
132
+ }
133
+ }
orchestration/nodes/tts_generation.py ADDED
@@ -0,0 +1,72 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Text-to-speech generation using gTTS"""
2
+
3
+ from gtts import gTTS
4
+ from orchestration.state import ConversationState
5
+ from typing import Dict, Any
6
+ import os
7
+ import logging
8
+ from pathlib import Path
9
+ from orchestration.latency_tracker import get_tracker
10
+
11
+ logger = logging.getLogger(__name__)
12
+
13
+
14
+ def tts_generation_node(state: ConversationState) -> Dict[str, Any]:
15
+ """
16
+ Convert response text to speech using gTTS
17
+ Saves audio file and returns path
18
+
19
+ Returns:
20
+ state update with final_audio_path field
21
+ """
22
+ tracker = get_tracker()
23
+ tracker.start("tts_generation")
24
+
25
+ response_text = state.get('response', '')
26
+
27
+ # Validate response text exists
28
+ if not response_text or len(response_text.strip()) == 0:
29
+ logger.warning("⚠️ TTS: No response text to convert to speech")
30
+ tracker.end("tts_generation")
31
+ return {"final_audio_path": None}
32
+
33
+ # Create output directory if doesn't exist
34
+ audio_dir = Path("data/audio_output")
35
+ try:
36
+ audio_dir.mkdir(parents=True, exist_ok=True)
37
+ except Exception as e:
38
+ logger.error(f"❌ TTS: Failed to create audio directory: {e}")
39
+ tracker.end("tts_generation")
40
+ return {"final_audio_path": None}
41
+
42
+ # Generate unique filename
43
+ customer_id = state.get('customer_id', 'UNKNOWN')
44
+ import datetime
45
+ timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S_%f")[:19]
46
+ audio_filename = f"bot_response_{customer_id}_{timestamp}.mp3"
47
+ audio_path = audio_dir / audio_filename
48
+
49
+ try:
50
+ logger.info(f"📢 TTS: Generating audio for: '{response_text[:50]}...'")
51
+
52
+ # Generate TTS
53
+ tts = gTTS(text=response_text, lang='en', slow=False)
54
+ tts.save(str(audio_path))
55
+
56
+ # Verify file was created
57
+ if audio_path.exists():
58
+ file_size = audio_path.stat().st_size
59
+ logger.info(f"✅ TTS: Audio generated successfully ({file_size} bytes) -> {audio_path}")
60
+ final_audio_path = str(audio_path)
61
+ else:
62
+ logger.error(f"❌ TTS: File created but not found at {audio_path}")
63
+ final_audio_path = None
64
+
65
+ except Exception as e:
66
+ logger.error(f"❌ TTS generation failed: {type(e).__name__}: {str(e)}")
67
+ import traceback
68
+ logger.error(traceback.format_exc())
69
+ final_audio_path = None
70
+
71
+ tracker.end("tts_generation")
72
+ return {"final_audio_path": final_audio_path}
orchestration/nodes/validation.py ADDED
@@ -0,0 +1,61 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Response quality validation"""
2
+
3
+ from orchestration.state import ConversationState
4
+ from typing import Dict, Any
5
+ import logging
6
+ from orchestration.latency_tracker import get_tracker
7
+
8
+ logger = logging.getLogger(__name__)
9
+
10
+
11
+ def validation_node(state: ConversationState) -> Dict[str, Any]:
12
+ """
13
+ Validate generated response against quality criteria:
14
+ 1. Length checks (not too short, not too long)
15
+ 2. Tone-sentiment consistency
16
+ 3. Basic sanity checks
17
+
18
+ Returns:
19
+ state update with validation_passed boolean
20
+ """
21
+ tracker = get_tracker()
22
+ tracker.start("validation")
23
+
24
+ response = state.get('response', '')
25
+ sentiment = state.get('sentiment', {}).get('label', 'NEUTRAL')
26
+
27
+ # Check 1: Response length (between 50-500 characters)
28
+ response_length = len(response)
29
+ length_valid = 50 <= response_length <= 500
30
+
31
+ # Check 2: Tone-sentiment consistency
32
+ tone_checks = {
33
+ "NEGATIVE": {
34
+ "forbidden_words": ["happy", "excellent", "amazing"],
35
+ "required_sentiment": ["understand", "apologize", "help"]
36
+ },
37
+ "POSITIVE": {
38
+ "forbidden_words": ["sorry", "problem", "issue"],
39
+ "required_sentiment": ["great", "happy", "enjoy"]
40
+ }
41
+ }
42
+
43
+ response_lower = response.lower()
44
+ tone_valid = True
45
+
46
+ if sentiment in tone_checks:
47
+ checks = tone_checks[sentiment]
48
+ # Check forbidden words aren't present
49
+ forbidden_present = any(word in response_lower for word in checks.get("forbidden_words", []))
50
+ tone_valid = not forbidden_present
51
+
52
+ # Check 3: Response not empty
53
+ content_valid = len(response.strip()) > 0
54
+
55
+ # Overall validation
56
+ validation_passed = length_valid and content_valid and tone_valid
57
+
58
+ logger.info(f"✓ Validation: length={response_length} ({length_valid}), content={content_valid}, tone={tone_valid} -> {'PASS' if validation_passed else 'FAIL'}")
59
+
60
+ tracker.end("validation")
61
+ return {"validation_passed": validation_passed}
orchestration/state.py ADDED
@@ -0,0 +1,67 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ LangGraph State Definition - Central State Management for Voice RAG Bot Workflow
3
+ Defines all data flowing through the orchestration pipeline
4
+ """
5
+
6
+ from typing import TypedDict, List, Optional, Dict, Any
7
+
8
+
9
+ class ConversationState(TypedDict):
10
+ """
11
+ Complete state passed through LangGraph nodes
12
+
13
+ Fields:
14
+ - user_input: Original text from voice input (after STT)
15
+ - customer_id: Unique customer identifier for history tracking
16
+ - intent: Intent detection result with confidence score
17
+ - sentiment: Sentiment analysis result with label and confidence
18
+ - entities: Extracted entities from user input (optional)
19
+ - conversation_summary: LLM-generated summary of conversation
20
+ - kb_context: Retrieved context from knowledge base
21
+ - history_context: Retrieved context from customer history (persistent memory)
22
+ - response: Final LLM-generated response text
23
+ - validation_passed: Boolean flag for response validation
24
+ - final_audio_path: Path to generated TTS audio file
25
+ """
26
+
27
+ # Input & Context
28
+ user_input: str
29
+ customer_id: str
30
+
31
+ # NLP Analysis Results
32
+ intent: Dict[str, Any] # {"intent": "...", "confidence": float}
33
+ sentiment: Dict[str, Any] # {"label": "POSITIVE|NEGATIVE|NEUTRAL", "score": float}
34
+ entities: Optional[Dict[str, Any]] # {"entity_type": [...], ...}
35
+
36
+ # Memory Management
37
+ conversation_summary: str # LLM-generated summary
38
+
39
+ # RAG Contexts
40
+ kb_context: str # Knowledge base retrieval results
41
+ history_context: str # Customer history retrieval results
42
+
43
+ # Response Generation
44
+ response: str # Final LLM-generated response
45
+
46
+ # Validation & Output
47
+ validation_passed: bool
48
+ final_audio_path: Optional[str]
49
+
50
+
51
+ class ConversationStateOptional(TypedDict, total=False):
52
+ """
53
+ Optional version of ConversationState for partial updates
54
+ Allows nodes to update only the fields they produce
55
+ """
56
+
57
+ user_input: str
58
+ customer_id: str
59
+ intent: Dict[str, Any]
60
+ sentiment: Dict[str, Any]
61
+ entities: Optional[Dict[str, Any]]
62
+ conversation_summary: str
63
+ kb_context: str
64
+ history_context: str
65
+ response: str
66
+ validation_passed: bool
67
+ final_audio_path: Optional[str]
rag/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Voice RAG Bot RAG (Retrieval-Augmented Generation) Package"""
rag/__pycache__/__init__.cpython-311.pyc ADDED
Binary file (227 Bytes). View file
 
rag/__pycache__/cache_manager.cpython-311.pyc ADDED
Binary file (6.14 kB). View file