Spaces:
Runtime error
Runtime error
| title: Voice RAG Bot | |
| emoji: ποΈ | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: docker | |
| pinned: false | |
| # Voice RAG Bot | |
| A voice-enabled RAG (Retrieval Augmented Generation) bot. | |
| ## π Quick Overview | |
| Voice RAG Bot is an intelligent AI customer support system that: | |
| - π€ **Accepts voice input** via microphone or audio file upload | |
| - π§ **Processes with LLM** (Groq) for intent detection and response generation | |
| - π **Retrieves relevant context** from knowledge base and customer history using vector search | |
| - π **Analyzes sentiment** to provide empathetic, sentiment-aware responses | |
| - π **Generates speech output** via text-to-speech | |
| - π **Orchestrates 9-node workflow** using LangGraph | |
| **Tech Stack**: Faster Whisper (STT) β LangGraph (9 nodes) β Groq LLM β Qdrant (Vector DB) β gTTS (TTS) | |
| --- | |
| ## π Quick Start (3 Steps) | |
| ### Step 1: Prerequisites | |
| - Docker Desktop running (for Qdrant) | |
| - Python 3.11+ | |
| - Git (optional) | |
| ### Step 2: Start Qdrant (Vector Database) | |
| ```bash | |
| docker run -p 6333:6333 qdrant/qdrant:latest | |
| ``` | |
| Leave this running in background. β System will auto-create collections. | |
| ### Step 3: Start Voice RAG Bot | |
| ```bash | |
| cd d:\Voice RAG Bot\voice-rag-bot | |
| # Activate virtual environment | |
| .\venv\Scripts\Activate.ps1 | |
| # Run startup script (starts backend + Streamlit) | |
| .\START_SYSTEM.ps1 | |
| ``` | |
| **Or start services manually:** | |
| Terminal 1 (Backend): | |
| ```bash | |
| .\venv\Scripts\Activate.ps1 | |
| python backend/main.py | |
| # Runs on http://localhost:8000 | |
| ``` | |
| Terminal 2 (Frontend): | |
| ```bash | |
| .\venv\Scripts\Activate.ps1 | |
| streamlit run frontend/streamlit_app.py | |
| # Opens http://localhost:8501 | |
| ``` | |
| --- | |
| ## οΏ½ Docker Deployment | |
| ### Option A: Docker Compose (Recommended for Development) | |
| Start all services (Backend + Frontend + Qdrant + Redis): | |
| ```bash | |
| docker-compose up -d | |
| ``` | |
| **Access Points:** | |
| - π€ Frontend: http://localhost:8501 | |
| - βοΈ Backend: http://localhost:8000 | |
| - π Qdrant: http://localhost:6333 | |
| - πΎ Redis: localhost:6379 | |
| **Stop Services:** | |
| ```bash | |
| docker-compose down | |
| ``` | |
| ### Option B: Individual Docker Images | |
| **Build Image:** | |
| ```bash | |
| docker build -t voice-rag-bot:latest . | |
| ``` | |
| **Run Backend:** | |
| ```bash | |
| docker run -p 8000:8000 \ | |
| -e APP_TYPE=backend \ | |
| -e GROQ_API_KEY=your_key \ | |
| -e QDRANT_URL=http://localhost:6333 \ | |
| voice-rag-bot:latest | |
| ``` | |
| **Run Frontend:** | |
| ```bash | |
| docker run -p 8501:8501 \ | |
| -e APP_TYPE=frontend \ | |
| -e GROQ_API_KEY=your_key \ | |
| -e QDRANT_URL=http://localhost:6333 \ | |
| voice-rag-bot:latest | |
| ``` | |
| --- | |
| ## π GitHub Actions CI/CD | |
| ### Setup GitHub Secrets | |
| Add these secrets to your GitHub repository (Settings β Secrets and Variables β Actions): | |
| | Secret Name | Value | Description | | |
| |------------|-------|-------------| | |
| | `GROQ_API_KEY` | `gsk_xxxxxxxxxxxx` | Groq API key for LLM | | |
| | `HF_USERNAME` | `your_username` | HuggingFace username | | |
| | `HF_TOKEN` | `hf_xxxxxxxxxxxx` | HuggingFace access token | | |
| | `HF_SPACE_REPO` | `username/voice-rag-bot` | HF Spaces repo path | | |
| **How to Add Secrets:** | |
| 1. Go to GitHub repository β Settings | |
| 2. Click "Secrets and variables" β "Actions" | |
| 3. Click "New repository secret" | |
| 4. Add each secret with name and value | |
| ### Automatic Deployment | |
| The workflow (`.github/workflows/docker-build.yml`) automatically: | |
| 1. **On `main` branch push:** | |
| - Builds Docker image | |
| - Pushes to GitHub Container Registry (GHCR) | |
| - Deploys to HuggingFace Spaces | |
| - Generates tags: `main`, `latest`, `sha-xxxxx` | |
| 2. **On Pull Request:** | |
| - Builds Docker image (no push) | |
| - Validates Dockerfile syntax | |
| - Tests image build | |
| **Workflow File:** | |
| - Location: `.github/workflows/docker-build.yml` | |
| - Triggers: Push to `main`/`develop`, Pull requests | |
| - Status: View in GitHub β Actions tab | |
| **Access Docker Images:** | |
| ```bash | |
| docker pull ghcr.io/your-username/voice-rag-bot:latest | |
| docker pull ghcr.io/your-username/voice-rag-bot:main | |
| ``` | |
| --- | |
| ## π€ HuggingFace Spaces Deployment | |
| ### Option A: Automatic Deployment (Via GitHub Actions) | |
| 1. Create HuggingFace Space: https://huggingface.co/spaces | |
| - Name: `voice-rag-bot` | |
| - License: OpenRAIL | |
| - Private/Public: Your choice | |
| 2. Get HF credentials: | |
| - Username: Your HF account name | |
| - Token: https://huggingface.co/settings/tokens (create "write" token) | |
| 3. Add GitHub Secrets (see above): | |
| - `HF_USERNAME` | |
| - `HF_TOKEN` | |
| - `HF_SPACE_REPO` = `username/voice-rag-bot` | |
| 4. **Push to main branch β Automatic deployment!** | |
| ### Option B: Manual Deployment to HF Spaces | |
| 1. **Create HF Space (if not exists):** | |
| ```bash | |
| huggingface-cli repo create voice-rag-bot --type space --space-sdk streamlit | |
| ``` | |
| 2. **Clone & Push:** | |
| ```bash | |
| git clone https://huggingface.co/spaces/your-username/voice-rag-bot | |
| cd voice-rag-bot | |
| # Add your project files | |
| cp -r /path/to/voice-rag-bot/* . | |
| # Push to HF Spaces | |
| git add . | |
| git commit -m "Deploy Voice RAG Bot" | |
| git push origin main | |
| ``` | |
| 3. **Configure Secrets in HF Spaces:** | |
| - Go to Space Settings β Variables and secrets | |
| - Add: `GROQ_API_KEY`, `QDRANT_URL`, etc. | |
| 4. **App File:** `app.py` (automatically created) | |
| ### HF Spaces Configuration (`spaces.yaml`) | |
| ```yaml | |
| title: Voice RAG Bot | |
| description: Voice-enabled RAG chatbot | |
| app_file: app.py | |
| sdk: streamlit | |
| sdk_version: "1.28.0" | |
| python_version: "3.11" | |
| cpu: true | |
| gpu: true | |
| startup_duration_timeout: 600 | |
| ``` | |
| ### HF Spaces Requirements | |
| **Note:** HuggingFace Spaces runs Streamlit frontend only (no backend microservices). | |
| **Options:** | |
| 1. **Use External Backend:** | |
| - Deploy backend separately (Railway, Render, Heroku) | |
| - Update `BACKEND_URL` in Streamlit config | |
| - Spaces frontend connects to external backend | |
| 2. **Self-contained (Frontend Only):** | |
| - Remove backend API calls | |
| - Use Streamlit session state for data | |
| - Limited functionality (no vector DB, LLM caching) | |
| 3. **Docker-based Space (Advanced):** | |
| - Deploy full stack in Docker container | |
| - Requires HF Spaces Docker runtime | |
| - Use `Dockerfile` + `docker-compose.yml` | |
| **Recommended:** Use external FastAPI backend on Render/Railway + Streamlit on HF Spaces | |
| --- | |
| ## π§ Environment Variables for Deployment | |
| ### Local Development | |
| ``` | |
| GROQ_API_KEY=gsk_xxxxxxxxxxxx | |
| QDRANT_URL=http://localhost:6333 | |
| DEBUG=True | |
| LOG_LEVEL=INFO | |
| ``` | |
| ### Docker Compose | |
| ``` | |
| GROQ_API_KEY=gsk_xxxxxxxxxxxx | |
| QDRANT_URL=http://qdrant:6333 | |
| BACKEND_URL=http://backend:8000 | |
| DEBUG=False | |
| LOG_LEVEL=INFO | |
| ``` | |
| ### HuggingFace Spaces | |
| ``` | |
| GROQ_API_KEY=gsk_xxxxxxxxxxxx | |
| BACKEND_URL=https://your-backend-api.herokuapp.com | |
| FRONTEND_MODE=SPACES | |
| ``` | |
| ### GitHub Actions (Auto-set) | |
| - `REGISTRY`: ghcr.io | |
| - `IMAGE_NAME`: ${{ github.repository }} | |
| - Secrets: See above | |
| --- | |
| ## οΏ½π Usage Guide | |
| ### Via Streamlit Frontend (Recommended) | |
| 1. **Open Browser**: http://localhost:8501 | |
| 2. **Enter Customer ID**: Unique identifier for customer (enables history tracking) | |
| 3. **Choose Input Method**: | |
| - **Option A**: Click π€ **Record** β Speak your message β **Process Audio** | |
| - **Option B**: Upload audio file (MP3/WAV) | |
| - **Option C**: Type message directly in text area | |
| 4. **View Results** (automatically displayed): | |
| - π Generated Response | |
| - π― Detected Intent (+ confidence) | |
| - π Sentiment Analysis (+ confidence) | |
| - π·οΈ Extracted Entities | |
| - π Knowledge Base context (if relevant) | |
| - π Customer History (if relevant) | |
| - π Audio playback of response | |
| ### Via REST API (For Integration) | |
| **Process Audio:** | |
| ```bash | |
| curl -X POST "http://localhost:8000/process-audio?customer_id=CUST_001" \ | |
| -F "file=@voice_message.wav" | |
| ``` | |
| **Process Text:** | |
| ```bash | |
| curl -X POST "http://localhost:8000/process-text" \ | |
| -d "user_input=I want to return my laptop&customer_id=CUST_001" | |
| ``` | |
| **Health Check:** | |
| ```bash | |
| curl http://localhost:8000/health | |
| ``` | |
| --- | |
| ## π System Architecture | |
| ``` | |
| Input Layer | |
| ββ π€ Audio Input (Streamlit st.audio_input) | |
| ββ π Text Input (Streamlit text area) | |
| β | |
| Speech-to-Text | |
| ββ Faster Whisper (base model, CPU inference) | |
| β | |
| Orchestration Layer (LangGraph - 9 Nodes) | |
| 1. sentiment_analysis (DistilBERT) | |
| 2. entity_extraction (BERT-base-NER) | |
| 3. intent_detection (Groq LLM) | |
| 4. retrieval_router (Qdrant search) | |
| 5. context_builder (Format prompt) | |
| 6. response_generation (Groq LLM) | |
| 7. validation (Hallucination checks) | |
| 8. memory_persistence (Qdrant upsert) | |
| 9. tts_generation (gTTS) | |
| β | |
| Output Layer | |
| ββ π Text Response | |
| ββ π Sentiment-aware Tone | |
| ββ π Audio File (MP3) | |
| ββ π― Intent Classification | |
| ``` | |
| --- | |
| ## π§ Configuration | |
| **Environment Variables** (`.env`): | |
| ``` | |
| GROQ_API_KEY=your_groq_api_key_here | |
| QDRANT_URL=http://localhost:6333 | |
| BACKEND_URL=http://localhost:8000 | |
| VECTOR_DIMENSION=1024 | |
| EMBEDDING_MODEL=BAAI/bge-m3 | |
| GROQ_MODEL=openai/gpt-oss-20b | |
| KB_COLLECTION_NAME=knowledge_base | |
| HISTORY_COLLECTION_NAME=customer_history | |
| WHISPER_MODEL=base | |
| ``` | |
| --- | |
| ## π Sample Data | |
| Load sample data (4 KB documents + 4 customer history records): | |
| ```bash | |
| .\venv\Scripts\Activate.ps1 | |
| python data/load_sample_data.py | |
| ``` | |
| **Included Data:** | |
| - KB Documents: Return Policy, Shipping Info, Warranty Info, Account Management | |
| - Customer History: 4 interactions (complaints, refunds, inquiries) | |
| --- | |
| ## π§ͺ Testing | |
| ### Quick Verification | |
| ```bash | |
| # Test complete pipeline (end-to-end) | |
| .\venv\Scripts\Activate.ps1 | |
| python tests/test_full_integration.py | |
| ``` | |
| **Expected Output**: β FULL INTEGRATION TEST PASSED | |
| ### Component Status | |
| - β All 9 nodes connected and working | |
| - β FastAPI endpoints operational | |
| - β Qdrant vector search functional | |
| - β LLM integration responding | |
| - β Audio processing working | |
| - β Sample data loadable | |
| --- | |
| ## π― Intent Types Supported | |
| | Intent | Example | Response | | |
| |--------|---------|----------| | |
| | `refund_request` | "I want to return this" | Empathetic, processing info | | |
| | `order_status` | "Where's my order?" | Tracking info | | |
| | `product_inquiry` | "Tell me about...?" | Product details | | |
| | `billing_issue` | "My charge was wrong" | Empathetic, billing process | | |
| | `warranty_claim` | "Product broke" | Warranty eligibility info | | |
| | `account_management` | "Change my password" | Account instructions | | |
| | `general_support` | "How do I...?" | General assistance | | |
| | `complaint` | "This is unacceptable" | Empathetic, resolution steps | | |
| | `other` | Misc questions | General help | | |
| --- | |
| ## π Response Quality Factors | |
| 1. **Sentiment Detection**: POSITIVE/NEGATIVE/NEUTRAL classification | |
| 2. **Confidence Scores**: 0-1 for both intent and sentiment | |
| 3. **Context Retrieval**: Up to 3 KB documents + customer history | |
| 4. **Tone Matching**: Empathetic for negative, professional for neutral, friendly for positive | |
| 5. **Hallucination Prevention**: Validation layer checks for accuracy | |
| --- | |
| ## π Troubleshooting | |
| ### Issue: "Backend Not Connected" | |
| **Solution**: Ensure FastAPI backend is running | |
| ```bash | |
| python backend/main.py | |
| ``` | |
| ### Issue: "Qdrant Connection Error" | |
| **Solution**: Start Qdrant Docker container | |
| ```bash | |
| docker run -p 6333:6333 qdrant/qdrant:latest | |
| ``` | |
| ### Issue: "Groq API Error" | |
| **Solution**: Check GROQ_API_KEY in `.env` file | |
| ```bash | |
| # Verify key is set | |
| echo $env:GROQ_API_KEY | |
| ``` | |
| ### Issue: "Audio Processing Timeout" | |
| **Solution**: Processing may take 30-60 seconds for audio | |
| - First run downloads models (Whisper, BGE-M3, DistilBERT) | |
| - Subsequent runs are faster | |
| - Ensure sufficient disk space (~5GB) | |
| ### Issue: "Module Not Found" | |
| **Solution**: Reinstall dependencies | |
| ```bash | |
| .\venv\Scripts\Activate.ps1 | |
| pip install -r requirements.txt | |
| ``` | |
| --- | |
| ## π Project Structure | |
| ``` | |
| d:\Voice RAG Bot\voice-rag-bot\ | |
| βββ backend/ | |
| β βββ main.py FastAPI server | |
| β βββ config.py Configuration | |
| βββ frontend/ | |
| β βββ streamlit_app.py Web UI | |
| βββ orchestration/ | |
| β βββ langgraph_workflow.py 9-node workflow | |
| β βββ state.py State management | |
| β βββ nodes/ Individual nodes | |
| β βββ sentiment_analysis.py | |
| β βββ entity_extraction.py | |
| β βββ intent_detection.py | |
| β βββ retrieval_router.py | |
| β βββ context_builder.py | |
| β βββ response_generation.py | |
| β βββ validation.py | |
| β βββ memory_persistence.py | |
| β βββ tts_generation.py | |
| βββ rag/ | |
| β βββ qdrant_manager.py Vector DB client | |
| β βββ embedding_manager.py BGE-M3 embeddings | |
| βββ data/ | |
| β βββ load_sample_data.py Sample data loader | |
| β βββ audio_output/ Generated audio files | |
| βββ tests/ | |
| β βββ test_full_integration.py End-to-end test | |
| βββ .env Configuration | |
| βββ requirements.txt Dependencies | |
| βββ START_SYSTEM.ps1 Quick start script | |
| βββ venv/ Python environment | |
| ``` | |
| --- | |
| ## π Workflow Execution (Behind the Scenes) | |
| 1. **sentiment_analysis**: Input β DistilBERT β POSITIVE/NEGATIVE/NEUTRAL | |
| 2. **entity_extraction**: Input β BERT-NER β Extract names, locations, etc. | |
| 3. **intent_detection**: Input β Groq LLM β 9-intent classification | |
| 4. **retrieval_router**: Intent β Qdrant search β 3 KB docs + customer history | |
| 5. **context_builder**: Format contexts β Unified prompt | |
| 6. **response_generation**: Prompt β Groq LLM β Response text | |
| 7. **validation**: Check hallucinations β Retry if needed | |
| 8. **memory_persistence**: Embed response β Upsert to Qdrant | |
| 9. **tts_generation**: Response text β gTTS β MP3 audio file | |
| --- | |
| ## π Performance Metrics (Approximate) | |
| | Component | Time | Notes | | |
| |-----------|------|-------| | |
| | STT (Audio β Text) | 5-15s | Depends on audio length | | |
| | Sentiment Analysis | 0.5s | DistilBERT inference | | |
| | Entity Extraction | 0.5s | BERT-NER inference | | |
| | Intent Detection | 1-2s | Groq API call | | |
| | KB Search | 0.2s | Qdrant vector search | | |
| | Response Generation | 2-5s | Groq streaming | | |
| | Validation | 0.5s | Local checks | | |
| | TTS Generation | 2-5s | gTTS processing | | |
| | **Total End-to-End** | **12-35s** | First run slower (model loading) | | |
| --- | |
| ## π‘ Tips & Tricks | |
| ### Faster Processing | |
| - Use text input instead of audio (skips STT) | |
| - System caches models after first run | |
| - Keep audio messages under 30 seconds | |
| ### Better Responses | |
| - Use clear, grammatically correct input | |
| - Provide context ("purchased last week" vs "bought before") | |
| - Specify what you need (return, refund, replacement) | |
| ### Debugging | |
| - Check `backend/main.py` logs for errors | |
| - View Qdrant collections: http://localhost:6333/api/swagger/index.html | |
| - Monitor Streamlit server in terminal for issues | |
| --- | |
| ## π Next Steps | |
| 1. **Load Sample Data**: `python data/load_sample_data.py` | |
| 2. **Test with Demo Scenarios**: Use Streamlit to test various intents | |
| 3. **Customize KB Documents**: Add your own documents to Qdrant | |
| 4. **Fine-tune Prompts**: Edit prompts in `prompts/` directory | |
| 5. **Production Deployment**: Add authentication, rate limiting, monitoring | |
| --- | |
| ## π Support & References | |
| **Documentation Files:** | |
| - `data/DATA_REQUIREMENTS.md` - Data schema documentation | |
| - `.env` - Environment configuration | |
| **API Endpoints:** | |
| - `POST /process-audio` - Audio input endpoint | |
| - `POST /process-text` - Text input endpoint | |
| - `GET /health` - Health check | |
| **Backend Logs:** | |
| - Location: Console output when running `python backend/main.py` | |
| - Check for errors, model loading, API calls | |
| --- | |
| ## π License & Attribution | |
| **Components**: | |
| - **Groq LLM**: Free tier, gpt-oss-20b model | |
| - **Faster Whisper**: OpenAI (MIT License) | |
| - **LangGraph**: LangChain (Open Source) | |
| - **Qdrant**: Open source vector database | |
| - **BGE-M3**: BAAI embeddings model | |
| - **DistilBERT**: Hugging Face transformers | |
| - **gTTS**: Google Text-to-Speech | |
| --- | |
| ## β Verification Checklist | |
| Before considering system "ready for production": | |
| - [ ] Backend running on http://localhost:8000 | |
| - [ ] Qdrant running on http://localhost:6333 | |
| - [ ] Streamlit frontend accessible at http://localhost:8501 | |
| - [ ] Sample data loaded (`python data/load_sample_data.py`) | |
| - [ ] Integration test passing (`python tests/test_full_integration.py`) | |
| - [ ] Audio input working (record or upload) | |
| - [ ] All 9 nodes executing (check logs) | |
| - [ ] Response generation working | |
| - [ ] Audio playback working | |
| - [ ] History tracking working (multiple messages same customer) | |
| --- | |
| **Built with β€οΈ | Last Updated: May 30, 2026** | |