| title: RAG Chatbot | |
| emoji: 🤖 | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: docker | |
| app_port: 7860 | |
| pinned: false | |
| # RAG Chatbot with Advanced Retrieval | |
| A question-answering system that lets you upload documents and ask questions about them. The system retrieves relevant passages from your documents and uses them as context to generate answers. | |
| ## How It Works | |
| ### When You Upload a Document | |
| ``` | |
| 1. Upload File (PDF/DOCX/TXT) | |
| ↓ | |
| 2. Extract Text | |
| ↓ | |
| 3. Split into Chunks (512 tokens each) | |
| ↓ | |
| 4. Convert to Embeddings (384D vectors) | |
| ↓ | |
| 5. Store in Vector Database (Qdrant) | |
| ↓ | |
| 6. Save Metadata in MongoDB | |
| ``` | |
| **What happens:** Your document is split into small chunks. Each chunk is converted into a numerical vector that captures its meaning and is stored in Qdrant for fast searching, while the document's metadata is saved in MongoDB. | |
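The chunking step (step 3) can be sketched as follows. This is a minimal illustration, not the project's actual splitter: it counts whitespace-separated words as tokens, and the `overlap` parameter is an assumption (the pipeline above only specifies 512 tokens per chunk).

```python
def split_into_chunks(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most `chunk_size` whitespace tokens.

    `overlap` (an assumed parameter) repeats the tail of each chunk at the
    start of the next one so sentences cut at a boundary keep some context.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    tokens = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

A real tokenizer (for example the embedding model's own) would give different chunk boundaries than whitespace splitting, so treat the sizes here as approximate.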
| ### When You Ask a Question | |
| ``` | |
| 1. Type Your Question | |
| ↓ | |
| 2. Check Cache (answered before?) | |
| ↓ | |
| 3. Search Documents (if RAG is ON) | |
| - BM25: Find keyword matches | |
| - Vector: Find similar meanings | |
| ↓ | |
| 4. Rerank Results (pick top 5 most relevant) | |
| ↓ | |
| 5. Build Context from Chunks | |
| ↓ | |
| 6. Generate Answer with LLM | |
| ↓ | |
| 7. Stream Response to You | |
| ``` | |
| **What happens:** The system searches for relevant chunks from your documents, combines them as context, and uses an AI model to generate an answer based on that context. | |
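Step 5 (building context from chunks) might look roughly like the sketch below. The chunk dictionary keys (`source`, `text`) and the prompt wording are hypothetical, chosen only for illustration.

```python
def build_prompt(question: str, chunks: list[dict]) -> str:
    """Combine retrieved chunks into a numbered context block plus the question.

    Each chunk is assumed to be a dict with hypothetical keys
    "source" (file name) and "text" (chunk content).
    """
    context = "\n\n".join(
        f"[{i}] ({chunk['source']}) {chunk['text']}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Numbering the chunks makes it straightforward to show users which sources were used for an answer.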
| ## Key Components | |
| ### Document Processing | |
| **DocumentProcessor** - Main coordinator for document uploads | |
| - Validates file type and size | |
| - Calls the right loader for PDF, DOCX, or TXT files | |
| - Manages the entire processing pipeline | |
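The validation step could be sketched like this. The allowed extensions come from the supported formats listed above; the 10 MB size limit and the function name are assumptions, not values taken from the project.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed limit; the real limit is not documented here


def validate_upload(filename: str, size_bytes: int, max_bytes: int = MAX_UPLOAD_BYTES) -> str:
    """Return the normalized extension, or raise ValueError if the upload is rejected."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {extension or 'none'}")
    if size_bytes <= 0:
        raise ValueError("File is empty")
    if size_bytes > max_bytes:
        raise ValueError(f"File exceeds the {max_bytes}-byte limit")
    return extension
```

The returned extension can then be used to pick the matching loader.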
| **Embedder** - Converts text to vectors | |
| - Uses FastEmbed with BAAI/bge-small-en-v1.5 model | |
| - Generates 384-dimensional vectors for semantic search | |
| - Each chunk becomes a searchable vector | |
| **Qdrant Vector Store** - Stores embeddings | |
| - Fast similarity search across millions of vectors | |
| - Returns most relevant chunks for any query | |
| - Handles all vector operations | |
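To illustrate what a similarity search computes, here is a brute-force sketch in plain Python. It is not how Qdrant works internally (Qdrant uses an index to avoid scanning every vector), and cosine similarity is assumed as the distance metric.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Score every stored vector against the query and return the k best (id, score) pairs."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In the real system the vectors have 384 dimensions; the function works for any length as long as the query and stored vectors match.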
| ### Question Answering | |
| **HybridRetriever** - Finds relevant information | |
| - **BM25**: Traditional keyword search (good for exact matches) | |
| - **Vector Search**: Semantic search (understands meaning) | |
| - Combines both for better results | |
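One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). This README does not say which fusion method the HybridRetriever uses, so the sketch below is an assumption that shows the general idea.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk ids into one list.

    Each id earns 1 / (k + rank) from every list it appears in, so chunks
    ranked highly by both BM25 and vector search rise to the top.
    k = 60 is the constant commonly used with RRF.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda chunk_id: scores[chunk_id], reverse=True)


bm25_ranking = ["a", "b", "c"]      # hypothetical keyword results
vector_ranking = ["b", "c", "d"]    # hypothetical semantic results
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

RRF only needs rank positions, which sidesteps the problem that BM25 scores and cosine similarities are on different scales.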
| **Reranker** - Improves search quality | |
| - Uses FlashRank model to score relevance | |
| - Selects the 5 best chunks from 20 candidates | |
| - Ensures only the most relevant context is used | |
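The "20 candidates in, 5 chunks out" step can be sketched as below. The scoring function here is a simple term-overlap stand-in used only so the example runs without a model; it is not FlashRank and does not reflect its API.

```python
from typing import Callable


def rerank(query: str, candidates: list[str],
           score_fn: Callable[[str, str], float], top_n: int = 5) -> list[str]:
    """Score each candidate against the query and keep the top_n highest."""
    scored = [(score_fn(query, passage), passage) for passage in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_n]]


def term_overlap(query: str, passage: str) -> float:
    """Stand-in scorer: fraction of query words that appear in the passage."""
    query_terms = set(query.lower().split())
    passage_terms = set(passage.lower().split())
    return len(query_terms & passage_terms) / len(query_terms) if query_terms else 0.0
```

Swapping `term_overlap` for a call into a real reranking model leaves the selection logic unchanged.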
| **Generator** - Creates answers | |
| - Uses Groq LLM (llama-3.1-70b) | |
| - Streams responses in real-time | |
| - Bases answers on retrieved context when RAG is ON | |
| - Uses general knowledge when RAG is OFF | |
| **Semantic Cache** - Speeds up responses | |
| - Remembers previous questions and answers | |
| - Returns the cached response if the same question is asked again | |
| - Separate caches for RAG ON vs RAG OFF | |
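A simplified view of the caching behaviour is sketched below. It matches questions exactly after normalizing case and whitespace, and stores entries in a plain dictionary; a true semantic cache would compare question embeddings, and the project stores its cache in Redis. The class name and key format are hypothetical.

```python
import hashlib
from typing import Optional


class ResponseCache:
    """In-memory stand-in for the response cache, keyed by question and RAG mode."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    @staticmethod
    def _key(question: str, use_rag: bool) -> str:
        # Normalize case and whitespace so trivially different spellings share a key.
        normalized = " ".join(question.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        mode = "rag" if use_rag else "general"  # separate namespaces for RAG ON / OFF
        return f"cache:{mode}:{digest}"

    def get(self, question: str, use_rag: bool) -> Optional[str]:
        return self._store.get(self._key(question, use_rag))

    def set(self, question: str, use_rag: bool, answer: str) -> None:
        self._store[self._key(question, use_rag)] = answer
```

Keeping the RAG mode in the key prevents a general-knowledge answer from being served when the user expects an answer grounded in their documents.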
| ### Memory & Storage | |
| **Conversation Memory** - Remembers chat history | |
| - Stores last 10 messages in Redis | |
| - Enables follow-up questions | |
| - Each session has independent history | |
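The "last 10 messages per session" behaviour can be illustrated with an in-memory sketch. The project keeps this history in Redis; the class below is a hypothetical stand-in that shows only the windowing logic.

```python
from collections import defaultdict, deque


class ConversationMemory:
    """Keeps only the most recent messages for each session."""

    def __init__(self, max_messages: int = 10) -> None:
        # A bounded deque silently drops the oldest message when full.
        self._sessions: dict[str, deque] = defaultdict(lambda: deque(maxlen=max_messages))

    def add(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append({"role": role, "content": content})

    def history(self, session_id: str) -> list[dict]:
        # .get avoids creating an empty entry for sessions that never wrote anything.
        return list(self._sessions.get(session_id, ()))
```

With Redis, a similar effect is typically achieved by appending to a list with `RPUSH` and trimming it with `LTRIM`.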
| **MongoDB** - Document metadata | |
| - Tracks uploaded documents | |
| - Stores file info, upload time, chunk count | |
| - Links to vectors in Qdrant | |
| **Redis** - Fast caching | |
| - Stores conversation history | |
| - Caches LLM responses | |
| - Keeps data in memory for low-latency access | |
| ## Technology Stack | |
| - **LangChain 0.3.13** - RAG framework | |
| - **Groq API** - Fast LLM (llama-3.1-70b) | |
| - **FastEmbed** - Embedding generation | |
| - **FlashRank** - Result reranking | |
| - **Qdrant** - Vector database | |
| - **MongoDB** - Document storage | |
| - **Redis** - Caching layer | |
| - **FastAPI** - Web framework | |
| ## Quick Start | |
| ### Installation | |
| ```bash | |
| # Clone and install | |
| git clone https://github.com/Abeshith/RAG.git | |
| cd RAG | |
| pip install -r requirements.txt | |
| ``` | |
| ### Configuration | |
| Create a `.env` file: | |
| ```env | |
| GROQ_API_KEY=your_groq_key | |
| MONGODB_URI=your_mongodb_uri | |
| REDIS_URL=your_redis_url | |
| QDRANT_URL=your_qdrant_url | |
| QDRANT_API_KEY=your_qdrant_key | |
| JWT_SECRET_KEY=your_secret_key | |
| ``` | |
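Failing fast on a missing variable is usually easier to debug than a connection error later. The sketch below checks the variables listed above at startup; the function name is hypothetical, and the application may load its settings differently.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = (
    "GROQ_API_KEY",
    "MONGODB_URI",
    "REDIS_URL",
    "QDRANT_URL",
    "QDRANT_API_KEY",
    "JWT_SECRET_KEY",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the required settings, or raise RuntimeError naming every missing one."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Note that `os.environ` only sees values from `.env` if something has loaded the file first, for example `--env-file` in Docker or a dotenv loader.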
| ### Run | |
| ```bash | |
| uvicorn app.main:app --host 0.0.0.0 --port 7860 | |
| ``` | |
| Open: http://localhost:7860 | |
| ## Usage | |
| 1. **Upload Documents**: Click upload and select a PDF, DOCX, or TXT file | |
| 2. **Ask Questions**: Type your question in the chat box | |
| 3. **Toggle RAG**: | |
| - ON = answers from your documents | |
| - OFF = general knowledge answers | |
| 4. **View Sources**: See which document chunks were used | |
| ## API Endpoints | |
| ``` | |
| GET /health/ - Check system status | |
| POST /chat/stream - Send question, get streaming answer | |
| POST /documents/upload - Upload new document | |
| GET /documents/ - List all documents | |
| GET /documents/stats - Get document statistics | |
| DELETE /documents/{id} - Delete specific document | |
| ``` | |
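As a rough sketch, the two simplest endpoints can be called with the Python standard library. The base URL assumes a local run on port 7860; request bodies for `/chat/stream` and `/documents/upload` are not documented here, so they are left out, and any authentication headers the server may require are unknown.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:7860"  # assumes the app is running locally


def health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request for the health endpoint."""
    return urllib.request.Request(f"{base_url}/health/", method="GET")


def delete_document_request(document_id: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a DELETE request for one document; the id is URL-escaped."""
    safe_id = urllib.parse.quote(document_id, safe="")
    return urllib.request.Request(f"{base_url}/documents/{safe_id}", method="DELETE")


def send(request: urllib.request.Request, timeout: float = 5.0) -> tuple[int, str]:
    """Send a request and return (status code, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

For example, `send(health_request())` should return the status code and body of the health check once the server is up.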
| ## Docker Deployment | |
| ```bash | |
| docker build -t rag-chatbot . | |
| docker run -p 7860:7860 --env-file .env rag-chatbot | |
| ``` | |