---
title: RAG Chatbot
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---

# RAG Chatbot with Advanced Retrieval

A question-answering system that lets you upload documents and ask questions about them. The system retrieves relevant information from your documents and generates accurate answers.

## How It Works

### When You Upload a Document

```
1. Upload File (PDF/DOCX/TXT)
   ↓
2. Extract Text
   ↓
3. Split into Chunks (512 tokens each)
   ↓
4. Convert to Embeddings (384D vectors)
   ↓
5. Store in Vector Database (Qdrant)
   ↓
6. Save Metadata in MongoDB
```

**What happens:** Your document is broken into small chunks, each chunk is converted into a numerical vector that captures its meaning, and stored in a database for fast searching.

### When You Ask a Question

```
1. Type Your Question
   ↓
2. Check Cache (answered before?)
   ↓
3. Search Documents (if RAG is ON)
   - BM25: Find keyword matches
   - Vector: Find similar meanings
   ↓
4. Rerank Results (pick top 5 most relevant)
   ↓
5. Build Context from Chunks
   ↓
6. Generate Answer with LLM
   ↓
7. Stream Response to You
```

**What happens:** The system searches for relevant chunks from your documents, combines them as context, and uses an AI model to generate an answer based on that context.

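The question flow above runs a keyword search and a vector search side by side and then narrows the results to a short list. This README does not say how the two result lists are merged, so the sketch below uses reciprocal rank fusion (RRF) purely as an illustration of one common approach. The function name, the `k=60` constant, and the chunk IDs are assumptions and are not taken from this project's code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Merge several ranked lists of chunk IDs into one ranked list.

    Hypothetical helper: each chunk earns 1 / (k + rank) from every list
    it appears in, so chunks ranked well by both searches rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism.
    fused = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return fused[:top_n]


# Illustrative inputs: IDs as they might come back from each search.
bm25_hits = ["c3", "c1", "c7"]    # keyword matches, best first
vector_hits = ["c1", "c4", "c3"]  # semantic matches, best first

fused = reciprocal_rank_fusion([bm25_hits, vector_hits], top_n=3)
print(fused)
```

Chunks that appear in both lists (`c1`, `c3`) outrank chunks found by only one search, which is the behaviour a hybrid retriever is after. In the real pipeline a reranker would then score the fused candidates against the question.
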
## Key Components

### Document Processing

**DocumentProcessor** - Main coordinator for document uploads
- Validates file type and size
- Calls the right loader for PDF, DOCX, or TXT files
- Manages the entire processing pipeline

**Embedder** - Converts text to vectors
- Uses FastEmbed with BAAI/bge-small-en-v1.5 model
- Generates 384-dimensional vectors for semantic search
- Each chunk becomes a searchable vector

**Qdrant Vector Store** - Stores embeddings
- Fast similarity search across millions of vectors
- Returns most relevant chunks for any query
- Handles all vector operations

### Question Answering

**HybridRetriever** - Finds relevant information
- **BM25**: Traditional keyword search (good for exact matches)
- **Vector Search**: Semantic search (understands meaning)
- Combines both for better results

**Reranker** - Improves search quality
- Uses FlashRank model to score relevance
- Filters the best 5 chunks from 20 candidates
- Ensures only the most relevant context is used

**Generator** - Creates answers
- Uses Groq LLM (llama-3.1-70b)
- Streams responses in real-time
- Bases answers on retrieved context when RAG is ON
- Uses general knowledge when RAG is OFF

**Semantic Cache** - Speeds up responses
- Remembers previous questions and answers
- Returns cached response if same question asked again
- Separate caches for RAG ON vs RAG OFF

### Memory & Storage

**Conversation Memory** - Remembers chat history
- Stores last 10 messages in Redis
- Enables follow-up questions
- Each session has independent history

**MongoDB** - Document metadata
- Tracks uploaded documents
- Stores file info, upload time, chunk count
- Links to vectors in Qdrant

**Redis** - Fast caching
- Stores conversation history
- Caches LLM responses
- In-memory for instant access

## Technology Stack

- **LangChain 0.3.13** - RAG framework
- **Groq API** - Fast LLM (llama-3.1-70b)
- **FastEmbed** - Embedding generation
- **FlashRank** - Result reranking
- **Qdrant** - Vector database
- **MongoDB** - Document storage
- **Redis** - Caching layer
- **FastAPI** - Web framework

## Quick Start

### Installation

```bash
# Clone and install
git clone https://github.com/Abeshith/RAG.git
cd RAG
pip install -r requirements.txt
```

### Configuration

Create `.env` file:

```env
GROQ_API_KEY=your_groq_key
MONGODB_URI=your_mongodb_uri
REDIS_URL=your_redis_url
QDRANT_URL=your_qdrant_url
QDRANT_API_KEY=your_qdrant_key
JWT_SECRET_KEY=your_secret_key
```

### Run

```bash
uvicorn app.main:app --host 0.0.0.0 --port 7860
```

Open: http://localhost:7860

## Usage

1. **Upload Documents**: Click upload, select PDF/DOCX/TXT file
2. **Ask Questions**: Type question in chat box
3. **Toggle RAG**:
   - ON = answers from your documents
   - OFF = general knowledge answers
4. **View Sources**: See which document chunks were used

## API Endpoints

```
GET    /health/              - Check system status
POST   /chat/stream          - Send question, get streaming answer
POST   /documents/upload     - Upload new document
GET    /documents/           - List all documents
GET    /documents/stats      - Get document statistics
DELETE /documents/{id}       - Delete specific document
```

## Docker Deployment

```bash
docker build -t rag-chatbot .
docker run -p 7860:7860 --env-file .env rag-chatbot
```
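
The Docker command above passes the `.env` values into the container, so a missing key tends to surface only once the app tries to reach Groq, MongoDB, Redis, or Qdrant. As a minimal sketch, a pre-flight check could look like the following. The helper name `missing_settings` is hypothetical and not part of this repository; only the variable names come from the Configuration section.

```python
# Variable names copied from the Configuration section of this README.
REQUIRED_SETTINGS = (
    "GROQ_API_KEY",
    "MONGODB_URI",
    "REDIS_URL",
    "QDRANT_URL",
    "QDRANT_API_KEY",
    "JWT_SECRET_KEY",
)


def missing_settings(env):
    """Return the required names that are absent or blank in `env`.

    Hypothetical helper: `env` is any mapping of names to string values,
    for example `os.environ` inside the running container.
    """
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


# Illustrative input: one key is set, one is blank, the rest are absent.
sample_env = {"GROQ_API_KEY": "your_groq_key", "REDIS_URL": "  "}
print(missing_settings(sample_env))
```

At startup the same function could be called with `os.environ`, with the app exiting early if the returned list is not empty.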