---
title: RAG Chatbot
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---
# RAG Chatbot with Advanced Retrieval
A question-answering system that lets you upload documents and ask questions about them. The system retrieves the relevant passages from your documents and generates answers grounded in that content.
## How It Works
### When You Upload a Document
```
1. Upload File (PDF/DOCX/TXT)
2. Extract Text
3. Split into Chunks (512 tokens each)
4. Convert to Embeddings (384D vectors)
5. Store in Vector Database (Qdrant)
6. Save Metadata in MongoDB
```
**What happens:** Your document is broken into small chunks, each chunk is converted into a numerical vector that captures its meaning, and stored in a database for fast searching.
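The chunking step can be sketched as below. This is a minimal illustration, not the project's splitter: it approximates tokens with whitespace-separated words, uses no overlap, and `chunk_text` is a hypothetical helper name.

```python
def chunk_text(text: str, chunk_size: int = 512) -> list[str]:
    """Split text into chunks of at most `chunk_size` whitespace tokens.

    Assumption: words stand in for tokens; a real tokenizer would count differently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    tokens = text.split()
    # Step through the token list in fixed-size windows.
    return [
        " ".join(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), chunk_size)
    ]
```

A 1200-word document would produce three chunks here, the last one shorter than the others.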
### When You Ask a Question
```
1. Type Your Question
2. Check Cache (answered before?)
3. Search Documents (if RAG is ON)
   - BM25: Find keyword matches
   - Vector: Find similar meanings
4. Rerank Results (pick top 5 most relevant)
5. Build Context from Chunks
6. Generate Answer with LLM
7. Stream Response to You
```
**What happens:** The system searches for relevant chunks from your documents, combines them as context, and uses an AI model to generate an answer based on that context.
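The order of these steps can be sketched as a single function. This is an illustration under assumptions: `answer_question` and its callable parameters are hypothetical names, the cache is a plain dict, and streaming is left out so the control flow stays visible.

```python
from typing import Callable


def answer_question(
    question: str,
    rag_on: bool,
    cache: dict[tuple[bool, str], str],
    retrieve: Callable[[str], list[str]],
    rerank: Callable[[str, list[str]], list[str]],
    generate: Callable[[str, str], str],
    top_k: int = 5,
) -> str:
    """Run the question flow: cache check, retrieval, reranking, generation."""
    key = (rag_on, question)  # RAG ON and RAG OFF answers are cached separately
    if key in cache:
        return cache[key]

    context = ""
    if rag_on:
        candidates = retrieve(question)              # hybrid search
        best = rerank(question, candidates)[:top_k]  # keep the top chunks
        context = "\n\n".join(best)                  # build the context

    answer = generate(question, context)
    cache[key] = answer
    return answer
```

Passing the retriever, reranker, and generator in as callables keeps the sketch independent of any particular library.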
## Key Components
### Document Processing
**DocumentProcessor** - Main coordinator for document uploads
- Validates file type and size
- Calls the right loader for PDF, DOCX, or TXT files
- Manages the entire processing pipeline
**Embedder** - Converts text to vectors
- Uses FastEmbed with BAAI/bge-small-en-v1.5 model
- Generates 384-dimensional vectors for semantic search
- Each chunk becomes a searchable vector
**Qdrant Vector Store** - Stores embeddings
- Fast similarity search across millions of vectors
- Returns most relevant chunks for any query
- Handles all vector operations
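To show what "similarity search" means for these vectors, here is a small pure-Python sketch. It assumes cosine similarity as the metric (the project's configured distance is not stated here) and uses toy 3-dimensional vectors in place of 384-dimensional embeddings. `cosine_similarity` and `most_similar` are hypothetical helpers; Qdrant performs the equivalent search at scale.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: list[float], chunks: dict[str, list[float]]) -> str:
    """Return the id of the chunk whose vector is closest to the query."""
    return max(chunks, key=lambda cid: cosine_similarity(query, chunks[cid]))


# Toy vectors standing in for real embeddings.
chunk_vectors = {"chunk-a": [1.0, 0.0, 0.0], "chunk-b": [0.0, 1.0, 0.0]}
print(most_similar([0.9, 0.1, 0.0], chunk_vectors))
```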
### Question Answering
**HybridRetriever** - Finds relevant information
- **BM25**: Traditional keyword search (good for exact matches)
- **Vector Search**: Semantic search (understands meaning)
- Combines both for better results
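One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below assumes RRF purely for illustration; this README does not say which fusion method HybridRetriever uses, and `reciprocal_rank_fusion` is a hypothetical helper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of chunk ids; each list adds 1 / (k + rank) per id."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


bm25_hits = ["a", "b", "c"]    # keyword ranking (toy data)
vector_hits = ["b", "c", "d"]  # semantic ranking (toy data)
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
```

Chunks that appear in both lists rise to the top, which is the point of combining the two searches.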
**Reranker** - Improves search quality
- Uses FlashRank model to score relevance
- Selects the 5 best chunks from 20 candidates
- Ensures only the most relevant context is used
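The "top 5 of 20" selection can be sketched as follows. The scoring function here is a simple word-overlap stand-in, not FlashRank, and both `rerank` and `overlap_score` are hypothetical names.

```python
import heapq
from typing import Callable


def overlap_score(query: str, chunk: str) -> float:
    """Toy relevance score: fraction of query words that appear in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    return len(query_words & chunk_words) / (len(query_words) or 1)


def rerank(
    query: str,
    candidates: list[str],
    score: Callable[[str, str], float] = overlap_score,
    top_k: int = 5,
) -> list[str]:
    """Return the `top_k` candidates with the highest score, best first."""
    return heapq.nlargest(top_k, candidates, key=lambda c: score(query, c))
```

Swapping `overlap_score` for a model-based scorer changes the quality of the ranking but not the selection logic.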
**Generator** - Creates answers
- Uses Groq LLM (llama-3.1-70b)
- Streams responses in real-time
- Bases answers on retrieved context when RAG is ON
- Uses general knowledge when RAG is OFF
**Semantic Cache** - Speeds up responses
- Remembers previous questions and answers
- Returns the cached response if the same question is asked again
- Separate caches for RAG ON vs RAG OFF
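Keeping RAG ON and RAG OFF answers apart comes down to including the mode in the cache key. The sketch below is an assumption about how such a key could be built: `cache_key` is a hypothetical helper, and it only matches questions that are identical after lowercasing and whitespace cleanup, not semantically similar ones.

```python
import hashlib


def cache_key(question: str, rag_on: bool) -> str:
    """Build a cache key that differs between RAG ON and RAG OFF."""
    normalized = " ".join(question.lower().split())  # collapse case and spacing
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    mode = "rag_on" if rag_on else "rag_off"
    return f"cache:{mode}:{digest}"
```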
### Memory & Storage
**Conversation Memory** - Remembers chat history
- Stores last 10 messages in Redis
- Enables follow-up questions
- Each session has independent history
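The "last 10 messages per session" behaviour can be sketched in memory with a bounded deque. This stands in for the Redis-backed store and is not the project's code; `ConversationMemory` as written here is a hypothetical class. With Redis, the same trimming is commonly done with `RPUSH` followed by `LTRIM`.

```python
from collections import defaultdict, deque


class ConversationMemory:
    """Keep only the most recent messages for each session."""

    def __init__(self, max_messages: int = 10) -> None:
        # Each session gets its own deque; old messages fall off automatically.
        self._sessions: dict[str, deque] = defaultdict(
            lambda: deque(maxlen=max_messages)
        )

    def add(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append({"role": role, "content": content})

    def history(self, session_id: str) -> list[dict]:
        return list(self._sessions[session_id])
```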
**MongoDB** - Document metadata
- Tracks uploaded documents
- Stores file info, upload time, chunk count
- Links to vectors in Qdrant
**Redis** - Fast caching
- Stores conversation history
- Caches LLM responses
- In-memory for low-latency access
## Technology Stack
- **LangChain 0.3.13** - RAG framework
- **Groq API** - Fast LLM (llama-3.1-70b)
- **FastEmbed** - Embedding generation
- **FlashRank** - Result reranking
- **Qdrant** - Vector database
- **MongoDB** - Document storage
- **Redis** - Caching layer
- **FastAPI** - Web framework
## Quick Start
### Installation
```bash
# Clone and install
git clone https://github.com/Abeshith/RAG.git
cd RAG
pip install -r requirements.txt
```
### Configuration
Create a `.env` file:
```env
GROQ_API_KEY=your_groq_key
MONGODB_URI=your_mongodb_uri
REDIS_URL=your_redis_url
QDRANT_URL=your_qdrant_url
QDRANT_API_KEY=your_qdrant_key
JWT_SECRET_KEY=your_secret_key
```
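Before starting the app, it can help to confirm that these variables are set. The check below is a sketch: `missing_settings` is a hypothetical helper, and treating all six variables as required is an assumption.

```python
import os
from typing import Mapping, Optional

# Names taken from the .env example; all are assumed to be required.
REQUIRED_VARS = (
    "GROQ_API_KEY",
    "MONGODB_URI",
    "REDIS_URL",
    "QDRANT_URL",
    "QDRANT_API_KEY",
    "JWT_SECRET_KEY",
)


def missing_settings(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


print("Missing settings:", missing_settings())
```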
### Run
```bash
uvicorn app.main:app --host 0.0.0.0 --port 7860
```
Open: http://localhost:7860
## Usage
1. **Upload Documents**: Click upload and select a PDF, DOCX, or TXT file
2. **Ask Questions**: Type your question in the chat box
3. **Toggle RAG**:
   - ON = answers from your documents
   - OFF = general knowledge answers
4. **View Sources**: See which document chunks were used
## API Endpoints
```
GET /health/ - Check system status
POST /chat/stream - Send question, get streaming answer
POST /documents/upload - Upload new document
GET /documents/ - List all documents
GET /documents/stats - Get document statistics
DELETE /documents/{id} - Delete specific document
```
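As a quick smoke test, the health endpoint can be called from Python with the standard library. This sketch assumes the app is reachable at `http://localhost:7860` and only reads the HTTP status code, since the response body format is not documented here. `health_request` and `fetch_health` are hypothetical helpers.

```python
import urllib.request


def health_request(base_url: str = "http://localhost:7860") -> urllib.request.Request:
    """Build a GET request for the /health/ endpoint."""
    url = base_url.rstrip("/") + "/health/"
    return urllib.request.Request(url, method="GET")


def fetch_health(base_url: str = "http://localhost:7860", timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(health_request(base_url), timeout=timeout) as response:
        return response.status
```

With the server running, `fetch_health()` should return the status code of the health check; a connection error means the app is not listening on that port.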
## Docker Deployment
```bash
docker build -t rag-chatbot .
docker run -p 7860:7860 --env-file .env rag-chatbot
```