---
title: RAG Chatbot
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---
# RAG Chatbot with Advanced Retrieval
A question-answering system that lets you upload documents and ask questions about them. The system retrieves the relevant passages from your documents and generates answers grounded in that content.
## How It Works
### When You Upload a Document
```
1. Upload File (PDF/DOCX/TXT)
↓
2. Extract Text
↓
3. Split into Chunks (512 tokens each)
↓
4. Convert to Embeddings (384D vectors)
↓
5. Store in Vector Database (Qdrant)
↓
6. Save Metadata in MongoDB
```
**What happens:** Your document is split into small chunks. Each chunk is converted into a numerical vector that captures its meaning, and the vectors are stored in a database for fast searching.
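
The chunking step can be sketched as follows. This is a minimal illustration, not the project's code: it approximates tokens with whitespace-separated words, and the `chunk_text` function and its `overlap` parameter are hypothetical. The real pipeline presumably counts tokens with a model tokenizer.

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most `chunk_size` whitespace-separated tokens.

    Assumption: words stand in for tokens; `overlap` repeats that many
    tokens from the end of one chunk at the start of the next.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    tokens = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

Each returned chunk would then be passed to the embedder and stored alongside its metadata.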
### When You Ask a Question
```
1. Type Your Question
↓
2. Check Cache (answered before?)
↓
3. Search Documents (if RAG is ON)
   - BM25: Find keyword matches
   - Vector: Find similar meanings
↓
4. Rerank Results (pick top 5 most relevant)
↓
5. Build Context from Chunks
↓
6. Generate Answer with LLM
↓
7. Stream Response to You
```
**What happens:** The system searches for relevant chunks from your documents, combines them as context, and uses an AI model to generate an answer based on that context.
## Key Components
### Document Processing
**DocumentProcessor** - Main coordinator for document uploads
- Validates file type and size
- Calls the right loader for PDF, DOCX, or TXT files
- Manages the entire processing pipeline
**Embedder** - Converts text to vectors
- Uses FastEmbed with BAAI/bge-small-en-v1.5 model
- Generates 384-dimensional vectors for semantic search
- Each chunk becomes a searchable vector
**Qdrant Vector Store** - Stores embeddings
- Fast similarity search across millions of vectors
- Returns most relevant chunks for any query
- Handles all vector operations
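
To make "similarity search" concrete, here is a brute-force sketch in plain Python. It assumes cosine similarity as the distance metric (the README does not say which metric the collection uses), and `cosine_similarity` and `top_k` are hypothetical helper names. A vector database such as Qdrant answers the same question with an index rather than a full scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 5):
    """Return the k (chunk_id, score) pairs most similar to the query vector."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In the real system the vectors would be the 384-dimensional embeddings produced by the Embedder.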
### Question Answering
**HybridRetriever** - Finds relevant information
- **BM25**: Traditional keyword search (good for exact matches)
- **Vector Search**: Semantic search (understands meaning)
- Combines both for better results
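
The README does not say how the two result lists are combined. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the project's actual method. The function name and the constant `k = 60` are illustrative choices.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk IDs into one ranking.

    Each chunk earns 1 / (k + rank) from every list it appears in,
    so chunks ranked highly by both retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))
```

RRF needs only ranks, not raw scores, which sidesteps the problem that BM25 scores and vector similarities are on different scales.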
**Reranker** - Improves search quality
- Uses FlashRank model to score relevance
- Selects the 5 best chunks from 20 candidates
- Ensures only the most relevant context is used
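
The "score, sort, keep the top 5" step looks roughly like this. The sketch uses a toy word-overlap scorer as a stand-in for the FlashRank model, so `overlap_score` and `rerank` are hypothetical names and the scores are not what a real reranker would produce.

```python
from typing import Callable


def overlap_score(query: str, passage: str) -> float:
    """Toy relevance score: fraction of query words that appear in the passage."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    passage_words = set(passage.lower().split())
    return len(query_words & passage_words) / len(query_words)


def rerank(
    query: str,
    candidates: list[str],
    top_n: int = 5,
    score_fn: Callable[[str, str], float] = overlap_score,
) -> list[str]:
    """Score every candidate against the query and keep the top_n best."""
    ranked = sorted(candidates, key=lambda passage: score_fn(query, passage), reverse=True)
    return ranked[:top_n]
```

Swapping `score_fn` for a call into a trained reranking model keeps the surrounding logic unchanged.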
**Generator** - Creates answers
- Uses Groq LLM (llama-3.1-70b)
- Streams responses in real-time
- Bases answers on retrieved context when RAG is ON
- Uses general knowledge when RAG is OFF
**Semantic Cache** - Speeds up responses
- Remembers previous questions and answers
- Returns the cached response when the same question is asked again
- Separate caches for RAG ON vs RAG OFF
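
Keeping RAG ON and RAG OFF answers apart can be done by putting the mode into the cache key. The sketch below is an assumption about how that might look: it matches questions exactly after normalizing case and whitespace, uses a dictionary in place of Redis, and does not attempt embedding-based matching of paraphrases. `cache_key` and `ResponseCache` are hypothetical names.

```python
import hashlib
from typing import Optional


def cache_key(question: str, rag_enabled: bool) -> str:
    """Build a key that differs between RAG ON and RAG OFF for the same question."""
    normalized = " ".join(question.lower().split())
    mode = "rag_on" if rag_enabled else "rag_off"
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"cache:{mode}:{digest}"


class ResponseCache:
    """In-memory stand-in for a Redis-backed response cache."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def get(self, question: str, rag_enabled: bool) -> Optional[str]:
        return self._store.get(cache_key(question, rag_enabled))

    def set(self, question: str, rag_enabled: bool, answer: str) -> None:
        self._store[cache_key(question, rag_enabled)] = answer
```

Because the mode is part of the key, an answer generated from general knowledge is never served for a document-grounded request, and vice versa.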
### Memory & Storage
**Conversation Memory** - Remembers chat history
- Stores last 10 messages in Redis
- Enables follow-up questions
- Each session has independent history
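
A bounded, per-session history behaves like the sketch below. It keeps everything in process memory with `collections.deque`, whereas the project stores history in Redis, where a list trimmed after each push would give the same effect. The class and method names are hypothetical.

```python
from collections import defaultdict, deque


class ConversationMemory:
    """Keep only the most recent messages for each session."""

    def __init__(self, max_messages: int = 10) -> None:
        # A deque with maxlen silently drops the oldest item when full.
        self._sessions = defaultdict(lambda: deque(maxlen=max_messages))

    def add(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append({"role": role, "content": content})

    def history(self, session_id: str) -> list[dict]:
        # Use .get so that reading an unknown session does not create it.
        return list(self._sessions.get(session_id, ()))
```

The returned history can be prepended to the prompt so the model can resolve follow-up questions such as "and what about the second one?".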
**MongoDB** - Document metadata
- Tracks uploaded documents
- Stores file info, upload time, chunk count
- Links to vectors in Qdrant
**Redis** - Fast caching
- Stores conversation history
- Caches LLM responses
- In-memory for instant access
## Technology Stack
- **LangChain 0.3.13** - RAG framework
- **Groq API** - Fast LLM (llama-3.1-70b)
- **FastEmbed** - Embedding generation
- **FlashRank** - Result reranking
- **Qdrant** - Vector database
- **MongoDB** - Document storage
- **Redis** - Caching layer
- **FastAPI** - Web framework
## Quick Start
### Installation
```bash
# Clone and install
git clone https://github.com/Abeshith/RAG.git
cd RAG
pip install -r requirements.txt
```
### Configuration
Create a `.env` file:
```env
GROQ_API_KEY=your_groq_key
MONGODB_URI=your_mongodb_uri
REDIS_URL=your_redis_url
QDRANT_URL=your_qdrant_url
QDRANT_API_KEY=your_qdrant_key
JWT_SECRET_KEY=your_secret_key
```
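A startup check for these variables might look like the following sketch. It assumes all six are required and that they are already present in the process environment (for example, exported by the shell or passed with `--env-file`). `load_settings` is a hypothetical helper, not part of the project.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = (
    "GROQ_API_KEY",
    "MONGODB_URI",
    "REDIS_URL",
    "QDRANT_URL",
    "QDRANT_API_KEY",
    "JWT_SECRET_KEY",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the required settings, or fail fast listing every missing one."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing at startup with the full list of missing names is usually easier to debug than a connection error on the first request.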
### Run
```bash
uvicorn app.main:app --host 0.0.0.0 --port 7860
```
Open: http://localhost:7860
## Usage
1. **Upload Documents**: Click upload and select a PDF, DOCX, or TXT file
2. **Ask Questions**: Type your question in the chat box
3. **Toggle RAG**:
   - ON = answers from your documents
   - OFF = general knowledge answers
4. **View Sources**: See which document chunks were used
## API Endpoints
```
GET    /health/            - Check system status
POST   /chat/stream        - Send question, get streaming answer
POST   /documents/upload   - Upload new document
GET    /documents/         - List all documents
GET    /documents/stats    - Get document statistics
DELETE /documents/{id}     - Delete specific document
```
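As a quick smoke test, the health endpoint can be called from Python's standard library. This sketch assumes the server is running locally on port 7860 as in the Run section. The response body format is not documented here, so it is returned as raw text, and `build_url` and `check_health` are hypothetical helpers.

```python
import urllib.request


def build_url(base_url: str, path: str) -> str:
    """Join a base URL and a path without doubling or dropping slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def check_health(base_url: str = "http://localhost:7860", timeout: float = 5.0):
    """Call GET /health/ and return (HTTP status code, raw response body)."""
    url = build_url(base_url, "/health/")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
        return response.status, body
```

With the server up, `check_health()` should return a 200 status if the service is healthy. The request bodies for the `POST` endpoints are not specified in this README, so they are left out of the sketch.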
## Docker Deployment
```bash
docker build -t rag-chatbot .
docker run -p 7860:7860 --env-file .env rag-chatbot
```