---
title: RAG Chatbot
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---

# RAG Chatbot with Advanced Retrieval

A question-answering system that lets you upload documents and ask questions about them. The system retrieves relevant passages from your documents and uses them as context when generating answers.

## How It Works

### When You Upload a Document

```
1. Upload File (PDF/DOCX/TXT)
        ↓
2. Extract Text
        ↓
3. Split into Chunks (512 tokens each)
        ↓
4. Convert to Embeddings (384D vectors)
        ↓
5. Store in Vector Database (Qdrant)
        ↓
6. Save Metadata in MongoDB
```

**What happens:** Your document is split into small chunks. Each chunk is converted into a numerical vector that captures its meaning and is stored in Qdrant for fast similarity search, while the document's metadata is saved in MongoDB.
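
The chunking step can be illustrated with a minimal sketch. It assumes whitespace-separated words stand in for real tokenizer tokens and that chunks do not overlap; the function name `chunk_text` is hypothetical and not taken from this repository.

```python
def chunk_text(text: str, chunk_size: int = 512) -> list[str]:
    """Split text into chunks of at most `chunk_size` tokens.

    Assumption: tokens are approximated by whitespace-separated words.
    A real pipeline would count tokens with the model's tokenizer.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    tokens = text.split()
    # Step through the token list in fixed-size windows.
    return [
        " ".join(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), chunk_size)
    ]
```

Each returned chunk would then be embedded and stored as one vector.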

### When You Ask a Question

```
1. Type Your Question
        ↓
2. Check Cache (answered before?)
        ↓
3. Search Documents (if RAG is ON)
   - BM25: Find keyword matches
   - Vector: Find similar meanings
        ↓
4. Rerank Results (pick top 5 most relevant)
        ↓
5. Build Context from Chunks
        ↓
6. Generate Answer with LLM
        ↓
7. Stream Response to You
```

**What happens:** The system searches for relevant chunks from your documents, combines them as context, and uses an AI model to generate an answer based on that context.

## Key Components

### Document Processing

**DocumentProcessor** - Main coordinator for document uploads
- Validates file type and size
- Calls the right loader for PDF, DOCX, or TXT files
- Manages the entire processing pipeline

**Embedder** - Converts text to vectors
- Uses FastEmbed with BAAI/bge-small-en-v1.5 model
- Generates 384-dimensional vectors for semantic search
- Each chunk becomes a searchable vector

**Qdrant Vector Store** - Stores embeddings
- Supports fast similarity search over large vector collections
- Returns most relevant chunks for any query
- Handles all vector operations

### Question Answering

**HybridRetriever** - Finds relevant information
- **BM25**: Traditional keyword search (good for exact matches)
- **Vector Search**: Semantic search (understands meaning)
- Combines both for better results
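
One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below is an assumption for illustration only: this README does not state which fusion method the HybridRetriever uses, and the function name and the constant `k = 60` are hypothetical choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk IDs into one ranking.

    Each chunk earns 1 / (k + rank) from every list it appears in,
    so chunks ranked highly by both retrievers rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))
```

For example, fusing a BM25 ranking `["a", "b", "c"]` with a vector ranking `["b", "d", "a"]` places `b` first, because both retrievers rank it near the top.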

**Reranker** - Improves search quality
- Uses FlashRank model to score relevance
- Filters the best 5 chunks from 20 candidates
- Passes only the highest-scoring chunks to the generator as context
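
The "best 5 of 20" selection can be sketched as a score-and-sort step. This is a minimal illustration, not the FlashRank API: `rerank` and `overlap_score` are hypothetical names, and the word-overlap scorer is a stand-in for a real reranking model.

```python
from typing import Callable


def overlap_score(query: str, passage: str) -> float:
    """Stand-in relevance score: fraction of query words found in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)


def rerank(
    query: str,
    candidates: list[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_k: int = 5,
) -> list[str]:
    """Score every candidate against the query and keep the top_k best."""
    ranked = sorted(candidates, key=lambda c: score_fn(query, c), reverse=True)
    return ranked[:top_k]
```

Swapping `score_fn` for a call into an actual reranking model would keep the rest of the flow unchanged.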

**Generator** - Creates answers
- Uses Groq LLM (llama-3.1-70b)
- Streams responses in real-time
- Bases answers on retrieved context when RAG is ON
- Uses general knowledge when RAG is OFF

**Semantic Cache** - Speeds up responses
- Remembers previous questions and answers
- Returns the cached response when the same question is asked again
- Separate caches for RAG ON vs RAG OFF

### Memory & Storage

**Conversation Memory** - Remembers chat history
- Stores last 10 messages in Redis
- Enables follow-up questions
- Each session has independent history
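
The "last 10 messages per session" behavior can be sketched with a bounded queue. This is an in-memory illustration only: the class and method names are hypothetical, and the actual service stores history in Redis rather than in process memory.

```python
from collections import defaultdict, deque


class ConversationMemory:
    """Keeps the most recent messages for each session independently."""

    def __init__(self, max_messages: int = 10) -> None:
        # A deque with maxlen silently drops the oldest item when full.
        self._sessions: dict[str, deque] = defaultdict(
            lambda: deque(maxlen=max_messages)
        )

    def add(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append({"role": role, "content": content})

    def history(self, session_id: str) -> list[dict]:
        """Return the stored messages, oldest first, without creating a session."""
        return list(self._sessions.get(session_id, ()))
```

The returned history can be prepended to the prompt so the model can resolve follow-up questions.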

**MongoDB** - Document metadata
- Tracks uploaded documents
- Stores file info, upload time, chunk count
- Links to vectors in Qdrant

**Redis** - Fast caching
- Stores conversation history
- Caches LLM responses
- In-memory for instant access

## Technology Stack

- **LangChain 0.3.13** - RAG framework
- **Groq API** - Fast LLM (llama-3.1-70b)
- **FastEmbed** - Embedding generation
- **FlashRank** - Result reranking
- **Qdrant** - Vector database
- **MongoDB** - Document storage
- **Redis** - Caching layer
- **FastAPI** - Web framework

## Quick Start

### Installation

```bash
# Clone and install
git clone https://github.com/Abeshith/RAG.git
cd RAG
pip install -r requirements.txt
```

### Configuration

Create a `.env` file:

```env
GROQ_API_KEY=your_groq_key
MONGODB_URI=your_mongodb_uri
REDIS_URL=your_redis_url
QDRANT_URL=your_qdrant_url
QDRANT_API_KEY=your_qdrant_key
JWT_SECRET_KEY=your_secret_key
```

### Run

```bash
uvicorn app.main:app --host 0.0.0.0 --port 7860
```

Open: http://localhost:7860

## Usage

1. **Upload Documents**: Click upload and select a PDF, DOCX, or TXT file
2. **Ask Questions**: Type your question in the chat box
3. **Toggle RAG**:
   - ON = answers from your documents
   - OFF = general knowledge answers
4. **View Sources**: See which document chunks were used

## API Endpoints

```
GET    /health/           - Check system status
POST   /chat/stream       - Send question, get streaming answer
POST   /documents/upload  - Upload new document
GET    /documents/        - List all documents
GET    /documents/stats   - Get document statistics
DELETE /documents/{id}    - Delete specific document
```
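
As a quick check that the server is reachable, the health endpoint can be called from Python with the standard library. This is a minimal sketch: the helper names are hypothetical, the base URL assumes the local run command above, and the response body format is not documented here, so only the HTTP status is returned.

```python
import urllib.request
from urllib.parse import urljoin


def endpoint_url(base_url: str, path: str) -> str:
    """Join the server base URL and an endpoint path with exactly one slash."""
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))


def check_health(base_url: str = "http://localhost:7860", timeout: float = 5.0) -> int:
    """Call GET /health/ and return the HTTP status code.

    urlopen raises an exception for connection failures and error statuses,
    so a returned value means the request succeeded.
    """
    with urllib.request.urlopen(endpoint_url(base_url, "/health/"), timeout=timeout) as resp:
        return resp.status
```

With the server running, `check_health()` should return without raising; the other endpoints need request bodies whose schema is defined by the application.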

## Docker Deployment

```bash
docker build -t rag-chatbot .
docker run -p 7860:7860 --env-file .env rag-chatbot
```