Abeshith committed
Commit 7c3a93a · 1 Parent(s): b66add6

Simplify README with clear flow and user-friendly explanations

Files changed (1)
  1. README.md +129 -164
README.md CHANGED
@@ -10,210 +10,175 @@ pinned: false
10
 
11
  # RAG Chatbot with Advanced Retrieval
12
 
13
- Enterprise-grade Retrieval-Augmented Generation (RAG) chatbot built with LangChain, FastAPI, and modern AI technologies.
14
 
15
- ## 🚀 Features
16
 
17
- - **Hybrid Retrieval**: Combines BM25 and vector search for optimal document retrieval
18
- - **Reranking**: FlashRank reranker for improved result quality
19
- - **Streaming Responses**: Real-time chat with Server-Sent Events (SSE)
20
- - **Conversation Memory**: Redis-backed chat history
21
- - **Smart Caching**: Semantic caching with RAG/non-RAG distinction
22
- - **Document Processing**: Support for PDF, DOCX, and TXT files
23
- - **Background Processing**: Celery workers for async document processing
24
- - **Real-time Updates**: MongoDB change streams for live notifications
25
- - **Vector Database**: Qdrant for scalable vector storage
26
-
27
- ## 🏗️ Architecture
28
 
29
  ```
30
- ├── app/ # Main application
31
- │ ├── api/ # FastAPI routes and middleware
32
- │ ├── core/ # RAG components (retriever, reranker, generator)
33
- │ ├── db/ # Database clients (MongoDB, Redis, Qdrant)
34
- │ ├── models/ # Pydantic schemas
35
- │ ├── services/ # MongoDB watcher
36
- │ ├── tasks/ # Celery background tasks
37
- │ └── utils/ # Utilities (logger, errors, prompts)
38
- ├── ingestion/ # Document processing pipeline
39
- ├── frontend/ # Web interface (HTML/CSS/JS)
40
- ├── config/ # YAML configurations
41
- ├── tests/ # Test suite
42
- └── prompts/ # LLM prompt templates
43
  ```
44
 
45
- ## 📦 Tech Stack
46
-
47
- - **Framework**: FastAPI + Uvicorn
48
- - **LLM**: Groq API (llama-3.1-70b)
49
- - **Embeddings**: FastEmbed (BAAI/bge-small-en-v1.5)
50
- - **Vector Store**: Qdrant Cloud
51
- - **Databases**: MongoDB Atlas, Redis Cloud
52
- - **Reranking**: FlashRank (ms-marco-MiniLM-L-12-v2)
53
- - **Background Jobs**: Celery
54
- - **LangChain**: Version 0.3.13 with LangGraph 0.2.58
55
-
56
- ## 🛠️ Installation
57
 
58
- ### Local Setup
59
 
60
- 1. **Clone the repository**
61
- ```bash
62
- git clone https://github.com/YOUR_USERNAME/rag-chatbot.git
63
- cd rag-chatbot
64
  ```
65
-
66
- 2. **Create virtual environment**
67
- ```bash
68
- python -m venv venv
69
- source venv/bin/activate # On Windows: venv\Scripts\activate
70
  ```
71
 
72
- 3. **Install dependencies**
73
- ```bash
74
- pip install -r requirements.txt
75
- ```
76
 
77
- 4. **Configure environment**
78
- Create a `.env` file in the root directory:
79
- ```env
80
- GROQ_API_KEY=your_groq_api_key
81
- QDRANT_API_KEY=your_qdrant_api_key
82
- REDIS_PASSWORD=your_redis_password
83
- ```
84
 
85
- 5. **Update configuration**
86
- Edit `config/database.yaml` with your MongoDB, Redis, and Qdrant URLs.
87
 
88
- 6. **Run the application**
89
- ```bash
90
- uvicorn app.main:app --host 0.0.0.0 --port 7860
91
- ```
92
 
93
- Visit `http://localhost:7860` to access the chat interface.
94
 
95
- ### Docker Setup
96
 
97
- 1. **Build and run with Docker Compose**
98
- ```bash
99
- docker-compose up -d
100
- ```
101
 
102
- 2. **View logs**
103
- ```bash
104
- docker-compose logs -f app
105
- ```
106
 
107
- 3. **Stop services**
108
- ```bash
109
- docker-compose down
110
- ```
111
 
112
- ## 🧪 Testing
113
 
114
- Run the test suite:
115
- ```bash
116
- pytest tests/ -v
117
- ```
118
 
119
- Run specific test categories:
120
- ```bash
121
- # Unit tests only
122
- pytest tests/ -m unit
123
 
124
- # Integration tests only
125
- pytest tests/ -m integration
 
 
126
 
127
- # Skip slow tests
128
- pytest tests/ -m "not slow"
129
- ```
 
130
 
131
- ## 🚀 Deployment
132
 
133
- ### Hugging Face Spaces
134
 
135
- 1. **Create a new Space** on [Hugging Face](https://huggingface.co/spaces)
136
- 2. **Select Docker SDK** as the space type
137
- 3. **Add secrets** in Space settings:
138
- - `GROQ_API_KEY`
139
- - `QDRANT_API_KEY`
140
- - `REDIS_PASSWORD`
141
- 4. **Push code** to the Space repository
142
- 5. **Automatic deployment** via GitHub Actions (see `.github/workflows/deploy.yml`)
143
 
144
- ### Manual Deployment
145
 
146
- ```bash
147
- # Build Docker image
148
- docker build -t rag-chatbot .
149
 
150
- # Run container
151
- docker run -p 7860:7860 \
152
- -e GROQ_API_KEY=your_key \
153
- -e QDRANT_API_KEY=your_key \
154
- -e REDIS_PASSWORD=your_password \
155
- rag-chatbot
156
  ```
157
 
158
- ## 📚 Usage
159
-
160
- ### Document Upload
161
-
162
- 1. Click "Upload Document" in the sidebar
163
- 2. Select a PDF, DOCX, or TXT file
164
- 3. Wait for processing (documents are chunked and embedded)
165
- 4. Document appears in the sidebar
166
-
167
- ### Chat
168
-
169
- 1. Toggle RAG on/off using the switch
170
- 2. Type your question in the input field
171
- 3. Press Enter or click Send
172
- 4. Receive streaming responses in real-time
173
-
174
- ### RAG vs Non-RAG
175
-
176
- - **RAG ON**: Answers based on your uploaded documents
177
- - **RAG OFF**: Answers from LLM's general knowledge
178
-
179
- ## 🔧 Configuration
180
-
181
- All configuration is in `config/*.yaml` files:
182
 
183
- - `app.yaml` - Server and upload settings
184
- - `database.yaml` - Database connections
185
- - `models.yaml` - LLM, embedding, reranker configs
186
- - `rag.yaml` - Retrieval and chunking parameters
187
- - `security.yaml` - CORS, rate limiting, JWT
188
- - `celery.yaml` - Background worker settings
189
- - `langchain.yaml` - LangChain tracing
190
 
191
- ## 🤝 Contributing
192
-
193
- Contributions are welcome! Please:
194
 
195
- 1. Fork the repository
196
- 2. Create a feature branch (`git checkout -b feature/amazing-feature`)
197
- 3. Commit your changes (`git commit -m 'Add amazing feature'`)
198
- 4. Push to the branch (`git push origin feature/amazing-feature`)
199
- 5. Open a Pull Request
200
 
201
- ## 📄 License
 
 
202
 
203
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
204
 
205
- ## 🙏 Acknowledgments
206
 
207
- - LangChain for the RAG framework
208
- - Groq for fast LLM inference
209
- - Qdrant for vector storage
210
- - FlashRank for efficient reranking
211
- - FastEmbed for lightweight embeddings
 
212
 
213
- ## 📧 Contact
214
 
215
- For questions or support, please open an issue on GitHub.
216
 
217
- ---
218
 
219
- **Built with ❤️ using LangChain, FastAPI, and modern AI technologies**
 
 
 
 
10
 
11
  # RAG Chatbot with Advanced Retrieval
12
 
13
+ A question-answering system that lets you upload documents and ask questions about them. The system retrieves relevant passages from your documents and generates answers grounded in them.
14
 
15
+ ## How It Works
16
 
17
+ ### When You Upload a Document
18
 
19
  ```
20
+ 1. Upload File (PDF/DOCX/TXT)
21
+
22
+ 2. Extract Text
23
+
24
+ 3. Split into Chunks (512 tokens each)
25
+
26
+ 4. Convert to Embeddings (384D vectors)
27
+
28
+ 5. Store in Vector Database (Qdrant)
29
+
30
+ 6. Save Metadata in MongoDB
 
 
31
  ```
32
 
33
+ **What happens:** Your document is broken into small chunks, each chunk is converted into a numerical vector that captures its meaning, and stored in a database for fast searching.
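
The chunking step in the flow above can be sketched as follows. This is an illustrative sketch, not code from the repository: `split_into_chunks` is a hypothetical helper, and whitespace-separated words stand in for model tokens, since the README does not say which tokenizer or overlap the pipeline uses.

```python
def split_into_chunks(text: str, chunk_size: int = 512) -> list[str]:
    """Split text into chunks of at most chunk_size whitespace-separated tokens.

    Assumption: words approximate tokens; a real pipeline would likely
    count tokens with the embedding model's tokenizer.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    tokens = text.split()
    # Step through the token list in fixed-size windows with no overlap.
    return [
        " ".join(tokens[start:start + chunk_size])
        for start in range(0, len(tokens), chunk_size)
    ]
```

Each returned chunk would then be embedded and stored; a short document yields a single chunk, and empty input yields no chunks.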
34
 
35
+ ### When You Ask a Question
36
 
37
  ```
38
+ 1. Type Your Question
39
+
40
+ 2. Check Cache (answered before?)
41
+
42
+ 3. Search Documents (if RAG is ON)
43
+ - BM25: Find keyword matches
44
+ - Vector: Find similar meanings
45
+
46
+ 4. Rerank Results (pick top 5 most relevant)
47
+
48
+ 5. Build Context from Chunks
49
+
50
+ 6. Generate Answer with LLM
51
+
52
+ 7. Stream Response to You
53
  ```
54
 
55
+ **What happens:** The system searches for relevant chunks from your documents, combines them as context, and uses an AI model to generate an answer based on that context.
56
 
57
+ ## Key Components
58
 
59
+ ### Document Processing
 
60
 
61
+ **DocumentProcessor** - Main coordinator for document uploads
62
+ - Validates file type and size
63
+ - Calls the right loader for PDF, DOCX, or TXT files
64
+ - Manages the entire processing pipeline
65
 
66
+ **Embedder** - Converts text to vectors
67
+ - Uses FastEmbed with BAAI/bge-small-en-v1.5 model
68
+ - Generates 384-dimensional vectors for semantic search
69
+ - Each chunk becomes a searchable vector
70
 
71
+ **Qdrant Vector Store** - Stores embeddings
72
+ - Fast similarity search across millions of vectors
73
+ - Returns most relevant chunks for any query
74
+ - Handles all vector operations
75
 
76
+ ### Question Answering
77
 
78
+ **HybridRetriever** - Finds relevant information
79
+ - **BM25**: Traditional keyword search (good for exact matches)
80
+ - **Vector Search**: Semantic search (understands meaning)
81
+ - Combines both for better results
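
The README does not say how the BM25 and vector results are combined. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the repository's method; `reciprocal_rank_fusion` and the constant `k = 60` are illustrative choices.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so
    documents ranked highly by both retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]      # hypothetical keyword-search ranking
vector_hits = ["b", "d", "a"]    # hypothetical semantic-search ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

RRF only needs ranks, so it avoids having to normalize BM25 scores against cosine similarities.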
82
 
83
+ **Reranker** - Improves search quality
84
+ - Uses FlashRank model to score relevance
85
+ - Filters the best 5 chunks from 20 candidates
86
+ - Ensures only the most relevant context is used
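
The "best 5 from 20" selection can be illustrated with a minimal sketch. It assumes each candidate already carries a relevance score (in the project these would presumably come from FlashRank); `select_top_k` and the sample scores are hypothetical.

```python
import heapq


def select_top_k(scored_chunks: list[tuple[str, float]], k: int = 5) -> list[str]:
    """Return the IDs of the k highest-scoring chunks, best first."""
    best = heapq.nlargest(k, scored_chunks, key=lambda item: item[1])
    return [chunk_id for chunk_id, _score in best]


# 20 hypothetical candidates with made-up relevance scores.
candidates = [(f"chunk-{i}", i / 20) for i in range(20)]
top_chunks = select_top_k(candidates, k=5)
```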
87
 
88
+ **Generator** - Creates answers
89
+ - Uses Groq LLM (llama-3.1-70b)
90
+ - Streams responses in real-time
91
+ - Bases answers on retrieved context when RAG is ON
92
+ - Uses general knowledge when RAG is OFF
93
 
94
+ **Semantic Cache** - Speeds up responses
95
+ - Remembers previous questions and answers
96
+ - Returns the cached response if the same question is asked again
97
+ - Separate caches for RAG ON vs RAG OFF
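
The separation between RAG ON and RAG OFF caches can be sketched with a namespaced key. This is an assumption about one possible design, not the repository's code: `cache_key` is a hypothetical helper, and it matches on normalized text only, whereas a semantic cache would compare question embeddings.

```python
import hashlib


def cache_key(question: str, rag_enabled: bool) -> str:
    """Build a cache key that keeps RAG and non-RAG answers apart."""
    # Normalize case and whitespace so trivially different phrasings collide.
    normalized = " ".join(question.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    mode = "rag" if rag_enabled else "norag"
    return f"cache:{mode}:{digest}"
```

Because the mode is part of the key, an answer generated from general knowledge is never returned for a question asked with RAG turned on.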
98
 
99
+ ### Memory & Storage
100
 
101
+ **Conversation Memory** - Remembers chat history
102
+ - Stores last 10 messages in Redis
103
+ - Enables follow-up questions
104
+ - Each session has independent history
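
The "last 10 messages per session" behaviour can be sketched with an in-memory stand-in for Redis. The `ConversationMemory` class below is hypothetical and only illustrates the bounded, per-session history; the project itself stores history in Redis, where a trimmed list would give a similar effect.

```python
from collections import defaultdict, deque

MAX_MESSAGES = 10  # history length stated in the README


class ConversationMemory:
    """In-memory sketch of bounded, per-session chat history."""

    def __init__(self, max_messages: int = MAX_MESSAGES) -> None:
        # Each session gets its own deque; old messages fall off the left.
        self._sessions = defaultdict(lambda: deque(maxlen=max_messages))

    def add(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append({"role": role, "content": content})

    def history(self, session_id: str) -> list[dict]:
        # .get avoids creating an entry for sessions that never spoke.
        return list(self._sessions.get(session_id, ()))


memory = ConversationMemory()
for i in range(12):
    memory.add("session-1", "user", f"msg {i}")
```

After twelve messages, only the ten most recent remain for `session-1`, and other sessions are unaffected.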
105
 
106
+ **MongoDB** - Document metadata
107
+ - Tracks uploaded documents
108
+ - Stores file info, upload time, chunk count
109
+ - Links to vectors in Qdrant
110
 
111
+ **Redis** - Fast caching
112
+ - Stores conversation history
113
+ - Caches LLM responses
114
+ - In-memory for instant access
115
 
116
+ ## Technology Stack
117
 
118
+ - **LangChain 0.3.13** - RAG framework
119
+ - **Groq API** - Fast LLM (llama-3.1-70b)
120
+ - **FastEmbed** - Embedding generation
121
+ - **FlashRank** - Result reranking
122
+ - **Qdrant** - Vector database
123
+ - **MongoDB** - Document storage
124
+ - **Redis** - Caching layer
125
+ - **FastAPI** - Web framework
126
 
127
+ ## Quick Start
128
 
129
+ ### Installation
 
 
130
 
131
+ ```bash
132
+ # Clone and install
133
+ git clone https://github.com/Abeshith/RAG.git
134
+ cd RAG
135
+ pip install -r requirements.txt
 
136
  ```
137
 
138
+ ### Configuration
139
 
140
+ Create a `.env` file:
141
 
142
+ ```env
143
+ GROQ_API_KEY=your_groq_key
144
+ MONGODB_URI=your_mongodb_uri
145
+ REDIS_URL=your_redis_url
146
+ QDRANT_URL=your_qdrant_url
147
+ QDRANT_API_KEY=your_qdrant_key
148
+ JWT_SECRET_KEY=your_secret_key
149
+ ```
150
 
151
+ ### Run
152
 
153
+ ```bash
154
+ uvicorn app.main:app --host 0.0.0.0 --port 7860
155
+ ```
156
 
157
+ Open: http://localhost:7860
158
 
159
+ ## Usage
160
 
161
+ 1. **Upload Documents**: Click upload, then select a PDF, DOCX, or TXT file
162
+ 2. **Ask Questions**: Type your question in the chat box
163
+ 3. **Toggle RAG**:
164
+ - ON = answers from your documents
165
+ - OFF = general knowledge answers
166
+ 4. **View Sources**: See which document chunks were used
167
 
168
+ ## API Endpoints
169
 
170
+ ```
171
+ GET /health/ - Check system status
172
+ POST /chat/stream - Send question, get streaming answer
173
+ POST /documents/upload - Upload new document
174
+ GET /documents/ - List all documents
175
+ GET /documents/stats - Get document statistics
176
+ DELETE /documents/{id} - Delete specific document
177
+ ```
178
 
179
+ ## Docker Deployment
180
 
181
+ ```bash
182
+ docker build -t rag-chatbot .
183
+ docker run -p 7860:7860 --env-file .env rag-chatbot
184
+ ```