riteshraut commited on
Commit
edea2d6
·
1 Parent(s): f3c5275

feat/included the re ranker

Browse files
Files changed (3) hide show
  1. README.md +499 -91
  2. app.py +61 -17
  3. rag_processor.py +1 -0
README.md CHANGED
@@ -1,142 +1,550 @@
1
  ---
2
- title: CogniChat - Chat with Your Documents
3
- emoji: 🤖
4
- colorFrom: blue
5
- colorTo: purple
6
- sdk: docker
7
- pinned: false
8
- license: mit
9
- app_port: 7860
10
  ---
11
 
12
- # CogniChat - Chat with Your Documents 🤖
13
 
14
- An intelligent document chat application that allows you to upload documents and have conversations with them using advanced RAG (Retrieval Augmented Generation) technology.
15
 
16
- ## Features
17
 
18
- - **Multi-format Document Support**: Upload PDF, TXT, DOCX, and image files
19
- - **Advanced RAG Pipeline**: Hybrid search with BM25 and FAISS retrievers
20
- - **Conversational Memory**: Maintains chat history for contextual conversations
21
- - **Text-to-Speech**: Listen to AI responses with built-in TTS
22
- - **Streaming Responses**: Real-time response generation
23
- - **Modern UI**: Clean, responsive interface with dark mode
24
 
25
- ## How to Use
26
 
27
- 1. Upload your documents using drag & drop or file selection
28
- 2. Wait for processing (may take a few minutes for large documents)
29
- 3. Start chatting with your documents!
30
- 4. Use the play button to listen to responses
31
 
32
- ## Technology Stack
33
 
34
- - **Backend**: Flask, LangChain, FAISS
35
- - **AI Models**: Groq API with Llama 3.1
36
- - **Embeddings**: HuggingFace all-miniLM-L6-v2
37
- - **Frontend**: Vanilla JavaScript, TailwindCSS
38
- - **Document Processing**: Unstructured, PyPDF, python-docx
39
 
40
- ## Quick Start
41
 
42
- ### 1. Set up Environment
43
 
44
- #### For Hugging Face Spaces:
45
- 1. Go to your Space Settings → Repository Secrets
46
- 2. Add a new secret:
47
- - **Name**: `GROQ_API_KEY`
48
- - **Value**: Your actual GROQ API key from [console.groq.com/keys](https://console.groq.com/keys)
49
- 3. Restart your Space after adding the secret
50
 
51
- #### For Local Development:
52
- ```bash
53
- # Copy the environment template
54
- cp .env.example .env
55
 
56
- # Edit .env and add your GROQ API key
57
- # Replace 'your_groq_api_key_here' with your actual GROQ API key
58
- # Get your API key from: https://console.groq.com/keys
59
  ```
60
 
61
- **Important**: The API key must be set correctly for the chat functionality to work!
62
 
63
- ### 2. Run with Docker (Recommended)
64
  ```bash
65
- # Build the Docker image
66
- docker build -t cognichat .
 
67
 
68
- # Run the container
69
  docker run -p 7860:7860 --env-file .env cognichat
70
  ```
71
 
72
- ### 3. Run Locally
 
73
  ```bash
74
  # Install dependencies
75
  pip install -r requirements.txt
76
 
77
- # Set your GROQ API key
78
- export GROQ_API_KEY=your_groq_api_key_here
79
 
80
  # Run the application
81
  python app.py
82
  ```
83
 
84
- Visit `http://localhost:7860` to use the application.
85
 
86
- ## Recent Fixes (October 2025)
87
 
88
- **Fixed Docker Permission Issues**:
89
- - Resolved cache directory permission problems
90
- - Application now runs as non-root user for security
91
- - Improved error handling and fallback mechanisms
92
 
93
- **Fixed HF Spaces Upload Directory Issue**:
94
- - **CRITICAL**: Changed upload folder to `/tmp/uploads` for HF Spaces compatibility
95
- - Automatically detects HF Spaces environment and uses writable directories
96
- - Added comprehensive error handling for file save operations
97
- - Fixed 400 chat errors caused by read-only directory access
98
 
99
- ✅ **Enhanced Model Loading**:
100
- - Multiple fallback strategies for embedding model initialization
101
- - Better cache management for HuggingFace models
102
- - Improved startup reliability
103
 
104
- ## Troubleshooting
105
 
106
- ### Permission Errors
107
- If you encounter permission errors, ensure:
108
- 1. Docker containers run with proper user permissions
109
- 2. Cache directories are writable
110
- 3. Environment variables are set correctly
111
 
112
- ### Model Loading Issues
113
- The app includes multiple fallback mechanisms:
114
- 1. Primary: `sentence-transformers/all-miniLM-L6-v2`
115
- 2. Fallback: `all-miniLM-L6-v2`
116
- 3. Final fallback: Default model without cache specification
117
 
118
- ### API Key Issues
119
- Make sure your GROQ API key is:
120
- 1. Valid and active
121
- 2. Set in the `.env` file or environment variables
122
- 3. Has sufficient credits/quota
123
 
124
- ## Development
125
 
126
- For development and testing:
127
  ```bash
128
- # Test embedding model loading
129
  python test_embeddings.py
130
 
131
- # Run with debug mode
132
  export FLASK_DEBUG=1
 
133
  python app.py
134
  ```
135
 
136
- ## Environment Variables
137
 
138
- - `GROQ_API_KEY`: Your Groq API key (required)
139
- - `HF_HOME`: HuggingFace cache directory
140
- - `PORT`: Application port (default: 7860)
141
 
142
- Developed by [Ritesh](https://github.com/RautRitesh) and [Alish-0x](https://github.com/Alish-0x)
 
1
+ # 🤖 CogniChat - Intelligent Document Chat System
2
+
3
+ <div align="center">
4
+
5
+ ![License](https://img.shields.io/badge/license-MIT-blue.svg)
6
+ ![Python](https://img.shields.io/badge/python-3.9+-brightgreen.svg)
7
+ ![Docker](https://img.shields.io/badge/docker-ready-blue.svg)
8
+ ![HuggingFace](https://img.shields.io/badge/🤗-Spaces-yellow.svg)
9
+
10
+ **Transform your documents into interactive conversations powered by advanced RAG technology**
11
+
12
+ [Features](#-features) • [Quick Start](#-quick-start) • [Architecture](#-architecture) • [Deployment](#-deployment) • [API](#-api-reference)
13
+
14
+ </div>
15
+
16
  ---
17
+
18
+ ## 📋 Table of Contents
19
+
20
+ - [Overview](#-overview)
21
+ - [Features](#-features)
22
+ - [Architecture](#-architecture)
23
+ - [Technology Stack](#-technology-stack)
24
+ - [Quick Start](#-quick-start)
25
+ - [Deployment](#-deployment)
26
+ - [Configuration](#-configuration)
27
+ - [API Reference](#-api-reference)
28
+ - [Troubleshooting](#-troubleshooting)
29
+ - [Contributing](#-contributing)
30
+ - [License](#-license)
31
+
32
  ---
33
 
34
+ ## 🎯 Overview
35
 
36
+ CogniChat is a production-ready, intelligent document chat application that leverages **Retrieval Augmented Generation (RAG)** to enable natural conversations with your documents. Built with enterprise-grade technologies, it provides accurate, context-aware responses from your document corpus.
37
 
38
+ ### Why CogniChat?
39
 
40
 
41
+ - **🔉 Audio Overview of Your Document**: Simply ask a question and listen to the audio response. Now your documents can speak to you.
42
+ - **🎯 Accurate Retrieval**: Hybrid search combining BM25 and FAISS for optimal results
43
+ - **💬 Conversational Memory**: Maintains context across multiple interactions
44
+ - **📄 Multi-Format Support**: Handles PDF, DOCX, TXT, and image files
45
+ - **🚀 Production Ready**: Docker support, comprehensive error handling, and security best practices
46
+ - **🎨 Modern UI**: Responsive design with dark mode and real-time streaming
47
 
48
+ ---
 
 
 
49
 
50
+ ## ✨ Features
51
 
52
+ ### Core Capabilities
53
 
54
+ | Feature | Description |
55
+ |---------|-------------|
56
+ | **Multi-Format Processing** | Upload and process PDF, DOCX, TXT, and image files |
57
+ | **Hybrid Search** | Combines BM25 (keyword) and FAISS (semantic) for superior retrieval |
58
+ | **Conversational AI** | Powered by Groq's Llama 3.1 for intelligent responses |
59
+ | **Memory Management** | Maintains chat history for contextual conversations |
60
+ | **Text-to-Speech** | Built-in TTS for audio playback of responses |
61
+ | **Streaming Responses** | Real-time token streaming for better UX |
62
+ | **Document Chunking** | Intelligent text splitting for optimal context windows |
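The **Memory Management** row above follows the usual per-session history pattern: one history object per session id, created lazily on first use. A stdlib-only sketch of the idea (all names here are illustrative, not the application's actual API):

```python
# Minimal per-session chat memory: one history object per session id,
# created on first access. A toy stand-in for LangChain's
# ChatMessageHistory registry used by the app.

class ChatHistory:
    """Holds (role, text) turns for one session."""
    def __init__(self):
        self.messages = []

    def add(self, role, text):
        self.messages.append((role, text))

_histories = {}

def get_session_history(session_id):
    """Return the history for a session, creating it on first use."""
    if session_id not in _histories:
        _histories[session_id] = ChatHistory()
    return _histories[session_id]
```

Because the same object is returned for a given session id, follow-up questions see the earlier turns of the conversation.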
63
 
64
+ ### Advanced Features
65
 
66
+ - **Semantic Embeddings**: HuggingFace `all-miniLM-L6-v2` for accurate vector representations
67
+ - **Reranking**: Contextual compression for improved relevance
68
+ - **Error Handling**: Comprehensive fallback mechanisms and error recovery
69
+ - **Security**: Non-root Docker execution and environment-based secrets
70
+ - **Scalability**: Optimized for both local and cloud deployments
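The reranking step named above can be sketched independently of any model: score each (query, chunk) pair and keep the best `top_n`. Here the cross-encoder is replaced by a toy word-overlap scorer so the logic is runnable anywhere; only the sort-and-truncate structure mirrors the app:

```python
# Rerank retrieved chunks by relevance and keep the top_n best.
# A real deployment scores pairs with a cross-encoder; the word-overlap
# scorer below is a stand-in for illustration only.

def overlap_score(query: str, chunk: str) -> float:
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / (len(q) or 1)

def rerank(query: str, chunks: list[str], top_n: int = 3) -> list[str]:
    return sorted(chunks, key=lambda ch: overlap_score(query, ch), reverse=True)[:top_n]
```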
 
71
 
72
+ ---
73
+
74
+ ## 🏗 Architecture
75
+
76
+ ### RAG Pipeline Overview
77
+
78
+ ```mermaid
79
+ graph TB
80
+ A[Document Upload] --> B[Document Processing]
81
+ B --> C[Text Extraction]
82
+ C --> D[Chunking Strategy]
83
+ D --> E[Embedding Generation]
84
+ E --> F[Vector Store FAISS]
85
+
86
+ G[User Query] --> H[Query Embedding]
87
+ H --> I[Hybrid Retrieval]
88
+
89
+ F --> I
90
+ J[BM25 Index] --> I
91
+
92
+ I --> K[Reranking]
93
+ K --> L[Context Assembly]
94
+ L --> M[LLM Groq Llama 3.1]
95
+ M --> N[Response Generation]
96
+ N --> O[Streaming Output]
97
+
98
+ P[Chat History] --> M
99
+ N --> P
100
+
101
+ style A fill:#e1f5ff
102
+ style G fill:#e1f5ff
103
+ style F fill:#ffe1f5
104
+ style J fill:#ffe1f5
105
+ style M fill:#f5e1ff
106
+ style O fill:#e1ffe1
107
+ ```
108
 
109
+ ### System Architecture
110
+
111
+ ```mermaid
112
+ graph LR
113
+ A[Client Browser] -->|HTTP/WebSocket| B[Flask Server]
114
+ B --> C[Document Processor]
115
+ B --> D[RAG Engine]
116
+ B --> E[TTS Service]
117
+
118
+ C --> F[(File Storage)]
119
+ D --> G[(FAISS Vector DB)]
120
+ D --> H[(BM25 Index)]
121
+ D --> I[Groq API]
122
+
123
+ J[HuggingFace Models] --> D
124
+
125
+ style B fill:#4a90e2
126
+ style D fill:#e24a90
127
+ style I fill:#90e24a
128
  ```
129
 
130
+ ### Data Flow
131
+
132
+ 1. **Document Ingestion**: Files are uploaded and validated
133
+ 2. **Processing Pipeline**: Text extraction → Chunking → Embedding
134
+ 3. **Indexing**: Dual indexing (FAISS + BM25) for hybrid search
135
+ 4. **Query Processing**: User queries are embedded and searched
136
+ 5. **Retrieval**: Top-k relevant chunks retrieved using hybrid approach
137
+ 6. **Generation**: LLM generates contextual responses with citations
138
+ 7. **Streaming**: Responses streamed back to client in real-time
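Steps 5-7 above can be compressed into a toy pipeline to make the flow concrete. Embedding search is stubbed with word overlap and generation with a placeholder string; only the retrieve → assemble context → generate → stream shape is the point:

```python
# Toy version of the retrieval → generation → streaming flow above.
# The retriever and LLM are stubs; the data flow itself is real.

def retrieve(query, chunks, k=2):
    words = set(query.lower().split())
    return sorted(chunks, key=lambda c: len(words & set(c.lower().split())), reverse=True)[:k]

def generate(query, chunks):
    context = "\n---\n".join(retrieve(query, chunks))
    # A real system would call the LLM here with `context` plus chat history.
    return f"[stub answer to {query!r} using {len(context)} chars of context]"

def stream(text):
    # Step 7: yield the response piece by piece, as the UI consumes it.
    for token in text.split():
        yield token
```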
139
+
140
+ ---
141
+
142
+ ## 🛠 Technology Stack
143
+
144
+ ### Backend
145
+
146
+ | Component | Technology | Purpose |
147
+ |-----------|-----------|---------|
148
+ | **Framework** | Flask 2.3+ | Web application framework |
149
+ | **RAG** | LangChain | RAG pipeline orchestration |
150
+ | **Vector DB** | FAISS | Fast similarity search |
151
+ | **Keyword Search** | BM25 | Sparse retrieval |
152
+ | **LLM** | Groq Llama 3.1 | Response generation |
153
+ | **Embeddings** | HuggingFace Transformers | Semantic embeddings |
154
+ | **Doc Processing** | Unstructured, PyPDF, python-docx | Multi-format parsing |
155
+
156
+ ### Frontend
157
+
158
+ | Component | Technology |
159
+ |-----------|-----------|
160
+ | **UI Framework** | TailwindCSS |
161
+ | **JavaScript** | Vanilla ES6+ |
162
+ | **Icons** | Font Awesome |
163
+ | **Markdown** | Marked.js |
164
+
165
+ ### Infrastructure
166
+
167
+ - **Containerization**: Docker + Docker Compose
168
+ - **Deployment**: HuggingFace Spaces, local, cloud-agnostic
169
+ - **Security**: Environment-based secrets, non-root execution
170
+
171
+ ---
172
+
173
+ ## 🚀 Quick Start
174
+
175
+ ### Prerequisites
176
+
177
+ - Python 3.9+
178
+ - Docker (optional, recommended)
179
+ - Groq API Key ([Get one here](https://console.groq.com/keys))
180
+
181
+ ### Installation Methods
182
+
183
+ #### 🐳 Method 1: Docker (Recommended)
184
 
 
185
  ```bash
186
+ # Clone the repository
187
+ git clone https://github.com/RautRitesh/Chat-with-docs
188
+ cd Chat-with-docs
189
 
190
+ # Create environment file
191
+ cp .env.example .env
192
+
193
+ # Add your Groq API key to .env
194
+ echo "GROQ_API_KEY=your_actual_api_key_here" >> .env
195
+
196
+ # Build and run with Docker Compose
197
+ docker-compose up -d
198
+
199
+ # Or build manually
200
+ docker build -t cognichat .
201
  docker run -p 7860:7860 --env-file .env cognichat
202
  ```
203
 
204
+ #### 🐍 Method 2: Local Python Environment
205
+
206
  ```bash
207
+ # Clone the repository
208
+ git clone https://github.com/RautRitesh/Chat-with-docs
209
+ cd Chat-with-docs
210
+
211
+ # Create virtual environment
212
+ python -m venv venv
213
+ source venv/bin/activate # On Windows: venv\Scripts\activate
214
+
215
  # Install dependencies
216
  pip install -r requirements.txt
217
 
218
+ # Set environment variables
219
+ export GROQ_API_KEY=your_actual_api_key_here
220
 
221
  # Run the application
222
  python app.py
223
  ```
224
 
225
+ #### 🤗 Method 3: HuggingFace Spaces
226
 
227
+ 1. Fork this repository
228
+ 2. Create a new Space on [HuggingFace](https://huggingface.co/spaces)
229
+ 3. Link your forked repository
230
+ 4. Add `GROQ_API_KEY` in Settings → Repository Secrets
231
+ 5. Space will auto-deploy!
232
 
233
+ ### First Steps
 
 
 
234
 
235
+ 1. Open `http://localhost:7860` in your browser
236
+ 2. Upload a document (PDF, DOCX, TXT, or image)
237
+ 3. Wait for processing (progress indicator will show status)
238
+ 4. Start chatting with your document!
239
+ 5. Use the 🔊 button to hear responses via TTS
240
 
241
+ ---
 
 
 
242
 
243
+ ## 📦 Deployment
244
 
245
+ ### Environment Variables
246
 
247
+ Create a `.env` file with the following variables:
248
 
249
+ ```bash
250
+ # Required
251
+ GROQ_API_KEY=your_groq_api_key_here
252
+
253
+ # Optional
254
+ PORT=7860
255
+ # For HF Spaces
+ HF_HOME=/tmp/huggingface_cache
256
+ # Set to 1 for development
+ FLASK_DEBUG=0
257
+ # 10MB default
+ MAX_UPLOAD_SIZE=10485760
258
+ ```
259
 
260
+ ### Docker Deployment
261
 
 
262
  ```bash
263
+ # Production build
264
+ docker build -t cognichat:latest .
265
+
266
+ # Run with resource limits
267
+ docker run -d \
268
+ --name cognichat \
269
+ -p 7860:7860 \
270
+ --env-file .env \
271
+ --memory="2g" \
272
+ --cpus="1.5" \
273
+ cognichat:latest
274
+ ```
275
+
276
+ ### Docker Compose
277
+
278
+ ```yaml
279
+ version: '3.8'
280
+
281
+ services:
282
+   cognichat:
283
+     build: .
284
+     ports:
285
+       - "7860:7860"
286
+     environment:
287
+       - GROQ_API_KEY=${GROQ_API_KEY}
288
+     volumes:
289
+       - ./data:/app/data
290
+     restart: unless-stopped
291
+ ```
292
+
293
+ ### HuggingFace Spaces Configuration
294
+
295
+ Add these files to your repository:
296
+
297
+ **app_port** in `README.md` header:
298
+ ```yaml
299
+ app_port: 7860
300
+ ```
301
+
302
+ **Repository Secrets**:
303
+ - `GROQ_API_KEY`: Your Groq API key
304
+
305
+ The application automatically detects HF Spaces environment and adjusts paths accordingly.
306
+
307
+ ---
308
+
309
+ ## ⚙️ Configuration
310
+
311
+ ### Document Processing Settings
312
+
313
+ ```python
314
+ # In app.py - Customize these settings
315
+ CHUNK_SIZE = 1000 # Characters per chunk
316
+ CHUNK_OVERLAP = 200 # Overlap between chunks
317
+ EMBEDDING_MODEL = "sentence-transformers/all-miniLM-L6-v2"
318
+ RETRIEVER_K = 5 # Number of chunks to retrieve
319
+ ```
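A sliding-window sketch of how `CHUNK_SIZE` and `CHUNK_OVERLAP` interact. This is a character-based stand-in for LangChain's `RecursiveCharacterTextSplitter`, which additionally respects paragraph and sentence boundaries:

```python
# Character-window chunking: each chunk is `size` chars and shares
# `overlap` chars with its predecessor, so context is never cut mid-thought.
# Illustrative only; the app uses LangChain's recursive splitter.

CHUNK_SIZE = 1000
CHUNK_OVERLAP = 200

def split_text(text, size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    if size <= overlap:
        raise ValueError("chunk size must exceed overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```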
320
+
321
+ ### Model Configuration
322
+
323
+ ```python
324
+ # LLM Settings
325
+ LLM_PROVIDER = "groq"
326
+ MODEL_NAME = "llama-3.1-70b-versatile"
327
+ TEMPERATURE = 0.7
328
+ MAX_TOKENS = 2048
329
+ ```
330
+
331
+ ### Search Configuration
332
+
333
+ ```python
334
+ # Hybrid Search Weights
335
+ FAISS_WEIGHT = 0.6 # Semantic search weight
336
+ BM25_WEIGHT = 0.4 # Keyword search weight
337
+ ```
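A sketch of how two weights like these can combine ranked result lists, using weighted reciprocal-rank fusion. This illustrates the idea behind the 0.6 / 0.4 split; LangChain's `EnsembleRetriever` may differ in detail:

```python
# Weighted fusion of two ranked result lists. Each retriever contributes
# weight / (rank + 1) per document, so documents ranked highly by both
# retrievers rise to the top of the merged list.

def fuse(semantic: list[str], keyword: list[str],
         w_sem: float = 0.6, w_kw: float = 0.4) -> list[str]:
    scores: dict[str, float] = {}
    for weight, ranking in ((w_sem, semantic), (w_kw, keyword)):
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + weight / (rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```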
338
+
339
+ ---
340
+
341
+ ## 📚 API Reference
342
+
343
+ ### Endpoints
344
+
345
+ #### Upload Document
346
+
347
+ ```http
348
+ POST /upload
349
+ Content-Type: multipart/form-data
350
+
351
+ {
352
+ "file": <binary>
353
+ }
354
+ ```
355
+
356
+ **Response**:
357
+ ```json
358
+ {
359
+ "status": "success",
360
+ "message": "Document processed successfully",
361
+ "filename": "example.pdf",
362
+ "chunks": 45
363
+ }
364
+ ```
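A stdlib-only client sketch for this endpoint. Only the `/upload` path and field shape come from above; building the multipart body by hand avoids any third-party HTTP library (the request itself is shown commented, not sent):

```python
# Build a multipart/form-data body for POST /upload using only the stdlib.
import uuid

def multipart_body(field: str, filename: str, data: bytes):
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + data + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"

# To send (sketch):
# import urllib.request
# body, ctype = multipart_body("file", "example.pdf", open("example.pdf", "rb").read())
# req = urllib.request.Request("http://localhost:7860/upload", data=body,
#                              headers={"Content-Type": ctype})
```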
365
+
366
+ #### Chat
367
+
368
+ ```http
369
+ POST /chat
370
+ Content-Type: application/json
371
+
372
+ {
373
+ "message": "What is the main topic?",
374
+ "stream": true
375
+ }
376
+ ```
377
+
378
+ **Response** (Streaming):
379
+ ```
380
+ data: {"token": "The", "done": false}
381
+ data: {"token": " main", "done": false}
382
+ data: {"token": " topic", "done": false}
383
+ data: {"done": true}
384
+ ```
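The streamed `data:` lines above can be consumed like this (a sketch; the `token`/`done` field names are taken from the sample payload):

```python
# Parse the server's "data: {...}" stream lines into tokens, stopping
# at the terminal {"done": true} event.
import json

def read_tokens(lines):
    for line in lines:
        if not line.startswith("data: "):
            continue
        event = json.loads(line[len("data: "):])
        if event.get("done"):
            break
        yield event["token"]
```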
385
+
386
+ #### Clear Session
387
+
388
+ ```http
389
+ POST /clear
390
+ ```
391
+
392
+ **Response**:
393
+ ```json
394
+ {
395
+ "status": "success",
396
+ "message": "Session cleared"
397
+ }
398
+ ```
399
+
400
+ ---
401
+
402
+ ## 🔧 Troubleshooting
403
+
404
+ ### Common Issues
405
+
406
+ #### 1. Permission Errors in Docker
407
+
408
+ **Problem**: `Permission denied` when writing to cache directories
409
+
410
+ **Solution**:
411
+ ```bash
412
+ # Rebuild with proper permissions
413
+ docker build --no-cache -t cognichat .
414
+
415
+ # Or run with volume permissions
416
+ docker run -v $(pwd)/cache:/tmp/huggingface_cache \
417
+ --user $(id -u):$(id -g) \
418
+ cognichat
419
+ ```
420
+
421
+ #### 2. Model Loading Fails
422
+
423
+ **Problem**: Cannot download HuggingFace models
424
+
425
+ **Solution**:
426
+ ```bash
427
+ # Pre-download models
428
  python test_embeddings.py
429
 
430
+ # Or use HF_HOME environment variable
431
+ export HF_HOME=/path/to/writable/directory
432
+ ```
433
+
434
+ #### 3. Chat Returns 400 Error
435
+
436
+ **Problem**: Upload directory not writable (common in HF Spaces)
437
+
438
+ **Solution**: Application now automatically uses `/tmp/uploads` in HF Spaces environment. Ensure latest version is deployed.
439
+
440
+ #### 4. API Key Invalid
441
+
442
+ **Problem**: Groq API returns authentication error
443
+
444
+ **Solution**:
445
+ - Verify key at [Groq Console](https://console.groq.com/keys)
446
+ - Check `.env` file has correct format: `GROQ_API_KEY=gsk_...`
447
+ - Restart application after updating key
448
+
449
+ ### Debug Mode
450
+
451
+ Enable detailed logging:
452
+
453
+ ```bash
454
  export FLASK_DEBUG=1
455
+ export LANGCHAIN_VERBOSE=true
456
  python app.py
457
  ```
458
 
459
+ ---
460
+
461
+ ## 🧪 Testing
462
+
463
+ ```bash
464
+ # Run test suite
465
+ pytest tests/
466
+
467
+ # Test embedding model
468
+ python test_embeddings.py
469
+
470
+ # Test document processing
471
+ pytest tests/test_document_processor.py
472
+
473
+ # Integration tests
474
+ pytest tests/test_integration.py
475
+ ```
476
+
477
+ ---
478
+
479
+ ## 🤝 Contributing
480
+
481
+ We welcome contributions! Please follow these steps:
482
+
483
+ 1. Fork the repository
484
+ 2. Create a feature branch (`git checkout -b feature/amazing-feature`)
485
+ 3. Commit your changes (`git commit -m 'Add amazing feature'`)
486
+ 4. Push to the branch (`git push origin feature/amazing-feature`)
487
+ 5. Open a Pull Request
488
+
489
+ ### Development Guidelines
490
+
491
+ - Follow PEP 8 style guide
492
+ - Add tests for new features
493
+ - Update documentation
494
+ - Ensure Docker build succeeds
495
+
496
+ ---
497
+
498
+ ## 📝 Changelog
499
+
500
+ ### Version 2.0 (October 2025)
501
+
502
+ ✅ **Major Improvements**:
503
+ - Fixed Docker permission issues
504
+ - HuggingFace Spaces compatibility
505
+ - Enhanced error handling
506
+ - Multiple model loading fallbacks
507
+ - Improved security (non-root execution)
508
+
509
+ ✅ **Bug Fixes**:
510
+ - Upload directory write permissions
511
+ - Cache directory access
512
+ - Model initialization reliability
513
+
514
+ ### Version 1.0 (Initial Release)
515
+
516
+ - Basic RAG functionality
517
+ - PDF and DOCX support
518
+ - FAISS vector store
519
+ - Conversational memory
520
+
521
+ ---
522
+
523
+ ## 📄 License
524
+
525
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
526
+
527
+ ---
528
+
529
+ ## 🙏 Acknowledgments
530
+
531
+ - **LangChain** for RAG framework
532
+ - **Groq** for high-speed LLM inference
533
+ - **HuggingFace** for embeddings and hosting
534
+ - **FAISS** for efficient vector search
535
+
536
+ ---
537
+
538
+ ## 📞 Support
539
+
540
+ - **Issues**: [GitHub Issues](https://github.com/yourusername/cognichat/issues)
541
+ - **Discussions**: [GitHub Discussions](https://github.com/yourusername/cognichat/discussions)
542
+ - **Email**: riteshraut123321@gmail.com
543
+
544
+ ---
545
+
546
+ <div align="center">
547
 
548
+ **Made with ❤️ by the CogniChat Team**
 
 
549
 
550
+ </div>
app.py CHANGED
@@ -6,14 +6,13 @@ import uuid
6
  from flask import Flask, request, render_template, session, jsonify, Response, stream_with_context
7
  from werkzeug.utils import secure_filename
8
  from rag_processor import create_rag_chain
 
9
 
10
- # ============================ ADDITIONS START ============================
11
  from gtts import gTTS
12
  import io
13
- import re # <-- Import the regular expression module
14
- # ============================ ADDITIONS END ==============================
15
 
16
- # Document Loaders
17
  from langchain_community.document_loaders import (
18
  TextLoader,
19
  PyPDFLoader,
@@ -22,28 +21,56 @@ from langchain_community.document_loaders import (
22
 
23
  # Additional imports for robust PDF handling
24
  from langchain_core.documents import Document
25
- import fitz # PyMuPDF for alternative PDF processing
26
 
27
  # Text Splitter, Embeddings, Retrievers
28
  from langchain.text_splitter import RecursiveCharacterTextSplitter
29
- from langchain_community.embeddings import HuggingFaceEmbeddings
30
  from langchain_community.vectorstores import FAISS
31
- from langchain.retrievers import EnsembleRetriever
 
32
  from langchain_community.retrievers import BM25Retriever
33
  from langchain_community.chat_message_histories import ChatMessageHistory
 
 
 
34
 
35
- # --- Basic Flask App Setup ---
36
  app = Flask(__name__)
37
  app.config['SECRET_KEY'] = os.urandom(24)
38
 
39
- # Use /tmp directory for uploads in HF Spaces (writable), fallback to local uploads for development
 
40
  is_hf_spaces = bool(os.getenv("SPACE_ID") or os.getenv("SPACES_ZERO_GPU"))
41
  if is_hf_spaces:
42
  app.config['UPLOAD_FOLDER'] = '/tmp/uploads'
43
  else:
44
  app.config['UPLOAD_FOLDER'] = 'uploads'
45
 
46
- # Create upload directory with proper error handling
47
  try:
48
  os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
49
  print(f"✓ Upload folder ready: {app.config['UPLOAD_FOLDER']}")
@@ -54,21 +81,23 @@ except Exception as e:
54
  os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
55
  print(f"✓ Using fallback upload folder: {app.config['UPLOAD_FOLDER']}")
56
 
57
- # --- In-memory Storage & Global Model Loading ---
 
 
 
58
  rag_chains = {}
59
  message_histories = {}
60
 
61
- # Load the embedding model once when the application starts for efficiency.
62
  print("Loading embedding model...")
63
 
64
- # Set environment variables for HuggingFace cache (use home directory if available)
65
  cache_base = os.path.expanduser("~/.cache") if os.path.expanduser("~") != "~" else "/tmp/hf_cache"
66
  os.environ.setdefault('HF_HOME', f'{cache_base}/huggingface')
67
  os.environ.setdefault('HF_HUB_CACHE', f'{cache_base}/huggingface/hub')
68
  os.environ.setdefault('TRANSFORMERS_CACHE', f'{cache_base}/transformers')
69
  os.environ.setdefault('SENTENCE_TRANSFORMERS_HOME', f'{cache_base}/sentence_transformers')
70
 
71
- # Create cache directories with proper permissions
72
  cache_dirs = [
73
  os.environ['HF_HOME'],
74
  os.environ['HF_HUB_CACHE'],
@@ -103,6 +132,8 @@ for cache_dir in cache_dirs:
103
  except Exception as e:
104
  print(f"Warning: Could not create {cache_dir}: {e}")
105
 
 
 
106
  # Try loading embedding model with error handling and fallbacks
107
  try:
108
  print("Attempting to load embedding model...")
@@ -135,6 +166,13 @@ except Exception as e:
135
  print(f"Final attempt failed: {e3}")
136
  # Use a simpler fallback model or raise the error
137
  raise Exception(f"Could not load any embedding model. Last error: {e3}")
138
 
139
  def load_pdf_with_fallback(filepath):
140
  """
@@ -336,12 +374,19 @@ def upload_files():
336
  retrievers=[bm25_retriever, faiss_retriever],
337
  weights=[0.5, 0.5]
338
  )
339
 
340
  session_id = str(uuid.uuid4())
341
  print(f"Creating RAG chain for session {session_id}...")
342
 
343
  try:
344
- rag_chain = create_rag_chain(ensemble_retriever, get_session_history)
345
  rag_chains[session_id] = rag_chain
346
  print(f"✓ RAG chain created successfully for session {session_id} with {len(processed_files)} documents.")
347
  except Exception as rag_error:
@@ -443,7 +488,6 @@ def chat():
443
  print(f"Error during chat invocation: {e}")
444
  return Response("An error occurred while getting the answer.", status=500, mimetype='text/plain')
445
 
446
- # ============================ ADDITIONS START ============================
447
 
448
  def clean_markdown_for_tts(text: str) -> str:
449
  """Removes markdown formatting for cleaner text-to-speech output."""
@@ -484,7 +528,7 @@ def text_to_speech():
484
  except Exception as e:
485
  print(f"Error in TTS generation: {e}")
486
  return jsonify({'status': 'error', 'message': 'Failed to generate audio.'}), 500
487
- # ============================ ADDITIONS END ==============================
488
 
489
 
490
  @app.route('/debug', methods=['GET'])
 
6
  from flask import Flask, request, render_template, session, jsonify, Response, stream_with_context
7
  from werkzeug.utils import secure_filename
8
  from rag_processor import create_rag_chain
9
+ from typing import Sequence, Any
10
 
 
11
  from gtts import gTTS
12
  import io
13
+ import re
14
+
15
 
 
16
  from langchain_community.document_loaders import (
17
  TextLoader,
18
  PyPDFLoader,
 
21
 
22
  # Additional imports for robust PDF handling
23
  from langchain_core.documents import Document
24
+ import fitz
25
 
26
  # Text Splitter, Embeddings, Retrievers
27
  from langchain.text_splitter import RecursiveCharacterTextSplitter
28
+ from langchain_huggingface import HuggingFaceEmbeddings
29
  from langchain_community.vectorstores import FAISS
30
+ from langchain.retrievers import EnsembleRetriever, ContextualCompressionRetriever
31
+ from langchain.retrievers.document_compressors.base import BaseDocumentCompressor
32
  from langchain_community.retrievers import BM25Retriever
33
  from langchain_community.chat_message_histories import ChatMessageHistory
34
+ from sentence_transformers.cross_encoder import CrossEncoder
35
+ import numpy as np
36
+
37
 
 
38
  app = Flask(__name__)
39
  app.config['SECRET_KEY'] = os.urandom(24)
40
 
41
+
42
+ class LocalReranker(BaseDocumentCompressor):
43
+     model: Any
44
+     top_n: int = 5
45
+
46
+     class Config:
47
+         arbitrary_types_allowed = True
48
+
49
+     def compress_documents(
50
+         self,
51
+         documents: Sequence[Document],
52
+         query: str,
53
+         callbacks=None,
54
+     ) -> Sequence[Document]:
55
+         if not documents:
56
+             return []
57
+
58
+         pairs = [[query, doc.page_content] for doc in documents]
59
+         scores = self.model.predict(pairs, show_progress_bar=False)
60
+
61
+         doc_scores = list(zip(documents, scores))
62
+         sorted_doc_scores = sorted(doc_scores, key=lambda x: x[1], reverse=True)
63
+
64
+         return [doc for doc, score in sorted_doc_scores[:self.top_n]]
65
+
66
+
67
  is_hf_spaces = bool(os.getenv("SPACE_ID") or os.getenv("SPACES_ZERO_GPU"))
68
  if is_hf_spaces:
69
  app.config['UPLOAD_FOLDER'] = '/tmp/uploads'
70
  else:
71
  app.config['UPLOAD_FOLDER'] = 'uploads'
72
 
73
+
74
  try:
75
  os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
76
  print(f"✓ Upload folder ready: {app.config['UPLOAD_FOLDER']}")
 
81
  os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
82
  print(f"✓ Using fallback upload folder: {app.config['UPLOAD_FOLDER']}")
83
 
84
+
85
+
86
+
87
+
88
  rag_chains = {}
89
  message_histories = {}
90
 
 
91
  print("Loading embedding model...")
92
 
93
+
94
  cache_base = os.path.expanduser("~/.cache") if os.path.expanduser("~") != "~" else "/tmp/hf_cache"
95
  os.environ.setdefault('HF_HOME', f'{cache_base}/huggingface')
96
  os.environ.setdefault('HF_HUB_CACHE', f'{cache_base}/huggingface/hub')
97
  os.environ.setdefault('TRANSFORMERS_CACHE', f'{cache_base}/transformers')
98
  os.environ.setdefault('SENTENCE_TRANSFORMERS_HOME', f'{cache_base}/sentence_transformers')
99
 
100
+
101
  cache_dirs = [
102
  os.environ['HF_HOME'],
103
  os.environ['HF_HUB_CACHE'],
 
132
  except Exception as e:
133
  print(f"Warning: Could not create {cache_dir}: {e}")
134
 
135
+
136
+
137
  # Try loading embedding model with error handling and fallbacks
138
  try:
139
  print("Attempting to load embedding model...")
 
166
  print(f"Final attempt failed: {e3}")
167
  # Use a simpler fallback model or raise the error
168
  raise Exception(f"Could not load any embedding model. Last error: {e3}")
169
+
170
+
171
+
172
+ print("Loading local re-ranking model...")
173
+ RERANKER_MODEL = CrossEncoder("mixedbread-ai/mxbai-rerank-xsmall-v1", device='cpu')
174
+ print("Re-ranking model loaded successfully.")
175
+
176
 
177
  def load_pdf_with_fallback(filepath):
178
  """
 
374
  retrievers=[bm25_retriever, faiss_retriever],
375
  weights=[0.5, 0.5]
376
  )
377
+ reranker = LocalReranker(model=RERANKER_MODEL, top_n=3)
378
+
379
+ compression_retriever = ContextualCompressionRetriever(
380
+     base_compressor=reranker,
381
+     base_retriever=ensemble_retriever
382
+ )
383
+
384
 
385
  session_id = str(uuid.uuid4())
386
  print(f"Creating RAG chain for session {session_id}...")
387
 
388
  try:
389
+ rag_chain = create_rag_chain(compression_retriever, get_session_history)
390
  rag_chains[session_id] = rag_chain
391
  print(f"✓ RAG chain created successfully for session {session_id} with {len(processed_files)} documents.")
392
  except Exception as rag_error:
 
488
  print(f"Error during chat invocation: {e}")
489
  return Response("An error occurred while getting the answer.", status=500, mimetype='text/plain')
490
 
 
491
 
492
  def clean_markdown_for_tts(text: str) -> str:
493
  """Removes markdown formatting for cleaner text-to-speech output."""
 
528
  except Exception as e:
529
  print(f"Error in TTS generation: {e}")
530
  return jsonify({'status': 'error', 'message': 'Failed to generate audio.'}), 500
531
+
532
 
533
 
534
  @app.route('/debug', methods=['GET'])
rag_processor.py CHANGED
@@ -83,6 +83,7 @@ Standalone Question:"""
83
  rag_template = """You are an expert assistant named `Cognichat`.Whenver user ask you about who you are , simply say you are `Cognichat`.
84
  You are developed by Ritesh and Alish.
85
  Your job is to provide accurate and helpful answers based ONLY on the provided context.
 
86
  If the information is not in the context, clearly state that you don't know the answer.
87
  Provide a clear and concise answer.
88
 
 
83
  rag_template = """You are an expert assistant named `Cognichat`.Whenver user ask you about who you are , simply say you are `Cognichat`.
84
  You are developed by Ritesh and Alish.
85
  Your job is to provide accurate and helpful answers based ONLY on the provided context.
86
+ Whatever the user asks, it is always about the document, so base your answer on the document only.
87
  If the information is not in the context, clearly state that you don't know the answer.
88
  Provide a clear and concise answer.
89