riteshraut committed
Commit · edea2d6
Parent(s): f3c5275
feat: included the re-ranker

Browse files:
- README.md +499 -91
- app.py +61 -17
- rag_processor.py +1 -0
README.md
CHANGED

@@ -1,142 +1,550 @@
````diff
 ---
 ---
-- **Multi-format Document Support**: Upload PDF, TXT, DOCX, and image files
-- **Advanced RAG Pipeline**: Hybrid search with BM25 and FAISS retrievers
-- **Conversational Memory**: Maintains chat history for contextual conversations
-- **Text-to-Speech**: Listen to AI responses with built-in TTS
-- **Streaming Responses**: Real-time response generation
-- **Modern UI**: Clean, responsive interface with dark mode
-2. Wait for processing (may take a few minutes for large documents)
-3. Start chatting with your documents!
-4. Use the play button to listen to responses
-- **AI Models**: Groq API with Llama 3.1
-- **Embeddings**: HuggingFace all-miniLM-L6-v2
-- **Frontend**: Vanilla JavaScript, TailwindCSS
-- **Document Processing**: Unstructured, PyPDF, python-docx
-3. Restart your Space after adding the secret
-### 2. Run with Docker (Recommended)
 ```bash
 docker run -p 7860:7860 --env-file .env cognichat
 ```
 ```bash
 # Install dependencies
 pip install -r requirements.txt
-# Set environment variables
-export GROQ_API_KEY=
 # Run the application
 python app.py
 ```
-- Resolved cache directory permission problems
-- Application now runs as non-root user for security
-- Improved error handling and fallback mechanisms
-- Multiple fallback strategies for embedding model initialization
-- Better cache management for HuggingFace models
-- Improved startup reliability
-If you encounter permission errors, ensure:
-1. Docker containers run with proper user permissions
-2. Cache directories are writable
-3. Environment variables are set correctly
-The app includes multiple fallback mechanisms:
-1. Primary: `sentence-transformers/all-miniLM-L6-v2`
-2. Fallback: `all-miniLM-L6-v2`
-3. Final fallback: Default model without cache specification
-For development and testing:
 ```bash
 python test_embeddings.py
 export FLASK_DEBUG=1
 python app.py
 ```
-- `HF_HOME`: HuggingFace cache directory
-- `PORT`: Application port (default: 7860)
````
# 🤖 CogniChat - Intelligent Document Chat System

<div align="center">

![Python](https://img.shields.io/badge/python-3.9+-blue.svg)
![Flask](https://img.shields.io/badge/flask-2.3+-green.svg)
![LangChain](https://img.shields.io/badge/langchain-latest-orange.svg)
![License](https://img.shields.io/badge/license-MIT-purple.svg)

**Transform your documents into interactive conversations powered by advanced RAG technology**

[Features](#-features) • [Quick Start](#-quick-start) • [Architecture](#-architecture) • [Deployment](#-deployment) • [API](#-api-reference)

</div>

---

## 📋 Table of Contents

- [Overview](#-overview)
- [Features](#-features)
- [Architecture](#-architecture)
- [Technology Stack](#-technology-stack)
- [Quick Start](#-quick-start)
- [Deployment](#-deployment)
- [Configuration](#-configuration)
- [API Reference](#-api-reference)
- [Troubleshooting](#-troubleshooting)
- [Contributing](#-contributing)
- [License](#-license)

---

## 🎯 Overview

CogniChat is a production-ready, intelligent document chat application that leverages **Retrieval Augmented Generation (RAG)** to enable natural conversations with your documents. Built with enterprise-grade technologies, it provides accurate, context-aware responses from your document corpus.

### Why CogniChat?

- **🔉 Audio Overview of Your Document**: Simply ask a question and listen to the audio. Now your documents can speak with you.
- **🎯 Accurate Retrieval**: Hybrid search combining BM25 and FAISS for optimal results
- **💬 Conversational Memory**: Maintains context across multiple interactions
- **📄 Multi-Format Support**: Handles PDF, DOCX, TXT, and image files
- **🚀 Production Ready**: Docker support, comprehensive error handling, and security best practices
- **🎨 Modern UI**: Responsive design with dark mode and real-time streaming

---

## ✨ Features

### Core Capabilities

| Feature | Description |
|---------|-------------|
| **Multi-Format Processing** | Upload and process PDF, DOCX, TXT, and image files |
| **Hybrid Search** | Combines BM25 (keyword) and FAISS (semantic) for superior retrieval |
| **Conversational AI** | Powered by Groq's Llama 3.1 for intelligent responses |
| **Memory Management** | Maintains chat history for contextual conversations |
| **Text-to-Speech** | Built-in TTS for audio playback of responses |
| **Streaming Responses** | Real-time token streaming for better UX |
| **Document Chunking** | Intelligent text splitting for optimal context windows |

### Advanced Features

- **Semantic Embeddings**: HuggingFace `all-miniLM-L6-v2` for accurate vector representations
- **Reranking**: Contextual compression for improved relevance
- **Error Handling**: Comprehensive fallback mechanisms and error recovery
- **Security**: Non-root Docker execution and environment-based secrets
- **Scalability**: Optimized for both local and cloud deployments
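The reranking feature boils down to a score-and-truncate routine: score every (query, chunk) pair, sort by score, keep the best `top_n`. The sketch below mirrors the shape of the app's `LocalReranker`, with a toy word-overlap scorer standing in for the real cross-encoder (the scorer is purely illustrative):

```python
# Minimal sketch of the rerank step. score_fn stands in for a real
# cross-encoder's predict(); the app itself uses sentence-transformers'
# CrossEncoder ("mixedbread-ai/mxbai-rerank-xsmall-v1").

def rerank(query, chunks, score_fn, top_n=5):
    """Return the top_n chunks ordered by descending relevance score."""
    if not chunks:
        return []
    scored = [(chunk, score_fn(query, chunk)) for chunk in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in scored[:top_n]]

def overlap_score(query, chunk):
    """Toy scorer: number of query words appearing in the chunk."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

chunks = ["faiss builds a vector index", "bm25 ranks by keywords", "dark mode ui"]
print(rerank("vector index search", chunks, overlap_score, top_n=2))
```

Swapping `overlap_score` for a cross-encoder's `predict` gives the contextual-compression behaviour described above.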
---

## 🏗 Architecture

### RAG Pipeline Overview

```mermaid
graph TB
    A[Document Upload] --> B[Document Processing]
    B --> C[Text Extraction]
    C --> D[Chunking Strategy]
    D --> E[Embedding Generation]
    E --> F[Vector Store FAISS]

    G[User Query] --> H[Query Embedding]
    H --> I[Hybrid Retrieval]

    F --> I
    J[BM25 Index] --> I

    I --> K[Reranking]
    K --> L[Context Assembly]
    L --> M[LLM Groq Llama 3.1]
    M --> N[Response Generation]
    N --> O[Streaming Output]

    P[Chat History] --> M
    N --> P

    style A fill:#e1f5ff
    style G fill:#e1f5ff
    style F fill:#ffe1f5
    style J fill:#ffe1f5
    style M fill:#f5e1ff
    style O fill:#e1ffe1
```

### System Architecture

```mermaid
graph LR
    A[Client Browser] -->|HTTP/WebSocket| B[Flask Server]
    B --> C[Document Processor]
    B --> D[RAG Engine]
    B --> E[TTS Service]

    C --> F[(File Storage)]
    D --> G[(FAISS Vector DB)]
    D --> H[(BM25 Index)]
    D --> I[Groq API]

    J[HuggingFace Models] --> D

    style B fill:#4a90e2
    style D fill:#e24a90
    style I fill:#90e24a
```

### Data Flow

1. **Document Ingestion**: Files are uploaded and validated
2. **Processing Pipeline**: Text extraction → Chunking → Embedding
3. **Indexing**: Dual indexing (FAISS + BM25) for hybrid search
4. **Query Processing**: User queries are embedded and searched
5. **Retrieval**: Top-k relevant chunks retrieved using the hybrid approach
6. **Generation**: LLM generates contextual responses with citations
7. **Streaming**: Responses streamed back to the client in real-time

---

## 🛠 Technology Stack

### Backend

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Framework** | Flask 2.3+ | Web application framework |
| **RAG** | LangChain | RAG pipeline orchestration |
| **Vector DB** | FAISS | Fast similarity search |
| **Keyword Search** | BM25 | Sparse retrieval |
| **LLM** | Groq Llama 3.1 | Response generation |
| **Embeddings** | HuggingFace Transformers | Semantic embeddings |
| **Doc Processing** | Unstructured, PyPDF, python-docx | Multi-format parsing |

### Frontend

| Component | Technology |
|-----------|-----------|
| **UI Framework** | TailwindCSS |
| **JavaScript** | Vanilla ES6+ |
| **Icons** | Font Awesome |
| **Markdown** | Marked.js |

### Infrastructure

- **Containerization**: Docker + Docker Compose
- **Deployment**: HuggingFace Spaces, local, cloud-agnostic
- **Security**: Environment-based secrets, non-root execution

---

## 🚀 Quick Start

### Prerequisites

- Python 3.9+
- Docker (optional, recommended)
- Groq API Key ([Get one here](https://console.groq.com/keys))

### Installation Methods

#### 🐳 Method 1: Docker (Recommended)

```bash
# Clone the repository
git clone https://github.com/RautRitesh/Chat-with-docs
cd Chat-with-docs

# Create environment file
cp .env.example .env

# Add your Groq API key to .env
echo "GROQ_API_KEY=your_actual_api_key_here" >> .env

# Build and run with Docker Compose
docker-compose up -d

# Or build manually
docker build -t cognichat .
docker run -p 7860:7860 --env-file .env cognichat
```

#### 🐍 Method 2: Local Python Environment

```bash
# Clone the repository
git clone https://github.com/RautRitesh/Chat-with-docs
cd Chat-with-docs

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export GROQ_API_KEY=your_actual_api_key_here

# Run the application
python app.py
```

#### 🤗 Method 3: HuggingFace Spaces

1. Fork this repository
2. Create a new Space on [HuggingFace](https://huggingface.co/spaces)
3. Link your forked repository
4. Add `GROQ_API_KEY` in Settings → Repository Secrets
5. The Space will auto-deploy!

### First Steps

1. Open `http://localhost:7860` in your browser
2. Upload a document (PDF, DOCX, TXT, or image)
3. Wait for processing (a progress indicator will show status)
4. Start chatting with your document!
5. Use the 🔊 button to hear responses via TTS
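Before a response reaches the TTS engine, markdown has to be stripped so formatting symbols are not read aloud. A minimal sketch of such a cleaner, using only the standard library (the app's actual `clean_markdown_for_tts` may differ in the exact patterns it handles):

```python
import re

def clean_markdown_for_tts(text: str) -> str:
    """Strip common markdown so TTS reads plain sentences."""
    text = re.sub(r"```.*?```", "", text, flags=re.DOTALL)       # fenced code blocks
    text = re.sub(r"`([^`]*)`", r"\1", text)                     # inline code
    text = re.sub(r"\*\*?|__?", "", text)                        # bold/italic markers
    text = re.sub(r"^#{1,6}\s*", "", text, flags=re.MULTILINE)   # heading hashes
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)         # links -> link text
    return re.sub(r"\s+", " ", text).strip()                     # collapse whitespace

print(clean_markdown_for_tts("## Hello **world**, see [docs](http://x)"))  # -> Hello world, see docs
```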
---

## 📦 Deployment

### Environment Variables

Create a `.env` file with the following variables:

```bash
# Required
GROQ_API_KEY=your_groq_api_key_here

# Optional
PORT=7860
HF_HOME=/tmp/huggingface_cache   # For HF Spaces
FLASK_DEBUG=0                    # Set to 1 for development
MAX_UPLOAD_SIZE=10485760         # 10MB default
```
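At startup these variables are read with sensible defaults; a sketch of that pattern (the key names mirror the block above, the exact parsing is an assumption):

```python
import os

def load_config(env=os.environ):
    """Read runtime settings from the environment, with defaults."""
    return {
        "groq_api_key": env.get("GROQ_API_KEY"),   # required, no default
        "port": int(env.get("PORT", "7860")),
        "hf_home": env.get("HF_HOME", "/tmp/huggingface_cache"),
        "debug": env.get("FLASK_DEBUG", "0") == "1",
        "max_upload_size": int(env.get("MAX_UPLOAD_SIZE", str(10 * 1024 * 1024))),
    }

cfg = load_config({"GROQ_API_KEY": "gsk_test", "PORT": "8080"})
print(cfg["port"], cfg["debug"], cfg["max_upload_size"])
```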
### Docker Deployment

```bash
# Production build
docker build -t cognichat:latest .

# Run with resource limits
docker run -d \
  --name cognichat \
  -p 7860:7860 \
  --env-file .env \
  --memory="2g" \
  --cpus="1.5" \
  cognichat:latest
```

### Docker Compose

```yaml
version: '3.8'

services:
  cognichat:
    build: .
    ports:
      - "7860:7860"
    environment:
      - GROQ_API_KEY=${GROQ_API_KEY}
    volumes:
      - ./data:/app/data
    restart: unless-stopped
```

### HuggingFace Spaces Configuration

Add the following to your repository:

**app_port** in the `README.md` header:
```yaml
app_port: 7860
```

**Repository Secrets**:
- `GROQ_API_KEY`: Your Groq API key

The application automatically detects the HF Spaces environment and adjusts paths accordingly.
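The detection relies on environment variables that Spaces sets; the relevant logic from `app.py` boils down to:

```python
import os

def pick_upload_folder(env=os.environ):
    """HF Spaces sets SPACE_ID (and SPACES_ZERO_GPU on ZeroGPU hardware);
    only /tmp is reliably writable there, so uploads are redirected."""
    is_hf_spaces = bool(env.get("SPACE_ID") or env.get("SPACES_ZERO_GPU"))
    return "/tmp/uploads" if is_hf_spaces else "uploads"

print(pick_upload_folder({"SPACE_ID": "user/cognichat"}))  # -> /tmp/uploads
print(pick_upload_folder({}))                              # -> uploads
```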
---

## ⚙️ Configuration

### Document Processing Settings

```python
# In app.py - Customize these settings
CHUNK_SIZE = 1000       # Characters per chunk
CHUNK_OVERLAP = 200     # Overlap between chunks
EMBEDDING_MODEL = "sentence-transformers/all-miniLM-L6-v2"
RETRIEVER_K = 5         # Number of chunks to retrieve
```
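The effect of `CHUNK_SIZE` and `CHUNK_OVERLAP` can be seen with a simplified fixed-window splitter. The app itself uses LangChain's `RecursiveCharacterTextSplitter`, which splits on separators rather than fixed offsets, so this is only an illustration of the size/overlap semantics:

```python
def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Fixed-window splitter: each chunk starts chunk_size - chunk_overlap
    characters after the previous one, so neighbours share context."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = split_text("abcdefghij" * 30, chunk_size=100, chunk_overlap=20)
print(len(chunks), len(chunks[0]))  # 4 chunks; the first is 100 chars
```

Note how the tail of each chunk reappears at the head of the next one; that overlap is what keeps sentences from being cut off at chunk boundaries.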
### Model Configuration

```python
# LLM Settings
LLM_PROVIDER = "groq"
MODEL_NAME = "llama-3.1-70b-versatile"
TEMPERATURE = 0.7
MAX_TOKENS = 2048
```

### Search Configuration

```python
# Hybrid Search Weights
FAISS_WEIGHT = 0.6   # Semantic search weight
BM25_WEIGHT = 0.4    # Keyword search weight
```
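A sketch of how such weights blend the two retrievers' results. LangChain's `EnsembleRetriever` actually fuses *ranks* (weighted reciprocal rank fusion) rather than raw scores, so the weighted-score version below is a simplified illustration of the same idea:

```python
def hybrid_scores(semantic, keyword, faiss_weight=0.6, bm25_weight=0.4):
    """Blend two per-document score dicts into one ranked list of doc ids."""
    docs = set(semantic) | set(keyword)
    fused = {
        d: faiss_weight * semantic.get(d, 0.0) + bm25_weight * keyword.get(d, 0.0)
        for d in docs
    }
    return sorted(fused, key=fused.get, reverse=True)

semantic = {"doc_a": 0.9, "doc_b": 0.4}   # e.g. cosine similarities from FAISS
keyword = {"doc_b": 1.0, "doc_c": 0.7}    # e.g. normalized BM25 scores
print(hybrid_scores(semantic, keyword))   # -> ['doc_b', 'doc_a', 'doc_c']
```

A document that scores moderately on *both* signals (`doc_b`) can outrank one that excels on only one, which is the point of hybrid search.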
---

## 📚 API Reference

### Endpoints

#### Upload Document

```http
POST /upload
Content-Type: multipart/form-data

{
  "file": <binary>
}
```

**Response**:
```json
{
  "status": "success",
  "message": "Document processed successfully",
  "filename": "example.pdf",
  "chunks": 45
}
```

#### Chat

```http
POST /chat
Content-Type: application/json

{
  "message": "What is the main topic?",
  "stream": true
}
```

**Response** (Streaming):
```
data: {"token": "The", "done": false}
data: {"token": " main", "done": false}
data: {"token": " topic", "done": false}
data: {"done": true}
```
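A stream in this shape can be reassembled line by line on the client; a minimal Python sketch (the event format above is the only assumption carried over):

```python
import json

def assemble_stream(lines):
    """Collect `data: {...}` events until one arrives with done=true."""
    tokens = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # ignore keep-alives and blank separator lines
        event = json.loads(line[len("data: "):])
        if event.get("done"):
            break
        tokens.append(event["token"])
    return "".join(tokens)

stream = [
    'data: {"token": "The", "done": false}',
    'data: {"token": " main", "done": false}',
    'data: {"token": " topic", "done": false}',
    'data: {"done": true}',
]
print(assemble_stream(stream))  # -> The main topic
```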
#### Clear Session

```http
POST /clear
```

**Response**:
```json
{
  "status": "success",
  "message": "Session cleared"
}
```

---

## 🔧 Troubleshooting

### Common Issues

#### 1. Permission Errors in Docker

**Problem**: `Permission denied` when writing to cache directories

**Solution**:
```bash
# Rebuild with proper permissions
docker build --no-cache -t cognichat .

# Or run with volume permissions
docker run -v $(pwd)/cache:/tmp/huggingface_cache \
  --user $(id -u):$(id -g) \
  cognichat
```

#### 2. Model Loading Fails

**Problem**: Cannot download HuggingFace models

**Solution**:
```bash
# Pre-download models
python test_embeddings.py

# Or use the HF_HOME environment variable
export HF_HOME=/path/to/writable/directory
```
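When downloads do fail, the app falls back through alternative model names (primary `sentence-transformers/all-miniLM-L6-v2`, then `all-miniLM-L6-v2`, then a default without a cache specification). That try-in-order pattern, sketched with a generic loader (the real code wraps `HuggingFaceEmbeddings` with specific cache arguments):

```python
def load_with_fallbacks(load_fn, candidates):
    """Try each candidate model name in order; return the first that loads."""
    last_error = None
    for name in candidates:
        try:
            return name, load_fn(name)
        except Exception as exc:  # remember the failure, try the next one
            last_error = exc
    raise RuntimeError(f"Could not load any embedding model: {last_error}")

# Stub loader that only "knows" the short name, for illustration.
def stub_loader(name):
    if name != "all-miniLM-L6-v2":
        raise OSError(f"cannot download {name}")
    return object()

name, model = load_with_fallbacks(
    stub_loader,
    ["sentence-transformers/all-miniLM-L6-v2", "all-miniLM-L6-v2"],
)
print(name)  # -> all-miniLM-L6-v2
```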
#### 3. Chat Returns 400 Error

**Problem**: Upload directory not writable (common in HF Spaces)

**Solution**: The application now automatically uses `/tmp/uploads` in the HF Spaces environment. Ensure the latest version is deployed.

#### 4. API Key Invalid

**Problem**: Groq API returns an authentication error

**Solution**:
- Verify the key at [Groq Console](https://console.groq.com/keys)
- Check the `.env` file has the correct format: `GROQ_API_KEY=gsk_...`
- Restart the application after updating the key

### Debug Mode

Enable detailed logging:

```bash
export FLASK_DEBUG=1
export LANGCHAIN_VERBOSE=true
python app.py
```

---

## 🧪 Testing

```bash
# Run test suite
pytest tests/

# Test embedding model
python test_embeddings.py

# Test document processing
pytest tests/test_document_processor.py

# Integration tests
pytest tests/test_integration.py
```

---

## 🤝 Contributing

We welcome contributions! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

### Development Guidelines

- Follow the PEP 8 style guide
- Add tests for new features
- Update documentation
- Ensure the Docker build succeeds

---

## 📝 Changelog

### Version 2.0 (October 2025)

✅ **Major Improvements**:
- Fixed Docker permission issues
- HuggingFace Spaces compatibility
- Enhanced error handling
- Multiple model loading fallbacks
- Improved security (non-root execution)

✅ **Bug Fixes**:
- Upload directory write permissions
- Cache directory access
- Model initialization reliability

### Version 1.0 (Initial Release)

- Basic RAG functionality
- PDF and DOCX support
- FAISS vector store
- Conversational memory

---

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## 🙏 Acknowledgments

- **LangChain** for the RAG framework
- **Groq** for high-speed LLM inference
- **HuggingFace** for embeddings and hosting
- **FAISS** for efficient vector search

---

## 📞 Support

- **Issues**: [GitHub Issues](https://github.com/yourusername/cognichat/issues)
- **Discussions**: [GitHub Discussions](https://github.com/yourusername/cognichat/discussions)
- **Email**: riteshraut123321@gmail.com

---

<div align="center">

**Made with ❤️ by the CogniChat Team**

</div>
app.py
CHANGED

```diff
@@ -6,14 +6,13 @@ import uuid
 from flask import Flask, request, render_template, session, jsonify, Response, stream_with_context
 from werkzeug.utils import secure_filename
 from rag_processor import create_rag_chain
 
-# ============================ ADDITIONS START ============================
 from gtts import gTTS
 import io
-import re
-
-# Document Loaders
 from langchain_community.document_loaders import (
     TextLoader,
     PyPDFLoader,
@@ -22,28 +21,56 @@ from langchain_community.document_loaders import (
 
 # Additional imports for robust PDF handling
 from langchain_core.documents import Document
-import fitz
 
 # Text Splitter, Embeddings, Retrievers
 from langchain.text_splitter import RecursiveCharacterTextSplitter
-from
 from langchain_community.vectorstores import FAISS
-from langchain.retrievers import EnsembleRetriever
 from langchain_community.retrievers import BM25Retriever
 from langchain_community.chat_message_histories import ChatMessageHistory
 
-# --- Basic Flask App Setup ---
 app = Flask(__name__)
 app.config['SECRET_KEY'] = os.urandom(24)
 
-
 is_hf_spaces = bool(os.getenv("SPACE_ID") or os.getenv("SPACES_ZERO_GPU"))
 if is_hf_spaces:
     app.config['UPLOAD_FOLDER'] = '/tmp/uploads'
 else:
     app.config['UPLOAD_FOLDER'] = 'uploads'
 
-
 try:
     os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
     print(f"✓ Upload folder ready: {app.config['UPLOAD_FOLDER']}")
@@ -54,21 +81,23 @@ except Exception as e:
     os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
     print(f"✓ Using fallback upload folder: {app.config['UPLOAD_FOLDER']}")
 
-
 rag_chains = {}
 message_histories = {}
 
-# Load the embedding model once when the application starts for efficiency.
 print("Loading embedding model...")
 
-
 cache_base = os.path.expanduser("~/.cache") if os.path.expanduser("~") != "~" else "/tmp/hf_cache"
 os.environ.setdefault('HF_HOME', f'{cache_base}/huggingface')
 os.environ.setdefault('HF_HUB_CACHE', f'{cache_base}/huggingface/hub')
 os.environ.setdefault('TRANSFORMERS_CACHE', f'{cache_base}/transformers')
 os.environ.setdefault('SENTENCE_TRANSFORMERS_HOME', f'{cache_base}/sentence_transformers')
 
-
 cache_dirs = [
     os.environ['HF_HOME'],
     os.environ['HF_HUB_CACHE'],
@@ -103,6 +132,8 @@ for cache_dir in cache_dirs:
     except Exception as e:
         print(f"Warning: Could not create {cache_dir}: {e}")
 
 # Try loading embedding model with error handling and fallbacks
 try:
     print("Attempting to load embedding model...")
@@ -135,6 +166,13 @@ except Exception as e:
         print(f"Final attempt failed: {e3}")
         # Use a simpler fallback model or raise the error
         raise Exception(f"Could not load any embedding model. Last error: {e3}")
 
 def load_pdf_with_fallback(filepath):
     """
@@ -336,12 +374,19 @@ def upload_files():
         retrievers=[bm25_retriever, faiss_retriever],
         weights=[0.5, 0.5]
     )
 
     session_id = str(uuid.uuid4())
     print(f"Creating RAG chain for session {session_id}...")
 
     try:
-        rag_chain = create_rag_chain(
         rag_chains[session_id] = rag_chain
         print(f"✓ RAG chain created successfully for session {session_id} with {len(processed_files)} documents.")
     except Exception as rag_error:
@@ -443,7 +488,6 @@ def chat():
         print(f"Error during chat invocation: {e}")
         return Response("An error occurred while getting the answer.", status=500, mimetype='text/plain')
 
-# ============================ ADDITIONS START ============================
 
 def clean_markdown_for_tts(text: str) -> str:
     """Removes markdown formatting for cleaner text-to-speech output."""
@@ -484,7 +528,7 @@ def text_to_speech():
     except Exception as e:
         print(f"Error in TTS generation: {e}")
         return jsonify({'status': 'error', 'message': 'Failed to generate audio.'}), 500
-
 
 
 @app.route('/debug', methods=['GET'])
```
| 6 |
from flask import Flask, request, render_template, session, jsonify, Response, stream_with_context
|
| 7 |
from werkzeug.utils import secure_filename
|
| 8 |
from rag_processor import create_rag_chain
|
| 9 |
+
from typing import Sequence, Any
|
| 10 |
|
|
|
|
| 11 |
from gtts import gTTS
|
| 12 |
import io
|
| 13 |
+
import re
|
| 14 |
+
|
| 15 |
|
|
|
|
| 16 |
from langchain_community.document_loaders import (
|
| 17 |
TextLoader,
|
| 18 |
PyPDFLoader,
|
|
|
|
| 21 |
|
| 22 |
# Additional imports for robust PDF handling
|
| 23 |
from langchain_core.documents import Document
|
| 24 |
+
import fitz
|
| 25 |
|
| 26 |
# Text Splitter, Embeddings, Retrievers
|
| 27 |
from langchain.text_splitter import RecursiveCharacterTextSplitter
|
| 28 |
+
from langchain_huggingface import HuggingFaceEmbeddings
|
| 29 |
from langchain_community.vectorstores import FAISS
|
| 30 |
+
from langchain.retrievers import EnsembleRetriever, ContextualCompressionRetriever
|
| 31 |
+
from langchain.retrievers.document_compressors.base import BaseDocumentCompressor
|
| 32 |
from langchain_community.retrievers import BM25Retriever
|
| 33 |
from langchain_community.chat_message_histories import ChatMessageHistory
|
| 34 |
+
from sentence_transformers.cross_encoder import CrossEncoder
|
| 35 |
+
import numpy as np
|
| 36 |
+
|
| 37 |
|
|
|
|
| 38 |
app = Flask(__name__)
|
| 39 |
app.config['SECRET_KEY'] = os.urandom(24)
|
| 40 |
|
| 41 |
+
|
| 42 |
+
class LocalReranker(BaseDocumentCompressor):
|
| 43 |
+
model: Any
|
| 44 |
+
top_n: int = 5
|
| 45 |
+
|
| 46 |
+
class Config:
|
| 47 |
+
arbitrary_types_allowed = True
|
| 48 |
+
|
| 49 |
+
def compress_documents(
|
| 50 |
+
self,
|
| 51 |
+
documents: Sequence[Document],
|
| 52 |
+
query: str,
|
| 53 |
+
callbacks=None,
|
| 54 |
+
) -> Sequence[Document]:
|
| 55 |
+
if not documents:
|
| 56 |
+
return []
|
| 57 |
+
|
| 58 |
+
pairs = [[query, doc.page_content] for doc in documents]
|
| 59 |
+
scores = self.model.predict(pairs, show_progress_bar=False)
|
| 60 |
+
|
| 61 |
+
doc_scores = list(zip(documents, scores))
|
| 62 |
+
sorted_doc_scores = sorted(doc_scores, key=lambda x: x[1], reverse=True)
|
| 63 |
+
|
| 64 |
+
return [doc for doc, score in sorted_doc_scores[:self.top_n]]
|
| 65 |
+
|
| 66 |
+
|
```diff
 is_hf_spaces = bool(os.getenv("SPACE_ID") or os.getenv("SPACES_ZERO_GPU"))
 if is_hf_spaces:
     app.config['UPLOAD_FOLDER'] = '/tmp/uploads'
 else:
     app.config['UPLOAD_FOLDER'] = 'uploads'

 try:
     os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
     print(f"✓ Upload folder ready: {app.config['UPLOAD_FOLDER']}")
     ...
     os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
     print(f"✓ Using fallback upload folder: {app.config['UPLOAD_FOLDER']}")
```
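The Space detection above keys off environment variables that Hugging Face sets; on Spaces only `/tmp` is reliably writable. The same decision can be sketched as a small pure function (the function name is illustrative, not the app's):

```python
# Mirror of the upload-folder choice: Spaces sets SPACE_ID
# (and SPACES_ZERO_GPU on ZeroGPU hardware).
def pick_upload_folder(env):
    on_spaces = bool(env.get("SPACE_ID") or env.get("SPACES_ZERO_GPU"))
    return "/tmp/uploads" if on_spaces else "uploads"
```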
```diff
 rag_chains = {}
 message_histories = {}

 print("Loading embedding model...")

 cache_base = os.path.expanduser("~/.cache") if os.path.expanduser("~") != "~" else "/tmp/hf_cache"
 os.environ.setdefault('HF_HOME', f'{cache_base}/huggingface')
 os.environ.setdefault('HF_HUB_CACHE', f'{cache_base}/huggingface/hub')
 os.environ.setdefault('TRANSFORMERS_CACHE', f'{cache_base}/transformers')
 os.environ.setdefault('SENTENCE_TRANSFORMERS_HOME', f'{cache_base}/sentence_transformers')
```
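The cache paths use `os.environ.setdefault`, which only writes the fallback when the variable is unset, so a cache location already configured by the platform always wins:

```python
import os

# DEMO_HF_HOME is an illustrative variable name, not one the app uses.
os.environ.pop("DEMO_HF_HOME", None)                   # start unset
os.environ.setdefault("DEMO_HF_HOME", "/tmp/hf_cache/huggingface")
first = os.environ["DEMO_HF_HOME"]
os.environ.setdefault("DEMO_HF_HOME", "/other/path")   # ignored: already set
second = os.environ["DEMO_HF_HOME"]
```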
```diff
 cache_dirs = [
     os.environ['HF_HOME'],
     os.environ['HF_HUB_CACHE'],
     ...
 except Exception as e:
     print(f"Warning: Could not create {cache_dir}: {e}")

 # Try loading embedding model with error handling and fallbacks
 try:
     print("Attempting to load embedding model...")
     ...
     print(f"Final attempt failed: {e3}")
     # Use a simpler fallback model or raise the error
     raise Exception(f"Could not load any embedding model. Last error: {e3}")

+print("Loading local re-ranking model...")
+RERANKER_MODEL = CrossEncoder("mixedbread-ai/mxbai-rerank-xsmall-v1", device='cpu')
+print("Re-ranking model loaded successfully.")
```
```diff
 def load_pdf_with_fallback(filepath):
     """
     ...
```
```diff
     retrievers=[bm25_retriever, faiss_retriever],
     weights=[0.5, 0.5]
 )
+reranker = LocalReranker(model=RERANKER_MODEL, top_n=3)
+
+compression_retriever = ContextualCompressionRetriever(
+    base_compressor=reranker,
+    base_retriever=ensemble_retriever
+)
```
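This wiring gives two-stage retrieval: the ensemble merges BM25 and FAISS results first, and only then does the cross-encoder rerank the merged list. LangChain's `EnsembleRetriever` fuses the rankings with weighted Reciprocal Rank Fusion, roughly as sketched below (`c=60` is the commonly used constant; treat the exact details as an approximation of the library's behavior):

```python
# Weighted Reciprocal Rank Fusion: each retriever contributes
# weight / (c + rank) for every document it returns.
def weighted_rrf(rankings, weights, c=60):
    scores = {}
    for ranked_docs, weight in zip(rankings, weights):
        for rank, doc in enumerate(ranked_docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + weight / (c + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc1", "doc2", "doc3"]
faiss_hits = ["doc2", "doc3", "doc1"]
fused = weighted_rrf([bm25_hits, faiss_hits], [0.5, 0.5])
# doc2 ranks first: it places high in both lists
```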
```diff
 session_id = str(uuid.uuid4())
 print(f"Creating RAG chain for session {session_id}...")

 try:
+    rag_chain = create_rag_chain(compression_retriever, get_session_history)
     rag_chains[session_id] = rag_chain
     print(f"✓ RAG chain created successfully for session {session_id} with {len(processed_files)} documents.")
 except Exception as rag_error:
     ...
     print(f"Error during chat invocation: {e}")
     return Response("An error occurred while getting the answer.", status=500, mimetype='text/plain')
```
```diff
 def clean_markdown_for_tts(text: str) -> str:
     """Removes markdown formatting for cleaner text-to-speech output."""
     ...
```
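The body of `clean_markdown_for_tts` is not shown in this hunk; a hedged sketch of the kind of stripping such a helper typically does, so the speech engine doesn't read formatting characters aloud:

```python
import re

# Illustrative markdown cleaner (not the app's actual implementation).
def strip_markdown(text):
    text = re.sub(r"```.*?```", "", text, flags=re.DOTALL)     # code fences
    text = re.sub(r"`([^`]+)`", r"\1", text)                   # inline code
    text = re.sub(r"\*\*([^*]+)\*\*", r"\1", text)             # bold
    text = re.sub(r"\*([^*]+)\*", r"\1", text)                 # italics
    text = re.sub(r"^#+\s*", "", text, flags=re.MULTILINE)     # headings
    return text.strip()
```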
```diff
 except Exception as e:
     print(f"Error in TTS generation: {e}")
     return jsonify({'status': 'error', 'message': 'Failed to generate audio.'}), 500


 @app.route('/debug', methods=['GET'])
```
rag_processor.py
CHANGED

```diff
@@ -83,6 +83,7 @@ Standalone Question:"""
 rag_template = """You are an expert assistant named `Cognichat`. Whenever the user asks who you are, simply say you are `Cognichat`.
 You are developed by Ritesh and Alish.
 Your job is to provide accurate and helpful answers based ONLY on the provided context.
+Every user question is about the document, so answer based on the document only.
 If the information is not in the context, clearly state that you don't know the answer.
 Provide a clear and concise answer.
```
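At query time, a template like the one above is filled with the retrieved context and the user's question before being sent to the LLM. A minimal illustration of that substitution (the template text and variable names here are simplified, not the app's actual ones):

```python
# Toy version of filling a RAG prompt template with retrieved context.
template = (
    "Answer ONLY from the context below.\n"
    "Context:\n{context}\n\n"
    "Question: {question}\nAnswer:"
)
prompt = template.format(
    context="CogniChat supports PDF, TXT, DOCX and image uploads.",
    question="Which file types can I upload?",
)
```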