---
title: CogniChat - Chat with Your Documents
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860
---
# πŸ€– CogniChat - Intelligent Document Chat System
<div align="center">
![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Python](https://img.shields.io/badge/python-3.9+-brightgreen.svg)
![Docker](https://img.shields.io/badge/docker-ready-blue.svg)
![HuggingFace](https://img.shields.io/badge/πŸ€—-Spaces-yellow.svg)
**Transform your documents into interactive conversations powered by advanced RAG technology**
<p align="center">
<img src="Document_reader.gif" width="100%" alt="CogniChat Demo">
</p>
[Features](#-features) β€’ [Quick Start](#-quick-start) β€’ [Architecture](#-architecture) β€’ [Deployment](#-deployment) β€’ [API](#-api-reference)
</div>
---
## πŸ“‹ Table of Contents
- [Overview](#-overview)
- [Features](#-features)
- [Architecture](#-architecture)
- [Technology Stack](#-technology-stack)
- [Quick Start](#-quick-start)
- [Deployment](#-deployment)
- [Configuration](#-configuration)
- [API Reference](#-api-reference)
- [Troubleshooting](#-troubleshooting)
- [Contributing](#-contributing)
- [License](#-license)
---
## 🎯 Overview
CogniChat is a production-ready, intelligent document chat application that leverages **Retrieval-Augmented Generation (RAG)** to enable natural conversations with your documents. Built with enterprise-grade technologies, it provides accurate, context-aware responses from your document corpus.
### Why CogniChat?
- **πŸ”‰ Audio Overviews**: Ask a question and listen to the spoken answer. Now your documents can talk back.
- **🎯 Accurate Retrieval**: Hybrid search combining BM25 and FAISS for optimal results
- **πŸ’¬ Conversational Memory**: Maintains context across multiple interactions
- **πŸ“„ Multi-Format Support**: Handles PDF, DOCX, TXT, and image files
- **πŸš€ Production Ready**: Docker support, comprehensive error handling, and security best practices
- **🎨 Modern UI**: Responsive design with dark mode and real-time streaming
---
## ✨ Features
### Core Capabilities
| Feature | Description |
|---------|-------------|
| **Multi-Format Processing** | Upload and process PDF, DOCX, TXT, and image files |
| **Hybrid Search** | Combines BM25 (keyword) and FAISS (semantic) for superior retrieval |
| **Conversational AI** | Powered by Groq's Llama 3.1 for intelligent responses |
| **Memory Management** | Maintains chat history for contextual conversations |
| **Text-to-Speech** | Built-in TTS for audio playback of responses |
| **Streaming Responses** | Real-time token streaming for better UX |
| **Document Chunking** | Intelligent text splitting for optimal context windows |
### Advanced Features
- **Semantic Embeddings**: HuggingFace `all-MiniLM-L6-v2` for accurate vector representations
- **Reranking**: Contextual compression for improved relevance
- **Error Handling**: Comprehensive fallback mechanisms and error recovery
- **Security**: Non-root Docker execution and environment-based secrets
- **Scalability**: Optimized for both local and cloud deployments
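To make the reranking stage concrete, here is a toy sketch. In the app the scoring is done by a model (contextual compression over retrieved chunks); in this runnable illustration a simple word-overlap score stands in for the model, so only the control flow, not the scorer, reflects the real pipeline.

```python
# Toy reranker: re-order retrieved chunks by query relevance, keep top_n.
# The overlap score below is a stand-in for a learned relevance model.

def rerank(query, chunks, top_n=3):
    """Sort chunks by a query-relevance score and keep the best top_n."""
    q_words = set(query.lower().split())
    def score(chunk):
        c_words = set(chunk.lower().split())
        return len(q_words & c_words) / (len(q_words) or 1)
    return sorted(chunks, key=score, reverse=True)[:top_n]

chunks = [
    "Invoices are due within 30 days.",
    "The main topic is hybrid document retrieval.",
    "Appendix B lists abbreviations.",
]
top = rerank("What is the main topic?", chunks, top_n=2)
```

Swapping the `score` function for a real cross-encoder call is the only change needed to turn this sketch into a model-based reranker.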
---
## πŸ— Architecture
### RAG Pipeline Overview
```mermaid
graph TB
A[Document Upload] --> B[Document Processing]
B --> C[Text Extraction]
C --> D[Chunking Strategy]
D --> E[Embedding Generation]
E --> F[Vector Store FAISS]
G[User Query] --> H[Query Embedding]
H --> I[Hybrid Retrieval]
F --> I
J[BM25 Index] --> I
I --> K[Reranking]
K --> L[Context Assembly]
L --> M[LLM Groq Llama 3.1]
M --> N[Response Generation]
N --> O[Streaming Output]
P[Chat History] --> M
N --> P
style A fill:#e1f5ff
style G fill:#e1f5ff
style F fill:#ffe1f5
style J fill:#ffe1f5
style M fill:#f5e1ff
style O fill:#e1ffe1
```
### System Architecture
```mermaid
graph LR
A[Client Browser] -->|HTTP/WebSocket| B[Flask Server]
B --> C[Document Processor]
B --> D[RAG Engine]
B --> E[TTS Service]
C --> F[(File Storage)]
D --> G[(FAISS Vector DB)]
D --> H[(BM25 Index)]
D --> I[Groq API]
J[HuggingFace Models] --> D
style B fill:#4a90e2
style D fill:#e24a90
style I fill:#90e24a
```
### Data Flow
1. **Document Ingestion**: Files are uploaded and validated
2. **Processing Pipeline**: Text extraction β†’ Chunking β†’ Embedding
3. **Indexing**: Dual indexing (FAISS + BM25) for hybrid search
4. **Query Processing**: User queries are embedded and searched
5. **Retrieval**: Top-k relevant chunks retrieved using hybrid approach
6. **Generation**: LLM generates contextual responses with citations
7. **Streaming**: Responses streamed back to client in real-time
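Step 2 (chunking) can be sketched in a few lines. The app itself uses a LangChain text splitter; this minimal, self-contained version only illustrates how a fixed character window with overlap produces the chunks that get embedded.

```python
# Minimal fixed-window chunker with overlap (illustrative, not the app's code).

def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into chunk_size-character windows that overlap by `overlap`."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), step)]
    # Drop a trailing fragment that is fully contained in the previous chunk.
    if len(chunks) > 1 and len(chunks[-1]) <= overlap:
        chunks.pop()
    return chunks

chunks = chunk_text("x" * 2500, chunk_size=1000, overlap=200)
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side, at the cost of some duplicated text in the index.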
---
## πŸ›  Technology Stack
### Backend
| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Framework** | Flask 2.3+ | Web application framework |
| **RAG** | LangChain | RAG pipeline orchestration |
| **Vector DB** | FAISS | Fast similarity search |
| **Keyword Search** | BM25 | Sparse retrieval |
| **LLM** | Groq Llama 3.1 | Response generation |
| **Embeddings** | HuggingFace Transformers | Semantic embeddings |
| **Doc Processing** | Unstructured, PyPDF, python-docx | Multi-format parsing |
### Frontend
| Component | Technology |
|-----------|-----------|
| **UI Framework** | TailwindCSS |
| **JavaScript** | Vanilla ES6+ |
| **Icons** | Font Awesome |
| **Markdown** | Marked.js |
### Infrastructure
- **Containerization**: Docker + Docker Compose
- **Deployment**: HuggingFace Spaces, local, cloud-agnostic
- **Security**: Environment-based secrets, non-root execution
---
## πŸš€ Quick Start
### Prerequisites
- Python 3.9+
- Docker (optional, recommended)
- Groq API Key ([Get one here](https://console.groq.com/keys))
### Installation Methods
#### 🐳 Method 1: Docker (Recommended)
```bash
# Clone the repository
git clone https://github.com/RautRitesh/Chat-with-docs
cd Chat-with-docs
# Create environment file
cp .env.example .env
# Add your Groq API key to .env
echo "GROQ_API_KEY=your_actual_api_key_here" >> .env
# Build and run with Docker Compose
docker-compose up -d
# Or build manually
docker build -t cognichat .
docker run -p 7860:7860 --env-file .env cognichat
```
#### 🐍 Method 2: Local Python Environment
```bash
# Clone the repository
git clone https://github.com/RautRitesh/Chat-with-docs
cd Chat-with-docs
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set environment variables
export GROQ_API_KEY=your_actual_api_key_here
# Run the application
python app.py
```
#### πŸ€— Method 3: HuggingFace Spaces
1. Fork this repository
2. Create a new Space on [HuggingFace](https://huggingface.co/spaces)
3. Link your forked repository
4. Add `GROQ_API_KEY` in Settings β†’ Repository Secrets
5. Space will auto-deploy!
### First Steps
1. Open `http://localhost:7860` in your browser
2. Upload a document (PDF, DOCX, TXT, or image)
3. Wait for processing (progress indicator will show status)
4. Start chatting with your document!
5. Use the πŸ”Š button to hear responses via TTS
---
## πŸ“¦ Deployment
### Environment Variables
Create a `.env` file with the following variables:
```bash
# Required
GROQ_API_KEY=your_groq_api_key_here
# Optional
PORT=7860
HF_HOME=/tmp/huggingface_cache # For HF Spaces
FLASK_DEBUG=0 # Set to 1 for development
MAX_UPLOAD_SIZE=10485760 # 10MB default
```
### Docker Deployment
```bash
# Production build
docker build -t cognichat:latest .
# Run with resource limits
docker run -d \
--name cognichat \
-p 7860:7860 \
--env-file .env \
--memory="2g" \
--cpus="1.5" \
cognichat:latest
```
### Docker Compose
```yaml
version: '3.8'
services:
cognichat:
build: .
ports:
- "7860:7860"
environment:
- GROQ_API_KEY=${GROQ_API_KEY}
volumes:
- ./data:/app/data
restart: unless-stopped
```
### HuggingFace Spaces Configuration
Add these files to your repository:
**app_port** in `README.md` header:
```yaml
app_port: 7860
```
**Repository Secrets**:
- `GROQ_API_KEY`: Your Groq API key
The application automatically detects HF Spaces environment and adjusts paths accordingly.
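The detection logic can be sketched as below. This is a hedged illustration, not the app's exact code: it assumes the `SPACE_ID` environment variable (which HuggingFace Spaces sets) is the signal, and uses the `/tmp/uploads` path mentioned in Troubleshooting.

```python
# Illustrative environment detection: pick a writable upload directory.
import os

def resolve_upload_dir(environ=os.environ):
    """Return /tmp/uploads inside a HF Space, ./uploads otherwise."""
    if environ.get("SPACE_ID"):  # set by HuggingFace Spaces at runtime
        return "/tmp/uploads"
    return os.path.join(os.getcwd(), "uploads")
```

Passing `environ` as a parameter keeps the function easy to unit-test without touching the real process environment.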
---
## βš™οΈ Configuration
### Document Processing Settings
```python
# In app.py - Customize these settings
CHUNK_SIZE = 1000 # Characters per chunk
CHUNK_OVERLAP = 200 # Overlap between chunks
EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
RETRIEVER_K = 5 # Number of chunks to retrieve
```
### Model Configuration
```python
# LLM Settings
LLM_PROVIDER = "groq"
MODEL_NAME = "llama-3.1-70b-versatile"
TEMPERATURE = 0.7
MAX_TOKENS = 2048
```
### Search Configuration
```python
# Hybrid Search Weights
FAISS_WEIGHT = 0.6 # Semantic search weight
BM25_WEIGHT = 0.4 # Keyword search weight
```
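For intuition, here is one way the two weights above could combine the retrievers' outputs. This is illustrative only (the app delegates fusion to its retrieval framework); it min-max-normalizes each retriever's scores so the weights compare like with like, then takes a weighted sum per chunk.

```python
# Illustrative hybrid score fusion using the configured weights.

FAISS_WEIGHT = 0.6  # semantic search weight
BM25_WEIGHT = 0.4   # keyword search weight

def fuse(semantic, keyword):
    """Weighted sum of min-max-normalized scores, keyed by chunk id."""
    def norm(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero when all scores tie
        return {k: (v - lo) / span for k, v in scores.items()}
    sem, kw = norm(semantic), norm(keyword)
    return {k: FAISS_WEIGHT * sem.get(k, 0.0) + BM25_WEIGHT * kw.get(k, 0.0)
            for k in set(sem) | set(kw)}

fused = fuse({"c1": 0.9, "c2": 0.2}, {"c1": 1.0, "c2": 5.0})
```

Normalization matters because FAISS similarities and raw BM25 scores live on different scales; without it, one retriever would silently dominate regardless of the weights.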
---
## πŸ“š API Reference
### Endpoints
#### Upload Document
```http
POST /upload
Content-Type: multipart/form-data
{
"file": <binary>
}
```
**Response**:
```json
{
"status": "success",
"message": "Document processed successfully",
"filename": "example.pdf",
"chunks": 45
}
```
#### Chat
```http
POST /chat
Content-Type: application/json
{
"message": "What is the main topic?",
"stream": true
}
```
**Response** (Streaming):
```
data: {"token": "The", "done": false}
data: {"token": " main", "done": false}
data: {"token": " topic", "done": false}
data: {"done": true}
```
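A client can reassemble the stream above by parsing each `data:` line as JSON and concatenating tokens until a payload with `"done": true` arrives. The sketch below shows only the parsing step, on lines already read from the response:

```python
# Assemble streamed /chat tokens into the full response text.
import json

def assemble(stream_lines):
    """Join streamed tokens; stop at the terminal done=true payload."""
    text = []
    for line in stream_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = json.loads(line[len("data: "):])
        if payload.get("done"):
            break
        text.append(payload["token"])
    return "".join(text)

raw = [
    'data: {"token": "The", "done": false}',
    'data: {"token": " main", "done": false}',
    'data: {"token": " topic", "done": false}',
    'data: {"done": true}',
]
answer = assemble(raw)
```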
#### Clear Session
```http
POST /clear
```
**Response**:
```json
{
"status": "success",
"message": "Session cleared"
}
```
---
## πŸ”§ Troubleshooting
### Common Issues
#### 1. Permission Errors in Docker
**Problem**: `Permission denied` when writing to cache directories
**Solution**:
```bash
# Rebuild with proper permissions
docker build --no-cache -t cognichat .
# Or run with volume permissions
docker run -v $(pwd)/cache:/tmp/huggingface_cache \
--user $(id -u):$(id -g) \
cognichat
```
#### 2. Model Loading Fails
**Problem**: Cannot download HuggingFace models
**Solution**:
```bash
# Pre-download models
python test_embeddings.py
# Or use HF_HOME environment variable
export HF_HOME=/path/to/writable/directory
```
#### 3. Chat Returns 400 Error
**Problem**: Upload directory not writable (common in HF Spaces)
**Solution**: The application automatically uses `/tmp/uploads` in the HF Spaces environment. Ensure the latest version is deployed.
#### 4. API Key Invalid
**Problem**: Groq API returns authentication error
**Solution**:
- Verify key at [Groq Console](https://console.groq.com/keys)
- Check `.env` file has correct format: `GROQ_API_KEY=gsk_...`
- Restart application after updating key
### Debug Mode
Enable detailed logging:
```bash
export FLASK_DEBUG=1
export LANGCHAIN_VERBOSE=true
python app.py
```
---
## πŸ§ͺ Testing
```bash
# Run test suite
pytest tests/
# Test embedding model
python test_embeddings.py
# Test document processing
pytest tests/test_document_processor.py
# Integration tests
pytest tests/test_integration.py
```
---
## 🀝 Contributing
We welcome contributions! Please follow these steps:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
### Development Guidelines
- Follow PEP 8 style guide
- Add tests for new features
- Update documentation
- Ensure Docker build succeeds
---
## πŸ“ Changelog
### Version 2.0 (October 2025)
βœ… **Major Improvements**:
- Fixed Docker permission issues
- HuggingFace Spaces compatibility
- Enhanced error handling
- Multiple model loading fallbacks
- Improved security (non-root execution)
βœ… **Bug Fixes**:
- Upload directory write permissions
- Cache directory access
- Model initialization reliability
### Version 1.0 (Initial Release)
- Basic RAG functionality
- PDF and DOCX support
- FAISS vector store
- Conversational memory
---
## πŸ“„ License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
---
## πŸ™ Acknowledgments
- **LangChain** for RAG framework
- **Groq** for high-speed LLM inference
- **HuggingFace** for embeddings and hosting
- **FAISS** for efficient vector search
---
## πŸ“ž Support
- **Issues**: [GitHub Issues](https://github.com/RautRitesh/Chat-with-docs/issues)
- **Discussions**: [GitHub Discussions](https://github.com/RautRitesh/Chat-with-docs/discussions)
- **Email**: riteshraut123321@gmail.com
---
<div align="center">
**Made with ❀️ by the CogniChat Team**
[⭐ Star us on GitHub](https://github.com/RautRitesh/Chat-with-docs) β€’ [πŸ› Report Bug](https://github.com/RautRitesh/Chat-with-docs/issues) β€’ [✨ Request Feature](https://github.com/RautRitesh/Chat-with-docs/issues)
</div>