---
title: CogniChat - Chat with Your Documents
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860
---

# 🤖 CogniChat - Intelligent Document Chat System
![License](https://img.shields.io/badge/license-MIT-blue.svg) ![Python](https://img.shields.io/badge/python-3.9+-brightgreen.svg) ![Docker](https://img.shields.io/badge/docker-ready-blue.svg) ![HuggingFace](https://img.shields.io/badge/🤗-Spaces-yellow.svg)

**Transform your documents into interactive conversations powered by advanced RAG technology**

CogniChat Demo

[Features](#-features) • [Quick Start](#-quick-start) • [Architecture](#-architecture) • [Deployment](#-deployment) • [API](#-api-reference)
---

## 📋 Table of Contents

- [Overview](#-overview)
- [Features](#-features)
- [Architecture](#-architecture)
- [Technology Stack](#-technology-stack)
- [Quick Start](#-quick-start)
- [Deployment](#-deployment)
- [Configuration](#-configuration)
- [API Reference](#-api-reference)
- [Troubleshooting](#-troubleshooting)
- [Contributing](#-contributing)
- [License](#-license)

---

## 🎯 Overview

CogniChat is a production-ready, intelligent document chat application that leverages **Retrieval Augmented Generation (RAG)** to enable natural conversations with your documents. Built with enterprise-grade technologies, it provides accurate, context-aware responses from your document corpus.

### Why CogniChat?

- **🔉 Audio Overview of Your Document**: Ask a question and listen to the answer as audio; your document can now speak with you.
- **🎯 Accurate Retrieval**: Hybrid search combining BM25 and FAISS for optimal results
- **💬 Conversational Memory**: Maintains context across multiple interactions
- **📄 Multi-Format Support**: Handles PDF, DOCX, TXT, and image files
- **🚀 Production Ready**: Docker support, comprehensive error handling, and security best practices
- **🎨 Modern UI**: Responsive design with dark mode and real-time streaming

---

## ✨ Features

### Core Capabilities

| Feature | Description |
|---------|-------------|
| **Multi-Format Processing** | Upload and process PDF, DOCX, TXT, and image files |
| **Hybrid Search** | Combines BM25 (keyword) and FAISS (semantic) for superior retrieval |
| **Conversational AI** | Powered by Groq's Llama 3.1 for intelligent responses |
| **Memory Management** | Maintains chat history for contextual conversations |
| **Text-to-Speech** | Built-in TTS for audio playback of responses |
| **Streaming Responses** | Real-time token streaming for better UX |
| **Document Chunking** | Intelligent text splitting for optimal context windows |

### Advanced Features

- **Semantic Embeddings**: HuggingFace `all-MiniLM-L6-v2` for accurate vector representations
- **Reranking**: Contextual compression for improved relevance
- **Error Handling**: Comprehensive fallback mechanisms and error recovery
- **Security**: Non-root Docker execution and environment-based secrets
- **Scalability**: Optimized for both local and cloud deployments

---

## 🏗 Architecture

### RAG Pipeline Overview

```mermaid
graph TB
    A[Document Upload] --> B[Document Processing]
    B --> C[Text Extraction]
    C --> D[Chunking Strategy]
    D --> E[Embedding Generation]
    E --> F[Vector Store FAISS]
    G[User Query] --> H[Query Embedding]
    H --> I[Hybrid Retrieval]
    F --> I
    J[BM25 Index] --> I
    I --> K[Reranking]
    K --> L[Context Assembly]
    L --> M[LLM Groq Llama 3.1]
    M --> N[Response Generation]
    N --> O[Streaming Output]
    P[Chat History] --> M
    N --> P
    style A fill:#e1f5ff
    style G fill:#e1f5ff
    style F fill:#ffe1f5
    style J fill:#ffe1f5
    style M fill:#f5e1ff
    style O fill:#e1ffe1
```

### System Architecture

```mermaid
graph LR
    A[Client Browser] -->|HTTP/WebSocket| B[Flask Server]
    B --> C[Document Processor]
    B --> D[RAG Engine]
    B --> E[TTS Service]
    C --> F[(File Storage)]
    D --> G[(FAISS Vector DB)]
    D --> H[(BM25 Index)]
    D --> I[Groq API]
    J[HuggingFace Models] --> D
    style B fill:#4a90e2
    style D fill:#e24a90
    style I fill:#90e24a
```

### Data Flow

1. **Document Ingestion**: Files are uploaded and validated
2. **Processing Pipeline**: Text extraction → Chunking → Embedding
3. **Indexing**: Dual indexing (FAISS + BM25) for hybrid search
4. **Query Processing**: User queries are embedded and searched
5. **Retrieval**: Top-k relevant chunks retrieved using the hybrid approach
6. **Generation**: The LLM generates contextual responses with citations
7.
**Streaming**: Responses are streamed back to the client in real time

---

## 🛠 Technology Stack

### Backend

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Framework** | Flask 2.3+ | Web application framework |
| **RAG** | LangChain | RAG pipeline orchestration |
| **Vector DB** | FAISS | Fast similarity search |
| **Keyword Search** | BM25 | Sparse retrieval |
| **LLM** | Groq Llama 3.1 | Response generation |
| **Embeddings** | HuggingFace Transformers | Semantic embeddings |
| **Doc Processing** | Unstructured, PyPDF, python-docx | Multi-format parsing |

### Frontend

| Component | Technology |
|-----------|-----------|
| **UI Framework** | TailwindCSS |
| **JavaScript** | Vanilla ES6+ |
| **Icons** | Font Awesome |
| **Markdown** | Marked.js |

### Infrastructure

- **Containerization**: Docker + Docker Compose
- **Deployment**: HuggingFace Spaces, local, cloud-agnostic
- **Security**: Environment-based secrets, non-root execution

---

## 🚀 Quick Start

### Prerequisites

- Python 3.9+
- Docker (optional, recommended)
- Groq API key ([Get one here](https://console.groq.com/keys))

### Installation Methods

#### 🐳 Method 1: Docker (Recommended)

```bash
# Clone the repository
git clone https://github.com/RautRitesh/Chat-with-docs
cd cognichat

# Create environment file
cp .env.example .env

# Add your Groq API key to .env
echo "GROQA_API_KEY=your_actual_api_key_here" >> .env

# Build and run with Docker Compose
docker-compose up -d

# Or build manually
docker build -t cognichat .
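# Optional addition (not in the original steps): before starting the container,
# confirm the key really landed in .env so the run below doesn't fail silently.
test -f .env && grep -q '^GROQ_API_KEY=' .env && echo ".env looks OK" || echo "GROQ_API_KEY missing from .env"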
docker run -p 7860:7860 --env-file .env cognichat
```

#### 🐍 Method 2: Local Python Environment

```bash
# Clone the repository
git clone https://github.com/RautRitesh/Chat-with-docs
cd cognichat

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export GROQ_API_KEY=your_actual_api_key_here

# Run the application
python app.py
```

#### 🤗 Method 3: HuggingFace Spaces

1. Fork this repository
2. Create a new Space on [HuggingFace](https://huggingface.co/spaces)
3. Link your forked repository
4. Add `GROQ_API_KEY` in Settings → Repository Secrets
5. The Space will auto-deploy!

### First Steps

1. Open `http://localhost:7860` in your browser
2. Upload a document (PDF, DOCX, TXT, or image)
3. Wait for processing (a progress indicator will show status)
4. Start chatting with your document!
5. Use the 🔊 button to hear responses via TTS

---

## 📦 Deployment

### Environment Variables

Create a `.env` file with the following variables:

```bash
# Required
GROQ_API_KEY=your_groq_api_key_here

# Optional
PORT=7860
HF_HOME=/tmp/huggingface_cache  # For HF Spaces
FLASK_DEBUG=0                   # Set to 1 for development
MAX_UPLOAD_SIZE=10485760        # 10MB default
```

### Docker Deployment

```bash
# Production build
docker build -t cognichat:latest .

# Run with resource limits
docker run -d \
  --name cognichat \
  -p 7860:7860 \
  --env-file .env \
  --memory="2g" \
  --cpus="1.5" \
  cognichat:latest
```

### Docker Compose

```yaml
version: '3.8'
services:
  cognichat:
    build: .
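    # Optional additions (assumed values, mirroring the `docker run` limits above):
    # mem_limit: 2g
    # cpus: 1.5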
    ports:
      - "7860:7860"
    environment:
      - GROQ_API_KEY=${GROQ_API_KEY}
    volumes:
      - ./data:/app/data
    restart: unless-stopped
```

### HuggingFace Spaces Configuration

Add these settings to your repository:

**app_port** in the `README.md` header:

```yaml
app_port: 7860
```

**Repository Secrets**:

- `GROQ_API_KEY`: Your Groq API key

The application automatically detects the HF Spaces environment and adjusts paths accordingly.

---

## ⚙️ Configuration

### Document Processing Settings

```python
# In app.py - Customize these settings
CHUNK_SIZE = 1000        # Characters per chunk
CHUNK_OVERLAP = 200      # Overlap between chunks
EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
RETRIEVER_K = 5          # Number of chunks to retrieve
```

### Model Configuration

```python
# LLM Settings
LLM_PROVIDER = "groq"
MODEL_NAME = "llama-3.1-70b-versatile"
TEMPERATURE = 0.7
MAX_TOKENS = 2048
```

### Search Configuration

```python
# Hybrid Search Weights
FAISS_WEIGHT = 0.6  # Semantic search weight
BM25_WEIGHT = 0.4   # Keyword search weight
```

---

## 📚 API Reference

### Endpoints

#### Upload Document

```http
POST /upload
Content-Type: multipart/form-data

{
  "file": <binary file data>
}
```

**Response**:

```json
{
  "status": "success",
  "message": "Document processed successfully",
  "filename": "example.pdf",
  "chunks": 45
}
```

#### Chat

```http
POST /chat
Content-Type: application/json

{
  "message": "What is the main topic?",
  "stream": true
}
```

**Response** (Streaming):

```
data: {"token": "The", "done": false}
data: {"token": " main", "done": false}
data: {"token": " topic", "done": false}
data: {"done": true}
```

#### Clear Session

```http
POST /clear
```

**Response**:

```json
{
  "status": "success",
  "message": "Session cleared"
}
```

---

## 🔧 Troubleshooting

### Common Issues

#### 1. Permission Errors in Docker

**Problem**: `Permission denied` when writing to cache directories

**Solution**:

```bash
# Rebuild with proper permissions
docker build --no-cache -t cognichat .
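# Optional pre-check (not in the original steps): make sure the host cache
# directory exists and is writable before mounting it into the container,
# since an ownership mismatch here is the usual cause of this error.
mkdir -p ./cache && test -w ./cache && echo "cache dir writable"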
# Or run with volume permissions
docker run -v $(pwd)/cache:/tmp/huggingface_cache \
  --user $(id -u):$(id -g) \
  cognichat
```

#### 2. Model Loading Fails

**Problem**: Cannot download HuggingFace models

**Solution**:

```bash
# Pre-download models
python test_embeddings.py

# Or use the HF_HOME environment variable
export HF_HOME=/path/to/writable/directory
```

#### 3. Chat Returns 400 Error

**Problem**: Upload directory not writable (common in HF Spaces)

**Solution**: The application now automatically uses `/tmp/uploads` in the HF Spaces environment. Ensure the latest version is deployed.

#### 4. API Key Invalid

**Problem**: The Groq API returns an authentication error

**Solution**:

- Verify the key at the [Groq Console](https://console.groq.com/keys)
- Check that the `.env` file has the correct format: `GROQ_API_KEY=gsk_...`
- Restart the application after updating the key

### Debug Mode

Enable detailed logging:

```bash
export FLASK_DEBUG=1
export LANGCHAIN_VERBOSE=true
python app.py
```

---

## 🧪 Testing

```bash
# Run the test suite
pytest tests/

# Test the embedding model
python test_embeddings.py

# Test document processing
pytest tests/test_document_processor.py

# Integration tests
pytest tests/test_integration.py
```

---

## 🤝 Contributing

We welcome contributions! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5.
Open a Pull Request

### Development Guidelines

- Follow the PEP 8 style guide
- Add tests for new features
- Update documentation
- Ensure the Docker build succeeds

---

## 📝 Changelog

### Version 2.0 (October 2025)

✅ **Major Improvements**:

- Fixed Docker permission issues
- HuggingFace Spaces compatibility
- Enhanced error handling
- Multiple model-loading fallbacks
- Improved security (non-root execution)

✅ **Bug Fixes**:

- Upload directory write permissions
- Cache directory access
- Model initialization reliability

### Version 1.0 (Initial Release)

- Basic RAG functionality
- PDF and DOCX support
- FAISS vector store
- Conversational memory

---

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## 🙏 Acknowledgments

- **LangChain** for the RAG framework
- **Groq** for high-speed LLM inference
- **HuggingFace** for embeddings and hosting
- **FAISS** for efficient vector search

---

## 📞 Support

- **Issues**: [GitHub Issues](https://github.com/yourusername/cognichat/issues)
- **Discussions**: [GitHub Discussions](https://github.com/yourusername/cognichat/discussions)
- **Email**: riteshraut123321@gmail.com

---
**Made with ❤️ by the CogniChat Team**

[⭐ Star us on GitHub](https://github.com/yourusername/cognichat) • [🐛 Report Bug](https://github.com/yourusername/cognichat/issues) • [✨ Request Feature](https://github.com/yourusername/cognichat/issues)