---
title: IntegraChat
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.20.0"
app_file: app.py
pinned: false
---

# IntegraChat — Enterprise MCP Autonomous Agent Platform

**Track:** MCP in Action  
**Category:** Enterprise  
**Tag:** `mcp-in-action-track-enterprise`

---

## 📋 Table of Contents

- [Overview](#overview)
- [Quick Start](#quick-start)
- [Features](#features)
- [Conversation Memory System](#conversation-memory-system)
- [Role-Based Access Control (RBAC)](#role-based-access-control-rbac)
- [Installation & Setup](#installation--setup)
- [Usage](#usage)
- [API Endpoints](#api-endpoints)
- [Architecture](#architecture)
- [Supabase Setup & Migration](#supabase-setup--migration)
- [Troubleshooting](#troubleshooting)
- [Testing & Diagnostics](#testing--diagnostics)
- [Technical Stack](#technical-stack)
- [License](#license)

---

## Overview

**IntegraChat** is an enterprise-grade, multi-tenant AI platform that demonstrates the full capabilities of the **Model Context Protocol (MCP)** in a production-style environment. Built with enterprise governance and observability in mind, IntegraChat combines autonomous tool-using agents, RAG retrieval, live web search, and admin compliance under strict tenant isolation.

This platform showcases how MCP can power intelligent, governed, multi-tenant AI systems with real-time analytics, regex-based red-flag detection, and comprehensive tool orchestration.

---

## 🚀 Quick Start

### Windows Users
```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Configure environment (copy and edit .env)
copy env.example .env
# Edit .env with your credentials (Supabase, LLM, etc.)

# 3. Start all services
start.bat
```

### Manual Setup
```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Configure environment
cp env.example .env
# Edit .env with your credentials

# 3. Start FastAPI backend (Terminal 1)
uvicorn backend.api.main:app --port 8000 --reload

# 4. Start unified MCP server (Terminal 2)
python backend/mcp_server/server.py

# 5. Start Gradio UI (Terminal 3)
python app.py
```

Then access:
- **Gradio UI**: `http://localhost:7860`
- **FastAPI Docs**: `http://localhost:8000/docs`

> **Security Note:** REST requests that hit protected endpoints must include both `x-tenant-id` and `x-user-role` headers. Roles (`viewer`, `editor`, `admin`, `owner`) determine which actions—such as document ingestion, rule uploads, or analytics access—the caller may perform.

---

## Features

### Core Capabilities

- 🤖 **Autonomous Multi-Step MCP Agents** – Intelligent tool-aware agent that plans and executes multi-step workflows across RAG, Web, Admin, and LLM tools with short-term conversation memory
- 💭 **Short-Term Conversation Memory** – Automatic memory system that stores the last N tool outputs per session with configurable expiration (default: 10 outputs, 15-minute TTL). Memory is keyed by `session_id` (not `tenant_id`) for safety, which improves context awareness in multi-step workflows. Memory is automatically injected into tool payloads and cleared on session end.
- 📚 **Enhanced Knowledge Base Management** – Upload raw text, URLs, or documents (PDF/DOCX/TXT/MD) with rich metadata (source URL, timestamp, document type) and optimized chunking (400-600 tokens)
- 🤖 **AI-Generated KB Metadata** – Automatic extraction of title, summary, tags, topics, date, and quality score during document ingestion. LLM-powered with intelligent fallback when unavailable - uses keyword extraction and pattern matching to provide useful metadata even during timeouts
- 🔍 **Optimized RAG Search with Cross-Encoder Re-ranking** – Two-stage retrieval: an initial vector search, followed by cross-encoder re-ranking of the top candidates using `cross-encoder/ms-marco-MiniLM-L-6-v2` to improve ranking accuracy. Semantic search uses a configurable similarity threshold (default 0.3) for better recall
- ⚡ **Per-Tool Latency Prediction** – Agent estimates expected latency before choosing tools (RAG: 60-120ms, Web: 400-1800ms, Admin: <20ms) to optimize tool selection and choose the fastest path
- 🧠 **Context-Aware MCP Routing** – Intelligent tool selection based on previous outputs: skip web search if RAG returns a high score (≥0.8), skip agent reasoning for critical admin violations, and skip RAG if relevant memory is already available. This avoids redundant tool calls
- 📋 **Tool Output Schemas** – Every tool returns strict JSON type schemas for easier debugging, cleaner reasoning, and more polished responses. Automatic schema validation and formatting
- 🗑️ **Document Management** – Delete individual documents or bulk delete all documents for a tenant with confirmation dialogs
- 🛡️ **Enterprise Admin Governance** – Advanced rule management system with:
  - Regex-based red-flag pattern matching with severity levels (low/medium/high/critical)
  - Automatic admin alerts for violations
  - **LLM-Enhanced Rules**: Rules are automatically analyzed and enhanced to identify edge cases, improve regex patterns, and suggest appropriate severity levels
  - **LLM-Guided Rule Explanations**: Automatic generation of human-readable explanations, concrete examples, and missing pattern suggestions. Includes intelligent fallback when LLM is unavailable - uses keyword extraction to provide useful explanations even during timeouts
  - **File Upload Support**: Upload rules from TXT, PDF, DOC, or DOCX files with drag-and-drop interface
  - **Chunk Processing**: Large rule sets processed in manageable chunks (5 rules at a time) to prevent timeouts
  - **Rule-Based Behavior Control**: Rules checked FIRST - brief response rules return quick answers, blocking rules prevent requests
  - **Comment Filtering**: Comment lines (starting with #) automatically ignored when uploading rules
  - **Supabase Integration**: Rules stored in Supabase for production scalability (with SQLite fallback)
- 📊 **Comprehensive Analytics & Observability** – Full tenant-level analytics logging with Supabase backend (SQLite fallback for local dev):
  - Tool usage breakdown (RAG, Web, Admin, LLM) with latency and token tracking
  - RAG recall/precision indicators (average hits, scores, top scores)
  - Per-tenant query volume and active users
  - Red-flag violations with timestamps and confidence scores
  - LLM token logs and latency metrics
  - **Real-Time Visualizations**: Reasoning path visualizer, tool invocation timeline, and tenant activity heatmap
- 🌐 **Live Web Search** – Google Programmable Search (Custom Search API) with tenant-aware MCP tooling
- 🏢 **Multi-Tenant Isolation** – Complete tenant isolation with centralized tenant ID management; backend enforces strict isolation for chat, ingestion, and admin ops
- 🔐 **Fine-Grained Role-Based Access Control (RBAC)** – Four-tier role system (viewer, editor, admin, owner) with backend permission enforcement
- 🔄 **Intelligent Multi-Tool Orchestration** – MCP agent orchestrator autonomously selects optimal tool chains (RAG + Web + LLM, etc.) based on query intent, context, latency predictions, and previous tool outputs. Context-aware routing enables sophisticated tool skipping for efficiency
- ⚡ **Robust Error Handling** – Structured error responses, retry mechanisms, and graceful fallbacks (e.g., if RAG fails → fallback to LLM-only)
- 📡 **Streaming Responses** – Chat responses stream word-by-word using Server-Sent Events (SSE) for real-time user experience
- 🎯 **Rule-First Processing** – Admin rules checked before intent classification - rules can trigger brief responses or block requests entirely
- 🧠 **Advanced Context Engineering** – Implements Anthropic's context engineering strategies:
  - **High-Fidelity Compaction**: Automatically compresses conversations at 80% token threshold, preserving architectural decisions and unresolved issues
  - **Tool Result Clearing**: Safest form of compaction - removes large tool outputs while keeping metadata
  - **Structured Note-Taking**: Tracks objectives, architectural decisions, and unresolved issues outside context window
  - **XML-Structured Prompts**: All prompts use clear XML sections for better model understanding
  - **Just-in-Time Context Loading**: Selects only relevant memories and tools for each query
  - **Progressive Disclosure**: Agents discover context incrementally through exploration
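
The context-aware routing and latency-prediction bullets above can be illustrated with a minimal sketch. This is not the project's orchestrator: `RoutingContext`, `plan_tools`, and the single-number latency estimates (midpoints of the ranges quoted above) are hypothetical names and values chosen for illustration. Only the 0.8 score threshold and the three skip conditions come from the feature list.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed point estimates (ms), taken as midpoints of the ranges quoted above.
ESTIMATED_LATENCY_MS = {"admin": 20, "rag": 90, "web": 1100}

RAG_HIGH_SCORE = 0.8  # threshold from the routing description


@dataclass
class RoutingContext:
    critical_violation: bool = False        # admin check flagged a critical rule
    memory_is_relevant: bool = False        # session memory already covers the query
    rag_top_score: Optional[float] = None   # best RAG score so far, None if RAG has not run


def plan_tools(ctx: RoutingContext) -> list[str]:
    """Return the remaining tools to call, fastest first, ending with the LLM."""
    if ctx.critical_violation:
        return []  # skip agent reasoning entirely
    tools = []
    # Skip RAG when memory is already relevant or RAG has already run.
    if not ctx.memory_is_relevant and ctx.rag_top_score is None:
        tools.append("rag")
    # Skip web search when RAG already returned a high-confidence hit.
    if ctx.rag_top_score is None or ctx.rag_top_score < RAG_HIGH_SCORE:
        tools.append("web")
    tools.sort(key=ESTIMATED_LATENCY_MS.__getitem__)
    tools.append("llm")
    return tools
```

For example, `plan_tools(RoutingContext(rag_top_score=0.85))` leaves only the LLM step, because the RAG result is strong enough to skip web search.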

### Enterprise Features

- 🔍 **Regex-Based Red-Flag Detection** – Support for complex regex patterns with keyword fallback and semantic scoring
- 🤖 **LLM-Enhanced Rule Management** – Rules automatically enhanced by LLM to identify edge cases, improve patterns, and suggest severity levels. Includes intelligent fallback explanations when LLM is unavailable - uses keyword extraction to generate useful explanations, examples, and pattern suggestions even during timeouts
- 📄 **File Upload & Drag-and-Drop** – Upload rules from files (TXT, PDF, DOC, DOCX) with intuitive drag-and-drop interface
- ⚡ **Chunk-Wise Processing** – Large rule sets processed in chunks to prevent timeouts and ensure reliable processing
- 📈 **Real-Time Analytics Dashboard** – Per-tenant analytics with configurable time windows (7, 30, 90 days)
- 🛠️ **Admin API Endpoints** – `/admin/violations`, `/admin/tools/logs`, `/admin/tenants` for comprehensive governance
- 🧠 **Agent Debug & Planning** – `/agent/debug` and `/agent/plan` endpoints for observability and tool selection inspection
- 💾 **Persistent Analytics Storage** – Supabase-backed analytics store (with automatic SQLite fallback) for fast, multi-tenant queries
- 🗄️ **Supabase Integration** – Production-ready Supabase support for admin rules with automatic table creation
- 📈 **Real-Time Visualization Components** – Interactive visualizations for agent reasoning, tool execution, and tenant activity:
  - **Reasoning Path Visualizer**: Step-by-step visualization of agent decision-making with animated progression
  - **Tool Invocation Timeline**: Visual timeline showing tool execution order, latency, and result counts
  - **Tenant Activity Heatmap**: Query activity heatmap and per-tool usage trends over time
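
As a rough illustration of regex-based red-flag detection with keyword fallback and severity levels, the sketch below matches a message against a list of rules. The `Rule` class, `rule_matches`, and `worst_violation` are hypothetical names, and the keyword heuristic (all words longer than three characters must appear) is an assumption, not the project's actual matching logic. Semantic scoring is left out.

```python
import re
from dataclasses import dataclass
from typing import Optional

SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass
class Rule:
    text: str                      # human-readable rule
    pattern: Optional[str] = None  # optional regex
    severity: str = "medium"


def rule_matches(rule: Rule, message: str) -> bool:
    """Try the regex first; fall back to keyword matching if it is missing or invalid."""
    if rule.pattern:
        try:
            return re.search(rule.pattern, message, re.IGNORECASE) is not None
        except re.error:
            pass  # invalid regex: fall through to keywords
    lowered = message.lower()
    keywords = [w for w in re.findall(r"[a-z0-9]+", rule.text.lower()) if len(w) > 3]
    return bool(keywords) and all(k in lowered for k in keywords)


def worst_violation(rules: list[Rule], message: str) -> Optional[Rule]:
    """Return the matching rule with the highest severity, or None."""
    hits = [r for r in rules if rule_matches(r, message)]
    return max(hits, key=lambda r: SEVERITY_ORDER.index(r.severity), default=None)
```

Returning only the most severe hit keeps the caller's decision simple (block, answer briefly, or continue); a real implementation would likely log every hit for the violations endpoint.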

### Conversation Memory System

IntegraChat includes a **short-term conversation memory** system that enhances multi-step workflows by maintaining context across tool calls:

- **Automatic Storage**: Every tool output is automatically stored in memory for the session
- **Bounded Size**: Keeps only the last N tool outputs (configurable via `MCP_MEMORY_MAX_ITEMS`, default: 10)
- **Auto-Expiration**: Entries automatically expire after a configurable TTL (via `MCP_MEMORY_TTL_SECONDS`, default: 900 seconds / 15 minutes)
- **Session-Based**: Memory is keyed by `session_id` (not `tenant_id`) for safety and isolation
- **Automatic Injection**: Recent memory is automatically injected into tool payloads as a `memory` field for multi-step workflows
- **Session Clearing**: Memory can be explicitly cleared by sending `end_session: true` or `endSession: true` in the payload

**Usage Example:**
```json
{
  "tenant_id": "acme",
  "session_id": "chat-abc-123",
  "query": "Search for X"
}
```

Subsequent tool calls with the same `session_id` will receive a `memory` field containing recent tool outputs, enabling tools to make context-aware decisions in multi-step workflows.

**Configuration:**
- `MCP_MEMORY_MAX_ITEMS`: Maximum number of tool outputs to keep per session (default: 10)
- `MCP_MEMORY_TTL_SECONDS`: Time-to-live for memory entries in seconds (default: 900)
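
The behaviour described above (bounded size, TTL expiry, session keying, injection, and explicit clearing) can be sketched as follows. `SessionMemory` and `inject_memory` are hypothetical names used for illustration; the real implementation may differ. The defaults mirror `MCP_MEMORY_MAX_ITEMS` and `MCP_MEMORY_TTL_SECONDS`.

```python
import time
from collections import deque


class SessionMemory:
    """Keeps the last N tool outputs per session and hides expired entries."""

    def __init__(self, max_items=10, ttl_seconds=900.0, clock=time.monotonic):
        self.max_items = max_items
        self.ttl_seconds = ttl_seconds
        self._clock = clock   # injectable so expiry can be tested without sleeping
        self._sessions = {}   # session_id -> deque of (timestamp, output)

    def add(self, session_id, output):
        entries = self._sessions.setdefault(session_id, deque(maxlen=self.max_items))
        entries.append((self._clock(), output))

    def recent(self, session_id):
        cutoff = self._clock() - self.ttl_seconds
        return [out for ts, out in self._sessions.get(session_id, ()) if ts >= cutoff]

    def clear(self, session_id):
        self._sessions.pop(session_id, None)


def inject_memory(memory, payload):
    """Return a copy of the payload with a `memory` field; clear on session end."""
    session_id = payload["session_id"]
    enriched = {**payload, "memory": memory.recent(session_id)}
    if payload.get("end_session") or payload.get("endSession"):
        memory.clear(session_id)
    return enriched
```

Using `deque(maxlen=...)` drops the oldest output automatically once the limit is reached, so no separate eviction pass is needed for the size bound.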

---

## Role-Based Access Control (RBAC)

IntegraChat implements fine-grained role-based access control (RBAC) for backend API endpoints. This ensures that users can only access features appropriate for their role level.

### Roles

The system supports four roles with increasing privileges:

1. **viewer** (default) - Basic read-only access
   - ✅ Can use chat functionality
   - ❌ Cannot ingest documents
   - ❌ Cannot delete documents
   - ❌ Cannot view analytics
   - ❌ Cannot manage admin rules

2. **editor** - Content management access
   - ✅ Can use chat functionality
   - ✅ Can ingest documents (upload, paste, URLs, files)
   - ❌ Cannot delete documents
   - ❌ Cannot view analytics
   - ❌ Cannot manage admin rules

3. **admin** - Administrative access
   - ✅ Can use chat functionality
   - ✅ Can ingest documents
   - ✅ Can delete documents
   - ✅ Can view analytics
   - ✅ Can manage admin rules

4. **owner** - Full system access
   - Same permissions as admin (highest privilege level)

### Permission Matrix

| Action | viewer | editor | admin | owner |
|--------|--------|--------|-------|-------|
| Chat Bot | ✅ | ✅ | ✅ | ✅ |
| Ingest Documents | ❌ | ✅ | ✅ | ✅ |
| Delete Documents | ❌ | ❌ | ✅ | ✅ |
| View Analytics | ✅ | ✅ | ✅ | ✅ |
| Manage Rules | ❌ | ❌ | ✅ | ✅ |

### Backend RBAC

Backend API endpoints enforce RBAC through the `x-user-role` header:

```python
# Permission matrix in backend/mcp_server/common/access_control.py
PERMISSIONS = {
    "manage_rules": {"owner", "admin"},
    "ingest_documents": {"owner", "admin", "editor"},
    "delete_documents": {"owner", "admin"},
    "view_analytics": {"owner", "admin"},
}
```

**Protected Endpoints:**
- `/admin/rules` - Requires `admin` or `owner` role
- `/rag/ingest*` - Requires `editor`, `admin`, or `owner` role
- `/rag/delete*` - Requires `admin` or `owner` role
- `/analytics/*` - All roles can view (viewer, editor, admin, owner)

**Role Propagation:**
The user role is automatically propagated through the entire request pipeline:
1. Client sends `x-user-role` header
2. Backend API route receives and validates role
3. Role is passed to service layer (`process_ingestion()`, etc.)
4. Service layer passes role to MCP clients
5. MCP clients include role in payload to MCP server
6. MCP server extracts role and enforces permissions

**Example Request:**
```bash
curl -X POST "http://localhost:8000/admin/rules" \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: tenant123" \
  -H "x-user-role: admin" \
  -d '{"rule": "Do not share passwords"}'
```

If the role lacks permission, the API returns `403 Forbidden` with a descriptive error message that includes:
- Which role was used
- Which roles are allowed for the action
- Instructions to change role in the UI
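
A permission check along these lines could look like the sketch below. The `PERMISSIONS` mapping is copied from the block above; `PermissionDenied` and `require_permission` are hypothetical names, and treating a missing role as `viewer` is an assumption based on `viewer` being described as the default role.

```python
PERMISSIONS = {
    "manage_rules": {"owner", "admin"},
    "ingest_documents": {"owner", "admin", "editor"},
    "delete_documents": {"owner", "admin"},
    "view_analytics": {"owner", "admin"},
}


class PermissionDenied(Exception):
    """Raised when a role may not perform an action; maps to HTTP 403."""

    status_code = 403


def require_permission(role, action):
    """Return the normalized role, or raise PermissionDenied with a descriptive message."""
    allowed = PERMISSIONS[action]
    normalized = (role or "viewer").strip().lower()  # assumed default role
    if normalized not in allowed:
        raise PermissionDenied(
            f"Role '{normalized}' cannot perform '{action}'. "
            f"Allowed roles: {', '.join(sorted(allowed))}. "
            "Change your role in the UI and retry."
        )
    return normalized
```

In a FastAPI route, the exception would typically be translated into an `HTTPException` with status 403 and the same message.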

### Using RBAC

1. **Set Role**: Include `x-user-role` header in API requests with one of: `viewer`, `editor`, `admin`, or `owner`
2. **Verify Permissions**: Backend enforces role-based access automatically
3. **Error Handling**: API returns `403 Forbidden` with clear error messages when role lacks required permissions

---

## Real-Time Visualization Features

IntegraChat includes three powerful visualization components that provide real-time insights into agent behavior and system activity:

### 1. Reasoning Path Visualizer
- **What it shows**: Step-by-step visualization of how the agent makes decisions
- **Features**:
  - Animated progression through reasoning steps
  - Status indicators (pending, running, completed, error)
  - Detailed metrics per step (latency, hit counts, token estimates)
  - Visual icons for each step type
- **Where to find it**: 
  - Gradio app: Debug & Reasoning tab
- **Data source**: `reasoning_trace` from agent responses

### 2. Tool Invocation Timeline
- **What it shows**: Visual timeline of all tool executions during an agent interaction
- **Features**:
  - Color-coded bars showing tool status (success/error)
  - Latency visualization per tool
  - Result count badges
  - Summary statistics (total tools, total time, average latency)
- **Where to find it**: 
  - Gradio app: Debug & Reasoning tab
- **Data source**: `tool_traces` from agent responses

### 3. Tenant Activity Heatmap
- **What it shows**: Query activity patterns and tool usage trends over time
- **Features**:
  - Hour-by-hour, day-by-day activity heatmap
  - Color intensity based on activity level
  - Per-tool usage trends with bar charts
  - Trend indicators (up/down/stable)
- **Where to find it**: 
  - Gradio app: Admin Analytics tab
  - Configurable time window (default: 7 days)
- **Data source**: `/analytics/activity` and `/analytics/tool-usage` endpoints

**Access**: All visualization features are available to all roles (viewer, editor, admin, owner).
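
To make the heatmap idea concrete, here is a minimal sketch of bucketing query timestamps into day-by-hour cells. It assumes the events arrive as ISO 8601 strings; the actual response shape of `/analytics/activity` is not documented here, and `activity_heatmap` and `intensity` are hypothetical helpers.

```python
from collections import Counter
from datetime import datetime


def activity_heatmap(timestamps):
    """Count events per (weekday, hour) cell, where weekday 0 is Monday."""
    counts = Counter()
    for ts in timestamps:
        dt = datetime.fromisoformat(ts)
        counts[(dt.weekday(), dt.hour)] += 1
    return counts


def intensity(count, max_count):
    """Scale a cell count to the 0..1 range used for colour intensity."""
    return 0.0 if max_count == 0 else count / max_count
```

A renderer would then look up each of the 7 x 24 cells in the counter (missing cells count as zero) and shade them by `intensity`.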

---

## Installation & Setup

### Prerequisites

- **Python 3.10+** with pip
- **PostgreSQL** (with pgvector extension) or **Supabase** for RAG storage
- **Supabase** (recommended) or SQLite for admin rules and analytics
- **Ollama** (local) or **Groq API** credentials for LLM
- **Google Custom Search API** (optional, for web search):
  - Enable Custom Search API in [Google Cloud Console](https://console.cloud.google.com/)
  - Create API key → set as `GOOGLE_SEARCH_API_KEY` in `.env`
  - Create Programmable Search Engine → set ID as `GOOGLE_SEARCH_CX_ID` in `.env`

### Step-by-Step Installation

1. **Clone and navigate to the project**:
   ```bash
   cd IntegraChat
   ```

2. **Create and activate virtual environment** (recommended):
   ```bash
   # Windows
   python -m venv venv
   venv\Scripts\activate
   
   # Linux/Mac
   python3 -m venv venv
   source venv/bin/activate
   ```

3. **Install Python dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

4. **Configure environment variables**:
   ```bash
   cp env.example .env
   # Edit .env with your credentials:
   # - SUPABASE_URL and SUPABASE_SERVICE_KEY (for production storage)
   # - POSTGRESQL_URL (for RAG vector database)
   # - OLLAMA_URL/OLLAMA_MODEL or GROQ_API_KEY (for LLM)
   # - GOOGLE_SEARCH_API_KEY and GOOGLE_SEARCH_CX_ID (optional, for web search)
   ```

5. **Set up Supabase** (recommended for production):
   - Create a Supabase project at [supabase.com](https://supabase.com)
   - Run `supabase_admin_rules_table.sql` in Supabase SQL Editor
   - Run `supabase_analytics_tables.sql` in Supabase SQL Editor
   - Copy your project URL and service role key to `.env`
   - Verify setup: `python verify_supabase_setup.py`

6. **Start the services**:

   **Option A: Windows Quick Start** (recommended for Windows):
   ```bash
   start.bat
   ```
   This automatically starts:
   - FastAPI backend on port 8000
   - Unified MCP server on port 8900

   **Option B: Manual Start**:
   ```bash
   # Terminal 1: FastAPI backend
   uvicorn backend.api.main:app --port 8000 --reload
   
   # Terminal 2: Unified MCP server
   python backend/mcp_server/server.py
   ```

7. **Launch the UI**:

   **Gradio Interface** (full-featured):
   ```bash
   python app.py
   ```
   Access at `http://localhost:7860`


## Usage

### Gradio Interface (`app.py`)

The Gradio UI provides a comprehensive interface with five main tabs:

#### 1. **Chat** 💬
- Enter your Tenant ID and start chatting with the MCP-powered agent
- Real-time streaming responses (word-by-word using SSE)
- Autonomous tool orchestration (RAG, Web, Admin, LLM)
- Multi-step planning with memory of previous tool outputs

#### 2. **Document Ingestion** 📚
- **Raw Text**: Paste text directly
- **URL**: Ingest content from web URLs
- **File Upload**: Upload PDF, DOCX, TXT, or Markdown files
- Rich metadata support (filename, URL, document ID, custom JSON)
- View and manage ingested documents

#### 3. **Knowledge Base Library** 📖
- **Statistics Dashboard**: Visual cards showing document counts by type
- **Interactive Charts**: Plotly pie chart for document type distribution
- **Semantic Search**: Search knowledge base with relevance scoring
- **Type Filtering**: Filter by document type (text, PDF, FAQ, link)
- **Document Management**: View, preview, and delete documents
- **Auto-refresh**: Lists update automatically after operations

#### 4. **Admin Analytics** 📊
- **Statistics Cards**: Total queries, active users, red flags, RAG searches
- **Interactive Bar Charts**: 
  - Tool Usage Count (RAG, Web, Admin, LLM)
  - Average Tool Latency (performance metrics)
  - RAG Quality Metrics (hits, scores, recall indicators)
- **Tool Usage Table**: Detailed performance breakdown
- **Formatted Summary**: Key metrics in easy-to-read format
- Click "🔄 Fetch Analytics Snapshot" to load latest data

#### 5. **Admin Rules & Compliance** 🛡️
- **Text Input**: Paste rules one per line (comments starting with # are ignored)
- **File Upload**: Upload rules from TXT, PDF, DOC, or DOCX files
- **LLM Enhancement**: Automatic rule enhancement (edge cases, pattern improvements, severity suggestions)
- **Chunk Processing**: Large rule sets processed in chunks (5 at a time)
- **Rule-Based Behavior**: Rules checked FIRST - brief responses or blocking based on severity
- **Streaming Responses**: Real-time word-by-word streaming
- **Refresh Button**: Update rules table directly
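
The comment filtering and chunked processing described in this tab can be sketched in a few lines. `parse_rules` and `chunked` are hypothetical helper names; only the `#` comment convention and the chunk size of 5 come from the description above.

```python
def parse_rules(text):
    """Return one rule per non-empty line, ignoring lines that start with '#'."""
    rules = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            rules.append(stripped)
    return rules


def chunked(items, size=5):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each chunk can then be sent to the enhancement step separately, so one slow chunk does not time out the whole upload.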

> **💡 Tip:** Every action requires a Tenant ID. The Tenant ID persists across page refreshes and is managed centrally.


---

## API Endpoints

All endpoints are served by the FastAPI backend at `http://localhost:8000`. Most endpoints require the `x-tenant-id` header for tenant isolation.

> **📖 API Documentation**: Interactive Swagger docs available at `http://localhost:8000/docs` when the backend is running.

### Agent Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| `POST` | `/agent/message` | Main chat endpoint with `tenant_id`, `message`, optional history |
| `POST` | `/agent/message/stream` | Streaming chat endpoint using Server-Sent Events (SSE). Returns tokens word-by-word |
| `POST` | `/agent/debug` | Detailed debugging info: reasoning trace, tool selection, intent classification |
| `POST` | `/agent/plan` | Tool selection plan without execution (intent, tool scores, planned steps) |
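
For clients consuming `/agent/message/stream`, a small parser for the Server-Sent Events wire format may help. The sketch below handles only `data:` fields and blank-line event boundaries, following the standard SSE framing; what each `data` payload contains (plain text or JSON) is not specified in this README, so the parser returns it as a raw string. `iter_sse_data` is a hypothetical helper.

```python
def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # the format allows one optional leading space
            buffer.append(value)
        elif line == "" and buffer:
            yield "\n".join(buffer)  # a blank line ends the event
            buffer = []
        # other fields (event:, id:, retry:) and ':' comments are ignored here
    if buffer:
        yield "\n".join(buffer)  # lenient: flush a trailing event without a blank line
```

The function accepts any iterable of lines, so it can be fed from a streamed HTTP response body or from a list in tests.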

### RAG Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| `POST` | `/rag/ingest-document` | Ingest document with `source_type`, `content`, metadata. Supports raw text, URLs, PDFs, DOCX, TXT, Markdown |
| `POST` | `/rag/ingest-file` | Multipart file upload (PDF/DOCX/TXT/MD) with `x-tenant-id` header |
| `GET` | `/rag/list?tenant_id={id}&limit={n}&offset={n}` | List all documents for a tenant with pagination |
| `DELETE` | `/rag/delete/{document_id}?tenant_id={id}` | Delete a specific document by ID |
| `DELETE` | `/rag/delete-all?tenant_id={id}` | Delete all documents for a tenant |

**Note:** RAG endpoints support both `x-tenant-id` header and `tenant_id` query parameter.

### Admin & Governance Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| `GET` | `/admin/rules?detailed=true` | Get all rules (use `detailed=true` for regex/severity metadata) |
| `POST` | `/admin/rules?enhance=true` | Add single rule with optional `pattern` (regex), `severity`, `description`. Set `enhance=true` for LLM enhancement |
| `POST` | `/admin/rules/bulk?enhance=true` | Add multiple rules at once (processed in chunks of 5). LLM enhancement applied automatically |
| `POST` | `/admin/rules/upload-file?enhance=true` | Upload rules from file (TXT, PDF, DOC, DOCX). Text extracted server-side |
| `DELETE` | `/admin/rules/{rule}` | Delete a specific rule |
| `GET` | `/admin/violations?days=30&limit=50` | Get red-flag violations with timestamps and confidence scores |
| `GET` | `/admin/tools/logs?tool_name=rag&days=7` | Get detailed tool usage logs with latency and token counts |
| `GET/POST/DELETE` | `/admin/tenants` | Tenant management endpoints |
| `POST` | `/admin/setup/table` | Create admin_rules table in Supabase if it doesn't exist |

### Analytics Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| `GET` | `/analytics/overview?days=30` | Comprehensive analytics: total queries, tool usage, red-flag count, RAG quality |
| `GET` | `/analytics/tool-usage?days=30` | Detailed tool usage stats: counts, latency, tokens, success/error rates |
| `GET` | `/analytics/redflags?limit=50&days=30` | Recent red-flag violations for tenant |
| `GET` | `/analytics/activity?days=30` | Tenant activity summary: queries, active users, last query timestamp |
| `GET` | `/analytics/rag-quality?days=30` | RAG quality metrics: avg hits, scores, latency (recall/precision indicators) |

### Visualization Features

IntegraChat includes three powerful visualization components that provide real-time insights into agent behavior and system activity:

#### 1. Real-Time Reasoning Visualizer
- **Location**: Debug tab (Gradio app)
- **Features**:
  - Step-by-step visualization of agent reasoning path
  - Animated progression through reasoning steps
  - Status indicators (pending, running, completed, error)
  - Detailed metrics per step (latency, hit counts, token estimates)
  - Visual icons for each step type (admin rules check, RAG prefetch, tool selection, etc.)
- **Data Source**: `reasoning_trace` from `/agent/message` or `/agent/debug` endpoints
- **Usage**: Automatically appears in chat panel when agent responses include reasoning traces

#### 2. Tool Invocation Timeline
- **Location**: Debug tab (Gradio app)
- **Features**:
  - Visual timeline showing tool execution order
  - Color-coded bars indicating tool status (success/error)
  - Latency visualization per tool
  - Result count badges
  - Summary statistics (total tools, total time, average latency)
- **Data Source**: `tool_traces` from `/agent/message` or `/agent/debug` endpoints
- **Usage**: Automatically appears in chat panel when agent responses include tool traces

#### 3. Live Tenant Heatmap
- **Location**: Analytics page (`/analytics`)
- **Features**:
  - Query activity heatmap (hour-by-hour, day-by-day visualization)
  - Color intensity based on activity level
  - Per-tool usage trends with bar charts
  - Trend indicators (up/down/stable)
  - Configurable time window (default: 7 days)
- **Data Source**: `/analytics/activity` and `/analytics/tool-usage` endpoints
- **Usage**: Navigate to Analytics page to view tenant activity patterns

**Access**: All visualization features are available to all roles (viewer, editor, admin, owner).

### Request Headers

Most endpoints require:
- `x-tenant-id`: Tenant identifier for multi-tenant isolation
- `x-user-role`: Caller role for RBAC enforcement (`viewer`, `editor`, `admin`, or `owner`)
  - **Important**: Role must be passed through the entire pipeline (UI → API → RAG Client → MCP Server)
  - Role is automatically propagated from the API request to backend API, then to RAG client, and finally to MCP server for permission checks
  - If ingestion fails with permission errors, verify the role is set correctly in the UI and check backend logs for role propagation debug messages
- `Content-Type: application/json`: For POST requests with JSON payloads

### Example Request

```bash
curl -X POST http://localhost:8000/agent/message \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: tenant123" \
  -d '{
    "message": "What is our refund policy?",
    "tenant_id": "tenant123"
  }'
```
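
The same request can be built from Python using only the standard library. This is a sketch: `build_agent_request` is a hypothetical helper, and the optional `role` argument simply adds the `x-user-role` header described above.

```python
import json
import urllib.request


def build_agent_request(message, tenant_id, base_url="http://localhost:8000", role=None):
    """Build (but do not send) a POST request for /agent/message."""
    headers = {"Content-Type": "application/json", "x-tenant-id": tenant_id}
    if role:
        headers["x-user-role"] = role
    body = json.dumps({"message": message, "tenant_id": tenant_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/agent/message", data=body, headers=headers, method="POST"
    )
```

With the backend running, the request can be sent with `urllib.request.urlopen(build_agent_request("What is our refund policy?", "tenant123"))` and the response body decoded as JSON.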

---

## Architecture

### System Overview

IntegraChat follows a modular architecture with clear separation of concerns:

```
┌─────────────────┐
│   Frontend UI   │  (Gradio)
│    Port 7860    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  FastAPI Backend│  (API Gateway)
│    Port 8000    │
└────────┬────────┘
         │
         ├──► Unified MCP Server (Port 8900)
         │    ├── RAG Tools (search, ingest, list, delete)
         │    ├── Web Tools (search)
         │    └── Admin Tools (rules, violations)
         │
         ├──► PostgreSQL/Supabase (RAG Vector Store)
         ├──► Supabase/SQLite (Rules & Analytics)
         └──► LLM Backend (Ollama/Groq)
```

### Enterprise-Grade Features

1. **Autonomous Multi-Step Planning**: LLM-powered planning determines optimal tool sequences with short-term conversation memory that stores and injects previous tool outputs into subsequent tool calls for better context awareness.

2. **Regex-Based Governance**: Admin rules support regex patterns with fallback to keyword matching and semantic similarity scoring for flexible policy enforcement.

3. **Comprehensive Analytics**: All tool usage, RAG searches, LLM calls, and red-flag violations are logged with indexed queries for fast analytics retrieval.

4. **Enhanced RAG Pipeline**: Documents chunked optimally (400-600 tokens) and enriched with metadata (source URL, timestamp, document type) for better retrieval.

5. **Structured Error Handling**: All errors logged with context, with graceful fallbacks (e.g., RAG fails → LLM-only, web fails → skip web).
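
The 400-600 token chunking mentioned in item 4 can be approximated as shown below. This sketch counts whitespace-separated words as a stand-in for model tokens, which is an assumption; the real pipeline may use a tokenizer and respect sentence boundaries. `chunk_tokens` and `chunk_text` are hypothetical names.

```python
def chunk_tokens(tokens, min_size=400, max_size=600):
    """Split a token list into chunks near the midpoint of [min_size, max_size]."""
    target = (min_size + max_size) // 2
    chunks = [tokens[i:i + target] for i in range(0, len(tokens), target)]
    # Fold a short tail into the previous chunk when the result still fits.
    if (
        len(chunks) > 1
        and len(chunks[-1]) < min_size
        and len(chunks[-2]) + len(chunks[-1]) <= max_size
    ):
        tail = chunks.pop()
        chunks[-1] = chunks[-1] + tail
    return chunks


def chunk_text(text, min_size=400, max_size=600):
    """Chunk text using whitespace-separated words as approximate tokens."""
    return [" ".join(c) for c in chunk_tokens(text.split(), min_size, max_size)]
```

A tail that is too large to merge is kept as its own, shorter chunk rather than exceeding the upper bound.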

### Data Storage Architecture

IntegraChat uses **dual-backend storage** with automatic fallback for production flexibility:

#### Supabase (Production/Preferred)

**When to use:** Production deployments, multi-user environments, scalable applications

**Storage:**
- `admin_rules` - Admin rules with regex patterns and severity levels
- `tool_usage_events` - Tool invocation logs with latency and token tracking
- `redflag_violations` - Red-flag violation events with timestamps
- `rag_search_events` - RAG search metrics and quality indicators
- `agent_query_events` - Agent query logs and analytics

**Features:**
- Row Level Security (RLS) for multi-tenant isolation
- Automatic backups and scaling
- Real-time capabilities
- Production-ready infrastructure

**Setup:** Configure `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` in `.env`

#### SQLite (Development Fallback)

**When to use:** Local development, testing, single-user scenarios

**Storage:**
- `data/admin_rules.db` - Admin rules (local file)
- `data/analytics.db` - Analytics events (local file)

**Features:**
- Zero configuration required
- Perfect for local development
- Automatic fallback when Supabase not configured

**Migration:** To migrate existing SQLite data to Supabase, refer to Supabase documentation for data migration strategies.

---

## Supabase Setup & Migration

IntegraChat supports Supabase for production-ready storage of admin rules and analytics. Both `RulesStore` and `AnalyticsStore` automatically detect and use Supabase when credentials are available, falling back to SQLite for local development.
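
The detection logic can be pictured with the following sketch. It is an assumption about how the choice is made, based only on the two environment variables named in this README; `select_storage_backend` is a hypothetical function, not part of `RulesStore` or `AnalyticsStore`.

```python
import os


def select_storage_backend(env=None):
    """Return "supabase" when both credentials are set, otherwise "sqlite"."""
    env = os.environ if env is None else env
    url = (env.get("SUPABASE_URL") or "").strip()
    key = (env.get("SUPABASE_SERVICE_KEY") or "").strip()
    return "supabase" if url and key else "sqlite"
```

Requiring both values avoids a half-configured state in which the URL is set but requests fail for lack of a key.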

### Quick Setup

1. **Create Supabase tables**:
   - Run `supabase_admin_rules_table.sql` in Supabase SQL Editor
   - Run `supabase_analytics_tables.sql` in Supabase SQL Editor

2. **Configure environment variables** in `.env`:
   ```env
   SUPABASE_URL=https://your-project-id.supabase.co
   SUPABASE_SERVICE_KEY=your_service_role_key_here
   ```

3. **Verify setup**: Check that your Supabase project is accessible and tables are created correctly.

---

## Troubleshooting

### Common Issues

#### Backend Not Starting
- **Issue**: FastAPI backend fails to start
- **Solution**: 
  - Check if port 8000 is already in use: `netstat -ano | findstr :8000` (Windows) or `lsof -i :8000` (Linux/Mac)
  - Verify Python virtual environment is activated
  - Check `.env` file exists and has required variables
  - Review error logs for missing dependencies
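
As a cross-platform alternative to `netstat` and `lsof`, the port check can also be done from Python. This is a minimal sketch; `port_in_use` is a hypothetical helper that only tests whether something accepts TCP connections on the given local port.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0
```

For example, `port_in_use(8000)` returning `True` before the backend is started suggests another process already holds the port.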

#### MCP Server Connection Errors
- **Issue**: "Could not connect to MCP server" errors
- **Solution**:
  - Ensure unified MCP server is running: `python backend/mcp_server/server.py`
  - Check MCP server is on port 8900 (default)
  - Verify `MCP_SERVER_ID` in `.env` matches server configuration
  - Check firewall settings if running on different machines

#### RAG Search Not Returning Results
- **Issue**: RAG searches return no results despite ingested documents
- **Solution**:
  - Check similarity threshold (default 0.3) - try lowering to 0.2 or 0.1
  - Verify documents exist: `GET /rag/list?tenant_id={id}`
  - Ensure tenant_id matches between ingestion and search
  - Check PostgreSQL/pgvector connection and vector extension
  - Review MCP server logs for search metrics

#### Supabase Configuration Issues
- **Issue**: Data still going to SQLite instead of Supabase
- **Solution**:
  - Verify `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` in `.env` (no quotes, no spaces)
  - Use **service_role** key (not anon key) from Supabase Dashboard
  - Ensure tables exist: run SQL scripts in Supabase SQL Editor
  - Check FastAPI startup logs for backend detection messages
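
The "no quotes, no spaces" check can be automated. This is a rough sketch under the assumption that `.env` uses plain `KEY=VALUE` lines; `find_env_problems` is a hypothetical helper, and the rules it applies are the ones stated in this section rather than the behavior of any particular dotenv loader.

```python
from pathlib import Path

REQUIRED = ("SUPABASE_URL", "SUPABASE_SERVICE_KEY")


def find_env_problems(text: str, required=REQUIRED) -> list[str]:
    """Report missing, empty, quoted, or whitespace-padded entries."""
    raw = {}
    for line in text.splitlines():
        # Skip blank lines, comments, and lines without an assignment.
        if not line.strip() or line.lstrip().startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        raw[key.strip()] = (key, value)

    problems = []
    for name in required:
        if name not in raw:
            problems.append(f"{name}: missing")
            continue
        key, value = raw[name]
        if key != key.strip() or value != value.strip():
            problems.append(f"{name}: stray whitespace")
        value = value.strip()
        if not value:
            problems.append(f"{name}: empty value")
        elif value[0] in "'\"" or value[-1] in "'\"":
            problems.append(f"{name}: value is quoted")
    return problems


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        for problem in find_env_problems(env_file.read_text()) or ["no problems found"]:
            print(problem)
    else:
        print(".env not found in the current directory")
```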

#### LLM Connection Errors
- **Issue**: Agent responses fail with LLM errors
- **Solution**:
  - For Ollama: Ensure Ollama is running (`ollama serve`)
  - Check `OLLAMA_URL` and `OLLAMA_MODEL` in `.env`
  - For Groq: Verify `GROQ_API_KEY` is set correctly
  - Check `LLM_BACKEND` setting (ollama or groq)
  - Test LLM connection: `curl http://localhost:11434/api/tags` (Ollama)

#### Document Ingestion Failures
- **Issue**: File uploads or document ingestion fails
- **Solution**:
  - Check file size limits (default may be 10MB)
  - Verify file format is supported (PDF, DOCX, TXT, MD)
  - Ensure tenant_id is provided in request
  - **Check user role**: Ingestion requires `editor`, `admin`, or `owner` role. If you see "Permission Denied (403)", change your role in the UI dropdown (top right) from "viewer" to "editor", "admin", or "owner"
  - Verify `x-user-role` header is being sent correctly (check backend logs for debug messages)
  - Check backend logs for specific error messages
  - Verify PostgreSQL connection for RAG storage

#### Document Display Issues
- **Issue**: Document list shows `[object Object]` instead of document details
- **Solution**: This has been fixed. Documents now display properly with:
  - Document ID (number)
  - Document Type (text, pdf, faq, link)
  - Preview (first 200 characters)
  - Length (character count)
  - Created date
- **If still seeing issues**: Refresh the Knowledge Base Library tab

#### Rule Addition Timeouts
- **Issue**: "Chunk 1/1 timed out after 45s" when adding rules
- **Solution**:
  - **Quick Fix**: Uncheck the "Enable LLM Enhancement" checkbox before adding rules - rules will be added immediately without LLM processing
  - **With Enhancement**: Keep the checkbox checked and allow time for processing - enhancement can run up to the 180s timeout for a chunk of 5 rules (typically around 30s per rule)
  - **Best Practice**: Add rules in smaller batches (1-3 rules at a time) when using enhancement
- **Note**: Enhancement is optional - you can always add rules quickly without it, then enhance them later if needed

#### Rule Deletion Issues
- **Issue**: "404 Not Found" when trying to delete a rule
- **Solution**: You can now delete rules in two ways:
  - **By Number**: Enter the rule number (e.g., "1", "2", "3") as shown in the rules table
  - **By Text**: Enter the exact rule text as displayed in the rules table
- **If rule not found**: Make sure you're entering the exact text or a valid rule number. Refresh the rules table to see current rules.
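
The two lookup modes can be pictured as follows. This is an illustrative sketch only: `resolve_rule` is a hypothetical name, and it assumes rule numbers are 1-based positions in the rules table and that text matching is exact.

```python
from typing import Optional


def resolve_rule(rules: list[str], user_input: str) -> Optional[str]:
    """Return the rule selected by number or by exact text, or None if not found."""
    entry = user_input.strip()
    if entry.isdigit():
        # Numeric input is treated as a 1-based position in the table.
        index = int(entry) - 1
        return rules[index] if 0 <= index < len(rules) else None
    # Anything else must match a rule's text exactly.
    return entry if entry in rules else None


rules = ["Block password sharing", "Block SSN disclosure"]
print(resolve_rule(rules, "2"))
print(resolve_rule(rules, "Block password sharing"))
print(resolve_rule(rules, "3"))
```

A `None` result corresponds to the "404 Not Found" case described above.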

#### Tenant Isolation Issues
- **Issue**: Documents or data leaking between tenants
- **Solution**:
  - Check database queries include `WHERE tenant_id = ...` filters
  - Verify tenant ID normalization is working correctly
  - Review database logs for queries that run without a tenant filter

### Getting Help

1. **Check Logs**: Review FastAPI and MCP server logs for detailed error messages
2. **Run Diagnostics**: Use helper scripts in the Testing & Diagnostics section
3. **Verify Configuration**: Check `.env` file and Supabase connection
4. **Review Documentation**: See `backend/README.md` for backend-specific issues

---

## Testing & Diagnostics

You can test the system by:

- **API Testing**: Use the FastAPI interactive docs at `http://localhost:8000/docs` to test endpoints
- **Database Inspection**: Connect directly to your PostgreSQL/Supabase instance to verify tenant isolation
- **Log Monitoring**: Check FastAPI and MCP server logs for detailed error messages and debugging information

> **Tip:** Ensure the Python virtual environment is active (`source venv/bin/activate` or `.\venv\Scripts\activate`) and that `.env` contains the MCP server URLs/LLM settings.

### Helper Scripts

- End-to-end validation script  
  - ✅ **Prerequisites:** FastAPI backend plus all MCP servers (RAG/Web/Admin) running locally.  
  - ✅ **What it checks:**  
    1. Direct database writes via the analytics and rules stores  
    2. CRUD over the `/admin/*` and `/analytics/*` endpoints  
    3. RAG ingestion and isolation by issuing queries as multiple tenants and ensuring secrets never leak across IDs  
  - ✅ **Pass criteria:** At least 80% of the sub-tests succeed (the RAG isolation test must pass for overall success).

- `python check_rag_database.py`  
  Provides a low-level inspection of the RAG datastore. It connects straight to the pgvector/Postgres instance, lists all tenant IDs, prints sample chunks, and runs `search_vectors()` directly to ensure the SQL `WHERE tenant_id = …` filter is behaving as expected. Use this script when diagnosing suspected cross-tenant leakage or when seeding demo data.

- `python verify_supabase_setup.py`  
  Verifies Supabase configuration and shows which backend (Supabase or SQLite) each store is using. Displays any missing configuration and provides a summary of where data will be saved.

- `python check_supabase_rules.py`  
  Checks Supabase admin rules configuration and RLS policies. Validates that rules can be read/written correctly.

- `python migrate_sqlite_to_supabase.py`  
  One-shot migration script that copies existing SQLite data (admin rules + analytics) to Supabase. Supports both PostgreSQL direct connection and Supabase REST API methods.

- `python test_manual.py`  
  The existing manual test runner remains useful for smoke-testing analytics logging, admin rule CRUD, and API response codes. Run it whenever you adjust schemas or update MCP endpoints.

---

## Demo Video

🎥 **[Demo Video Placeholder]** - Coming soon!

Watch how IntegraChat uses MCP to power autonomous agents with multi-tool selection, RAG retrieval, and enterprise governance.

---

## Social Media

📱 **[Social Media Post Placeholder]** - Coming soon!

Follow us for updates and demos of IntegraChat in action!

---

## Team Member(s)

- **Your Name Here** - Developer & MCP Enthusiast

---

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## Technical Stack

### Backend
- **Framework**: FastAPI with async/await for high-performance MCP orchestration
- **MCP Server**: Unified MCP server (port 8900) exposing all tools via namespaces
- **API**: RESTful API with Server-Sent Events (SSE) for streaming responses
- **LLM Integration**: 
  - Ollama (local, default) - `http://localhost:11434`
  - Groq (cloud) - via API key
  - Configurable backend with streaming support

### Frontend
- **Gradio UI**: Full-featured interface with Plotly visualizations (`app.py`)
- **UI Libraries**: 
  - Plotly for interactive charts and visualizations

### Data Storage
- **RAG Vector Store**: PostgreSQL with pgvector extension (via Supabase or direct connection)
- **Analytics**: Supabase (production) or SQLite (development) with indexed queries
- **Rules Storage**: Supabase (production) or SQLite (development) with automatic fallback
- **Database**: PostgreSQL for RAG embeddings, Supabase/SQLite for analytics and rules

### File Processing
- **Supported Formats**: TXT, PDF, DOC, DOCX, Markdown
- **Libraries**: PyPDF2, python-docx for server-side text extraction
- **Metadata**: Rich metadata support (source URL, timestamp, document type)

### Communication
- **Streaming**: Server-Sent Events (SSE) for real-time word-by-word response streaming
- **Protocol**: Model Context Protocol (MCP) for tool communication
- **HTTP**: RESTful endpoints with JSON payloads
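
To make the streaming format concrete, here is a small sketch of word-by-word SSE framing. It is not the project's implementation: `sse_events` and the closing `done` event are assumptions; only the `data:` field syntax and the blank-line terminator come from the SSE format itself.

```python
from typing import Iterator


def sse_events(text: str) -> Iterator[str]:
    """Yield one SSE event per whitespace-separated word, then a closing event."""
    for word in text.split():
        # Each event is a `data:` line terminated by a blank line.
        yield f"data: {word}\n\n"
    # Hypothetical end-of-stream marker so clients know when to stop reading.
    yield "event: done\ndata: [DONE]\n\n"


for event in sse_events("Hello from IntegraChat"):
    print(repr(event))
```

With FastAPI, a generator like this could be wrapped in `StreamingResponse(..., media_type="text/event-stream")`.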

## Recent Enhancements

### UI & UX Improvements (Latest)
- **Document Display Fix**: Fixed document list showing `[object Object]` - now properly displays document ID, type, preview, length, and creation date in a formatted table
- **Rule Deletion Enhancement**: Can now delete rules by entering either:
  - Rule number (e.g., "1", "2", "3") - automatically finds the corresponding rule
  - Full rule text - deletes the exact matching rule
- **LLM Enhancement Toggle**: Added checkbox to enable/disable LLM enhancement when adding rules:
  - **Quick Add**: Uncheck to add rules immediately without LLM processing (no timeout issues)
  - **Enhanced Add**: Check to get better patterns, explanations, and examples (takes longer but higher quality)
- **Improved Timeouts**: Increased timeout for rule enhancement from 45s to 180s to handle multiple rules properly
- **Better Error Messages**: Clearer error messages for rule deletion, document operations, and permission errors

### Role Propagation & Permission Handling (Latest)
- **Fixed Role Propagation**: User role (`viewer`, `editor`, `admin`, `owner`) is now properly passed through the entire ingestion pipeline:
  - UI sends role in `x-user-role` header
  - Backend API route receives and validates role
  - Role is passed to `process_ingestion()` service
  - RAG client includes role in payload to MCP server
  - MCP server uses role for permission checks
- **Improved Error Handling**: Permission errors (403 Forbidden) now return clear, actionable error messages:
  - Clear indication when role lacks required permissions
  - Guidance on which roles can perform specific actions
  - Instructions to change role in UI dropdown
- **Debug Logging**: Added comprehensive debug logging to trace role values through the pipeline for troubleshooting
- **Admin Question Handling**: Questions such as "who is the admin" are now answered with RAG over the knowledge base instead of generic LLM responses
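
The permission check at the end of that pipeline might look like the sketch below. The function names are hypothetical, and defaulting a missing header to `viewer` is an assumption; the header name and the roles allowed to ingest come from this README.

```python
INGEST_ROLES = {"editor", "admin", "owner"}


def get_role(headers: dict[str, str], default: str = "viewer") -> str:
    """Read the role from the x-user-role header (header names are case-insensitive)."""
    for name, value in headers.items():
        if name.lower() == "x-user-role":
            return value.strip().lower()
    # Assumption: a request without the header is treated as the least-privileged role.
    return default


def check_ingest_permission(headers: dict[str, str]) -> tuple[int, str]:
    """Return an HTTP-style status code and a message for an ingestion request."""
    role = get_role(headers)
    if role in INGEST_ROLES:
        return 200, "ok"
    allowed = ", ".join(sorted(INGEST_ROLES))
    return 403, (
        f"Permission denied: role '{role}' cannot ingest documents. "
        f"Switch to one of: {allowed}."
    )


print(check_ingest_permission({"x-user-role": "viewer"}))
print(check_ingest_permission({"X-User-Role": "Editor"}))
```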

### Admin Rules System (Latest)
- **File Upload Support**: Upload rules from TXT, PDF, DOC, DOCX files with drag-and-drop interface
- **LLM Enhancement Toggle**: Optional LLM enhancement with checkbox control:
  - **Quick Add Mode**: Uncheck to add rules immediately without LLM processing (no timeouts)
  - **Enhanced Mode**: Check to get better patterns, explanations, examples, and edge case detection
- **LLM Enhancement**: When enabled, automatic rule enhancement identifies edge cases, improves regex patterns, and suggests severity levels
- **Intelligent Fallback Explanations**: When LLM enhancement times out or fails, the system automatically generates basic explanations using keyword extraction, providing useful examples and pattern suggestions without requiring LLM availability
- **Chunk Processing**: Large rule sets processed in chunks of 5 to prevent timeouts (handles 100+ rules efficiently)
- **Enhanced Timeouts**: Increased timeout from 45s to 180s per chunk to accommodate LLM processing
- **Flexible Rule Deletion**: Delete rules by entering either rule number (e.g., "1") or full rule text
- **Comment Filtering**: Comment lines (starting with #) automatically ignored when uploading rules
- **Rule-First Processing**: Admin rules checked before intent classification - enables behavior control (brief responses vs blocking)
- **Supabase Integration**: Production-ready Supabase support with automatic table creation
- **Streaming Responses**: Word-by-word streaming for chat responses using Server-Sent Events (SSE)
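
Comment filtering and chunking are simple to sketch. The helper names below are hypothetical, and the chunk size of 5 is the value stated above; how each chunk is then sent for enhancement is not shown.

```python
def parse_rules(text: str) -> list[str]:
    """Return non-empty rule lines, ignoring lines that start with '#'."""
    rules = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            rules.append(stripped)
    return rules


def chunked(items: list[str], size: int = 5) -> list[list[str]]:
    """Split items into consecutive chunks of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


uploaded = """# Data protection rules
Block password sharing
Block SSN disclosure

# HR
Keep salary answers brief
"""
rules = parse_rules(uploaded)
print(rules)
print([len(chunk) for chunk in chunked(rules)])
```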

### Conversation Memory System (Latest)
- **Short-Term Memory**: Automatic storage of tool outputs per session with configurable size limits and TTL
- **Session-Based Isolation**: Memory keyed by session_id (not tenant_id) for safety
- **Automatic Injection**: Recent memory automatically injected into tool payloads for multi-step workflows
- **Auto-Expiration**: Memory entries expire after configurable TTL (default: 15 minutes)
- **Session Management**: Memory can be explicitly cleared via `end_session` flag
- **Comprehensive Testing**: Full test suite covering memory storage, retrieval, expiration, and multi-step workflows
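
As an illustration of the behavior listed above, the sketch below keeps a bounded, expiring list of tool outputs per session. `SessionMemory` and its parameters are hypothetical, the size limit of 10 is an arbitrary choice, and only the 15-minute default TTL is taken from this README.

```python
import time
from collections import deque


class SessionMemory:
    """Short-term store of tool outputs, keyed by session_id."""

    def __init__(self, max_entries: int = 10, ttl_seconds: float = 15 * 60,
                 clock=time.monotonic):
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: dict[str, deque] = {}

    def add(self, session_id: str, tool: str, output: str) -> None:
        # deque(maxlen=...) drops the oldest entry once the size limit is hit.
        entries = self._store.setdefault(session_id, deque(maxlen=self._max))
        entries.append((self._clock(), tool, output))

    def recent(self, session_id: str) -> list[dict]:
        """Return unexpired entries, oldest first, for injection into a tool payload."""
        entries = self._store.get(session_id)
        if not entries:
            return []
        now = self._clock()
        while entries and now - entries[0][0] > self._ttl:
            entries.popleft()
        return [{"tool": tool, "output": output} for _, tool, output in entries]

    def end_session(self, session_id: str) -> None:
        """Clear everything stored for the session."""
        self._store.pop(session_id, None)


memory = SessionMemory()
memory.add("session-1", "rag", "Refund window is 30 days")
print(memory.recent("session-1"))
memory.end_session("session-1")
print(memory.recent("session-1"))
```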

### AI-Generated KB Metadata & Advanced RAG (Latest)
- **Automatic Metadata Extraction**: When ingesting documents, system auto-extracts:
  - **Title**: From filename, URL, or content structure (with intelligent fallback)
  - **Summary**: 2-3 sentence summary via LLM (with keyword-based fallback)
  - **Tags**: 5-8 relevant tags extracted from content
  - **Topics**: 3-5 main themes identified via LLM
  - **Date Detection**: Multiple date formats automatically detected
  - **Quality Score**: 0.0-1.0 score based on structure and completeness
- **Intelligent Fallback**: When LLM is unavailable or times out, uses keyword extraction and pattern matching to provide useful metadata
- **Database Integration**: Metadata stored in JSONB column for flexible querying and enhanced RAG search
- **Migration Script**: Safe, idempotent database migration script included
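
A keyword-based fallback for tags could be as simple as a frequency count. This sketch is an assumption about how such a fallback might work, not the project's code; `fallback_tags` and the stopword list are invented for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = {"the", "and", "are", "for", "with", "that", "this", "from", "has", "have"}


def fallback_tags(text: str, max_tags: int = 8) -> list[str]:
    """Return the most frequent non-stopword terms as tags (ties keep first-seen order)."""
    # Terms start with a letter and are at least three characters long.
    words = re.findall(r"[a-z][a-z0-9-]{2,}", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [word for word, _ in counts.most_common(max_tags)]


doc = ("Refund policy: refund requests are processed in 5 days. "
       "Policy applies to all refund types.")
print(fallback_tags(doc, max_tags=3))
```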

### Per-Tool Latency Prediction & Context-Aware Routing (Latest)
- **Latency Prediction**: Agent estimates expected latency before tool selection:
  - RAG: 60-120ms (depends on result count)
  - Web: 400-1800ms (network-dependent)
  - Admin: <20ms (local regex matching)
  - LLM: Variable based on model and token count
- **Path Optimization**: Agent chooses fastest tool sequence based on latency estimates
- **Context-Aware Routing**: Intelligent tool skipping based on previous outputs:
  - High RAG score (≥0.8) → Skip web search
  - Critical admin violation → Skip agent reasoning, immediate block
  - Relevant memory available → Skip RAG, use memory instead
- **Routing Hints**: Context hints included in reasoning trace for transparency
- **Performance Impact**: Skipping tools whose output is not needed shortens the tool sequence and reduces end-to-end latency
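
The routing rules above can be sketched as a small decision function. The names and the context dictionary keys are hypothetical; the latency ranges and the 0.8 score cutoff are the figures listed in this section, with "<20ms" approximated as 0-20ms.

```python
from typing import Optional

# (low, high) latency estimates in milliseconds, from the list above.
LATENCY_MS = {"admin": (0, 20), "rag": (60, 120), "web": (400, 1800)}


def estimate_ms(tools: list[str]) -> tuple[int, int]:
    """Sum the low and high latency estimates for a tool sequence."""
    low = sum(LATENCY_MS[tool][0] for tool in tools)
    high = sum(LATENCY_MS[tool][1] for tool in tools)
    return low, high


def tools_to_skip(context: dict) -> dict[str, str]:
    """Map each skipped tool to the routing hint that explains the skip."""
    if context.get("admin_severity") == "critical":
        reason = "critical admin violation: immediate block"
        return {"rag": reason, "web": reason, "llm": reason}
    skipped = {}
    if context.get("memory_relevant"):
        skipped["rag"] = "relevant memory available"
    score: Optional[float] = context.get("rag_top_score")
    if (score or 0.0) >= 0.8:
        skipped["web"] = "high RAG score"
    return skipped


print(estimate_ms(["admin", "rag", "web"]))
print(tools_to_skip({"rag_top_score": 0.85}))
```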

### Tool Output Schemas (Latest)
- **Strict JSON Schemas**: Every tool returns validated JSON with consistent structure:
  - **RAG**: `{results: [...], top_score: float, latency_ms: int}`
  - **Web**: `{results: [...], latency_ms: int}`
  - **Admin**: `{violations: [...], severity: str, latency_ms: int}`
  - **LLM**: `{text: str, tokens_used: int, latency_ms: int}`
- **Automatic Validation**: All tool outputs validated and formatted before use
- **Easier Debugging**: Consistent structure makes debugging and monitoring simpler
- **Polished Responses**: Schema-validated outputs keep response formatting consistent across tools
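
One way to picture the validation step is a per-tool table of required fields and types. This is a sketch, not the project's validator: `validate_output` is a hypothetical name, and treating unexpected fields as errors is an assumption about what "strict" means here.

```python
# Field names and types copied from the schemas listed above.
SCHEMAS = {
    "rag": {"results": list, "top_score": float, "latency_ms": int},
    "web": {"results": list, "latency_ms": int},
    "admin": {"violations": list, "severity": str, "latency_ms": int},
    "llm": {"text": str, "tokens_used": int, "latency_ms": int},
}


def _matches(value, expected: type) -> bool:
    if isinstance(value, bool):
        return False  # bool is a subclass of int; never accept it as a number
    if expected is float:
        return isinstance(value, (int, float))
    return isinstance(value, expected)


def validate_output(tool: str, payload: dict) -> list[str]:
    """Return a list of schema errors; an empty list means the payload is valid."""
    schema = SCHEMAS[tool]
    errors = []
    for field, expected in schema.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not _matches(payload[field], expected):
            actual = type(payload[field]).__name__
            errors.append(f"wrong type for {field}: expected {expected.__name__}, got {actual}")
    for field in payload:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


print(validate_output("rag", {"results": [], "top_score": 0.82, "latency_ms": 95}))
print(validate_output("llm", {"text": "hi", "latency_ms": 410}))
```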

### Cross-Encoder Re-ranking (Latest)
- **Two-Stage RAG Process**: 
  - Initial vector search retrieves candidates
  - Cross-encoder re-ranks top 10 results for accuracy
  - Final filtering by threshold and limit
- **Model**: Uses `cross-encoder/ms-marco-MiniLM-L-6-v2` (very fast, production-ready)
- **Accuracy Improvement**: The cross-encoder scores each query-document pair jointly, which generally orders results by relevance better than embedding similarity alone
- **Seamless Integration**: Works transparently with existing RAG search API
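
The two-stage flow can be sketched with a pluggable scorer. Everything here is illustrative: `rerank` and `overlap_scorer` are hypothetical, and the toy scorer stands in for a real cross-encoder so the example runs without downloading a model. The appropriate threshold depends on the score scale of whichever scorer is plugged in.

```python
from typing import Callable, Sequence

Pair = tuple[str, str]


def overlap_scorer(pairs: list[Pair]) -> list[float]:
    """Toy stand-in for a cross-encoder: fraction of query tokens found in the document."""
    scores = []
    for query, doc in pairs:
        query_tokens = set(query.lower().split())
        doc_tokens = set(doc.lower().split())
        scores.append(len(query_tokens & doc_tokens) / max(len(query_tokens), 1))
    return scores


def rerank(query: str,
           candidates: list[tuple[str, float]],
           score_pairs: Callable[[list[Pair]], Sequence[float]] = overlap_scorer,
           top_n: int = 10, threshold: float = 0.3, limit: int = 3) -> list[tuple[str, float]]:
    """Stage 2: re-score the top_n vector-search hits, then filter and truncate."""
    # Stage 1 output: (text, vector_similarity) pairs from the vector search.
    head = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_n]
    scores = score_pairs([(query, text) for text, _ in head])
    ranked = sorted(zip((text for text, _ in head), scores),
                    key=lambda item: item[1], reverse=True)
    return [(text, score) for text, score in ranked if score >= threshold][:limit]


hits = [
    ("pgvector stores embeddings", 0.90),
    ("reset your password in settings", 0.50),
    ("password reset emails expire", 0.40),
]
for text, score in rerank("password reset emails", hits):
    print(f"{score:.2f}  {text}")
```

In this toy run the hit with the highest vector similarity is dropped after re-scoring, which is the effect re-ranking is meant to have. A real scorer such as `CrossEncoder(...).predict` from `sentence-transformers` could be passed as `score_pairs`.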

### Context Engineering (Latest)
- **Anthropic-Inspired Strategies**: Implements best practices from Anthropic's context engineering research:
  - **Compaction**: High-fidelity summarization preserving architectural decisions, unresolved issues, and implementation details
  - **Tool Result Clearing**: Safest form of compaction - removes large tool outputs once processed
  - **Structured Note-Taking**: Tracks objectives (like Claude playing Pokémon), architectural decisions, and unresolved issues
  - **XML-Structured Prompts**: All prompts use clear XML sections (`<system>`, `<background_information>`, `<instructions>`) for better model understanding
  - **Automatic Compression**: Conversations compressed at 80% token threshold, targeting 60% after compression
  - **Just-in-Time Context**: Selects only relevant memories and tools for each query
  - **Progressive Disclosure**: Agents discover context incrementally through exploration
- **Benefits**: 
  - Reduced token usage and costs
  - Longer conversation support
  - Better agent coherence across extended interactions
  - Improved performance through structured context
- **Documentation**: Context engineering features are integrated throughout the agent orchestrator and MCP server
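
The 80%/60% compression rule can be illustrated with a short sketch. The assumptions are loud ones: tokens are approximated by whitespace-separated words, `compact` is a hypothetical name, and the default `summarize` callable only emits a placeholder where a real system would call an LLM.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words."""
    return len(text.split())


def compact(messages: list[str], budget: int,
            summarize: Callable[[list[str]], str] =
            lambda old: f"[summary of {len(old)} earlier messages]") -> list[str]:
    """Compress once usage reaches 80% of budget, aiming for at most 60% afterwards."""
    used = sum(count_tokens(m) for m in messages)
    if used < 0.8 * budget or len(messages) <= 1:
        return list(messages)
    target = 0.6 * budget
    split = 0
    # Fold more of the oldest messages into the summary until the result fits.
    while split < len(messages) - 1:
        split += 1
        candidate = [summarize(messages[:split])] + list(messages[split:])
        if sum(count_tokens(m) for m in candidate) <= target:
            return candidate
    # Best effort: summary plus the single most recent message.
    return [summarize(messages[:split])] + list(messages[split:])


history = [" ".join(["word"] * 10) for _ in range(10)]  # 100 "tokens"
compacted = compact(history, budget=100)
print(len(history), "->", len(compacted), "messages")
print(compacted[0])
```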

### UI Improvements
- **Modern Drag-and-Drop**: Intuitive file upload with visual feedback
- **Enhanced Status Messages**: Clear success/error messages with icons
- **Refresh Button in Table**: Quick refresh directly from the Rule Set section
- **Better Visual Hierarchy**: Improved spacing, colors, and layout
- **Gradio UI Enhancements**: 
  - AI metadata displayed after document ingestion
  - Latency predictions shown in reasoning trace
  - Context-aware routing hints visualized
  - Tool output schemas displayed in debug view

## Key Technical Features

### Tenant Isolation & Normalization
- **Strict tenant isolation** enforced at database level with `WHERE tenant_id = ...` filters
- **Automatic tenant ID normalization** handles whitespace and formatting differences
- Documents can be listed and deleted consistently across different tenant_id formats
- All operations validate tenant ownership before execution
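
A minimal sketch of normalization plus a tenant-filtered query follows. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table layout, `normalize_tenant_id`, and the choice to lowercase IDs are all assumptions made for illustration.

```python
import sqlite3


def normalize_tenant_id(raw: str) -> str:
    """Trim surrounding whitespace and lowercase (lowercasing is an assumption)."""
    return raw.strip().lower()


def list_documents(conn: sqlite3.Connection, tenant_id: str) -> list[tuple]:
    """Return only the rows owned by the given tenant, using a bound parameter."""
    cursor = conn.execute(
        "SELECT id, content FROM documents WHERE tenant_id = ? ORDER BY id",
        (normalize_tenant_id(tenant_id),),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, tenant_id TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO documents (tenant_id, content) VALUES (?, ?)",
    [(normalize_tenant_id(" ACME "), "acme handbook"),
     (normalize_tenant_id("globex"), "globex secrets"),
     (normalize_tenant_id("acme"), "acme faq")],
)
print(list_documents(conn, "  Acme"))
```

Normalizing on both write and read is what lets differently formatted IDs resolve to the same tenant.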

### RAG Search & Retrieval
- **Cross-Encoder Re-ranking**: Two-stage retrieval process that improves result relevance:
  - First: Vector search retrieves top candidates using embeddings
  - Then: Cross-encoder model (`cross-encoder/ms-marco-MiniLM-L-6-v2`) re-ranks top 10 results
  - Final: Results filtered by threshold and limit applied
- **Optimized similarity threshold** (default 0.3) for better recall of relevant documents
- **Intelligent fallback** returns top result even if below threshold to ensure knowledge base content is accessible
- **Pattern-based tool selection** automatically triggers RAG for admin questions, fact lookups, and internal knowledge queries
- **Response unwrapping** ensures seamless integration between MCP server and orchestrator
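
The threshold-with-fallback behavior described above can be sketched in a few lines. `filter_with_fallback` is a hypothetical name and the limit of 5 is arbitrary; the 0.3 default is the threshold stated in this README.

```python
def filter_with_fallback(results: list[tuple[str, float]],
                         threshold: float = 0.3,
                         limit: int = 5) -> list[tuple[str, float]]:
    """Keep results at or above the threshold; if none qualify, return the single best one."""
    ranked = sorted(results, key=lambda item: item[1], reverse=True)
    passing = [item for item in ranked if item[1] >= threshold][:limit]
    if not passing and ranked:
        # Fallback: surface the best match even though it is below the threshold.
        return ranked[:1]
    return passing


print(filter_with_fallback([("doc a", 0.12), ("doc b", 0.25)]))
print(filter_with_fallback([("doc a", 0.45), ("doc b", 0.25)]))
```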

### MCP Server Architecture
- **Unified server** running on a single port (default 8900) for all namespaced tools
- **Dual protocol support**: Both MCP protocol (POST with JSON) and RESTful HTTP (GET/DELETE)
- **Response wrapping**: Standardized response format with automatic unwrapping in clients
- **Error handling**: Comprehensive error responses with detailed messages for debugging

## UI Features

### Knowledge Base Library
- **Visual Statistics**: Real-time document counts and type distribution
- **Interactive Charts**: Plotly pie charts for document type visualization
- **Advanced Search**: Semantic search across all ingested documents with relevance scoring
- **Smart Filtering**: Filter by document type (text, PDF, FAQ, link)
- **Bulk Operations**: Delete individual documents or all documents at once
- **Auto-refresh**: Lists automatically update after operations

### Admin Analytics Dashboard
- **Statistics Cards**: Key metrics displayed in visually appealing cards with icons
- **Tool Usage Visualization**: Bar charts showing tool invocation counts and performance
- **Latency Metrics**: Visual representation of tool response times
- **RAG Quality Analysis**: Charts displaying search quality metrics (hits, scores, recall)
- **Detailed Tables**: Comprehensive tool usage breakdown with success/error rates
- **Dark Theme**: Modern UI with dark background and white text for better readability
- **Real-time Updates**: Fetch latest analytics data with a single click

## Acknowledgments

- Built with [Model Context Protocol (MCP)](https://modelcontextprotocol.io/)
- Powered by [Gradio](https://gradio.app/) for the interface
- Visualizations created with [Plotly](https://plotly.com/python/)
- Backend built with [FastAPI](https://fastapi.tiangolo.com/)
- Analytics and governance features inspired by enterprise AI platform requirements

---

<div align="center">

**Made with ❤️ for the MCP Hackathon**

**IntegraChat: Enterprise-Grade MCP Autonomous Agent Platform**

[⬆ Back to Top](#integrachat--enterprise-mcp-autonomous-agent-platform)

</div>