---
title: IntegraChat
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.20.0"
app_file: app.py
pinned: false
---
# IntegraChat — Enterprise MCP Autonomous Agent Platform
**Track:** MCP in Action
**Category:** Enterprise
**Tag:** `mcp-in-action-track-enterprise`
---
## 📋 Table of Contents
- [Overview](#overview)
- [Quick Start](#quick-start)
- [Features](#features)
- [Conversation Memory System](#conversation-memory-system)
- [Role-Based Access Control (RBAC)](#role-based-access-control-rbac)
- [Installation & Setup](#installation--setup)
- [Usage](#usage)
- [API Endpoints](#api-endpoints)
- [Architecture](#architecture)
- [Supabase Setup & Migration](#supabase-setup--migration)
- [Troubleshooting](#troubleshooting)
- [Testing & Diagnostics](#testing--diagnostics)
- [Technical Stack](#technical-stack)
- [License](#license)
---
## Overview
**IntegraChat** is an enterprise-grade, multi-tenant AI platform that demonstrates the full capabilities of the **Model Context Protocol (MCP)** in a production-style environment. Built with enterprise governance and observability in mind, IntegraChat combines autonomous tool-using agents, RAG retrieval, live web search, and admin compliance under strict tenant isolation.
This platform showcases how MCP can power intelligent, governed, multi-tenant AI systems with real-time analytics, regex-based red-flag detection, and comprehensive tool orchestration.
---
## 🚀 Quick Start
### Windows Users
```bash
# 1. Install dependencies
pip install -r requirements.txt
# 2. Configure environment (copy and edit .env)
copy env.example .env
# Edit .env with your credentials (Supabase, LLM, etc.)
# 3. Start all services
start.bat
```
### Manual Setup
```bash
# 1. Install dependencies
pip install -r requirements.txt
# 2. Configure environment
cp env.example .env
# Edit .env with your credentials
# 3. Start FastAPI backend (Terminal 1)
uvicorn backend.api.main:app --port 8000 --reload
# 4. Start unified MCP server (Terminal 2)
python backend/mcp_server/server.py
# 5. Start Gradio UI (Terminal 3)
python app.py
```
Then access:
- **Gradio UI**: `http://localhost:7860`
- **FastAPI Docs**: `http://localhost:8000/docs`
> **Security Note:** REST requests that hit protected endpoints must include both `x-tenant-id` and `x-user-role` headers. Roles (`viewer`, `editor`, `admin`, `owner`) determine which actions—such as document ingestion, rule uploads, or analytics access—the caller may perform.
---
## Features
### Core Capabilities
- 🤖 **Autonomous Multi-Step MCP Agents** – Intelligent tool-aware agent that plans and executes multi-step workflows across RAG, Web, Admin, and LLM tools with short-term conversation memory
- 💭 **Short-Term Conversation Memory** – Automatic memory system that stores the last N tool outputs per session with configurable expiration (default: 10 outputs, 15 minutes TTL). Memory is keyed by session_id (not tenant_id) for safety, enabling better context awareness in multi-step workflows. Memory is automatically injected into tool payloads and cleared on session end.
- 📚 **Enhanced Knowledge Base Management** – Upload raw text, URLs, or documents (PDF/DOCX/TXT/MD) with rich metadata (source URL, timestamp, document type) and optimized chunking (400-600 tokens)
- 🤖 **AI-Generated KB Metadata** – Automatic extraction of title, summary, tags, topics, date, and quality score during document ingestion. LLM-powered with intelligent fallback when unavailable - uses keyword extraction and pattern matching to provide useful metadata even during timeouts
- 🔍 **Optimized RAG Search with Cross-Encoder Re-ranking** – Two-stage retrieval: an initial vector search followed by cross-encoder re-ranking of the top candidates using `cross-encoder/ms-marco-MiniLM-L-6-v2` to improve ranking accuracy. Semantic search uses a configurable similarity threshold (default 0.3) for better recall
- **Per-Tool Latency Prediction** – Agent estimates expected latency before choosing tools (RAG: 60-120ms, Web: 400-1800ms, Admin: <20ms) and uses the estimates to prefer the fastest path
- 🧠 **Context-Aware MCP Routing** – Intelligent tool selection based on previous outputs: skip web search if RAG returns a high score (≥0.8), skip agent reasoning for critical admin violations, and skip RAG if relevant memory is already available. This avoids redundant tool calls
- 📋 **Tool Output Schemas** – Every tool returns strict JSON type schemas for easier debugging, cleaner reasoning, and more polished responses. Automatic schema validation and formatting
- 🗑️ **Document Management** – Delete individual documents or bulk delete all documents for a tenant with confirmation dialogs
- 🛡️ **Enterprise Admin Governance** – Advanced rule management system with:
  - Regex-based red-flag pattern matching with severity levels (low/medium/high/critical)
  - Automatic admin alerts for violations
  - **LLM-Enhanced Rules**: Rules are automatically analyzed and enhanced to identify edge cases, improve regex patterns, and suggest appropriate severity levels
  - **LLM-Guided Rule Explanations**: Automatic generation of human-readable explanations, concrete examples, and missing pattern suggestions. Includes intelligent fallback when LLM is unavailable - uses keyword extraction to provide useful explanations even during timeouts
  - **File Upload Support**: Upload rules from TXT, PDF, DOC, or DOCX files with drag-and-drop interface
  - **Chunk Processing**: Large rule sets processed in manageable chunks (5 rules at a time) to prevent timeouts
  - **Rule-Based Behavior Control**: Rules checked FIRST - brief response rules return quick answers, blocking rules prevent requests
  - **Comment Filtering**: Comment lines (starting with #) automatically ignored when uploading rules
  - **Supabase Integration**: Rules stored in Supabase for production scalability (with SQLite fallback)
- 📊 **Comprehensive Analytics & Observability** – Full tenant-level analytics logging with Supabase backend (SQLite fallback for local dev):
  - Tool usage breakdown (RAG, Web, Admin, LLM) with latency and token tracking
  - RAG recall/precision indicators (average hits, scores, top scores)
  - Per-tenant query volume and active users
  - Red-flag violations with timestamps and confidence scores
  - LLM token logs and latency metrics
  - **Real-Time Visualizations**: Reasoning path visualizer, tool invocation timeline, and tenant activity heatmap
- 🌐 **Live Web Search** – Google Programmable Search (Custom Search API) with tenant-aware MCP tooling
- 🏢 **Multi-Tenant Isolation** – Complete tenant isolation with centralized tenant ID management; backend enforces strict isolation for chat, ingestion, and admin ops
- 🔐 **Fine-Grained Role-Based Access Control (RBAC)** – Four-tier role system (viewer, editor, admin, owner) with backend permission enforcement
- 🔄 **Intelligent Multi-Tool Orchestration** – MCP agent orchestrator autonomously selects optimal tool chains (RAG + Web + LLM, etc.) based on query intent, context, latency predictions, and previous tool outputs. Context-aware routing enables sophisticated tool skipping for efficiency
- **Robust Error Handling** – Structured error responses, retry mechanisms, and graceful fallbacks (e.g., if RAG fails → fallback to LLM-only)
- 📡 **Streaming Responses** – Chat responses stream word-by-word using Server-Sent Events (SSE) for real-time user experience
- 🎯 **Rule-First Processing** – Admin rules checked before intent classification - rules can trigger brief responses or block requests entirely
- 🧠 **Advanced Context Engineering** – Implements Anthropic's context engineering strategies:
  - **High-Fidelity Compaction**: Automatically compresses conversations at 80% token threshold, preserving architectural decisions and unresolved issues
  - **Tool Result Clearing**: Safest form of compaction - removes large tool outputs while keeping metadata
  - **Structured Note-Taking**: Tracks objectives, architectural decisions, and unresolved issues outside context window
  - **XML-Structured Prompts**: All prompts use clear XML sections for better model understanding
  - **Just-in-Time Context Loading**: Selects only relevant memories and tools for each query
  - **Progressive Disclosure**: Agents discover context incrementally through exploration
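
The context-aware routing and latency prediction items in the list above can be sketched roughly as follows. This is a minimal illustration, not IntegraChat's actual implementation: `RoutingContext` and `plan_tools` are hypothetical names, and only the 0.8 score threshold and the latency ranges are taken from the feature list.

```python
from dataclasses import dataclass
from typing import Optional

# Expected latency ranges quoted in the feature list, in milliseconds.
# "Admin: <20ms" is represented here as (0, 20).
LATENCY_MS = {"rag": (60, 120), "web": (400, 1800), "admin": (0, 20)}

RAG_SKIP_WEB_SCORE = 0.8  # threshold quoted in the feature list


@dataclass
class RoutingContext:
    """Hypothetical summary of what the agent knows so far."""
    critical_admin_violation: bool = False
    memory_is_relevant: bool = False
    top_rag_score: Optional[float] = None  # None until RAG has run


def plan_tools(ctx: RoutingContext) -> list[str]:
    """Return the tools still worth calling, lowest expected latency first."""
    if ctx.critical_admin_violation:
        return []  # critical violation: skip agent reasoning entirely
    tools = []
    # Skip RAG when relevant memory exists or RAG has already run.
    if not ctx.memory_is_relevant and ctx.top_rag_score is None:
        tools.append("rag")
    # Skip web search when RAG already returned a high-confidence hit.
    if ctx.top_rag_score is None or ctx.top_rag_score < RAG_SKIP_WEB_SCORE:
        tools.append("web")
    # Order by the midpoint of each tool's expected latency range.
    return sorted(tools, key=lambda name: sum(LATENCY_MS[name]) / 2)
```

Keeping the skip rules in one pure function makes the routing decisions easy to unit-test independently of the tools themselves.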
### Enterprise Features
- 🔍 **Regex-Based Red-Flag Detection** – Support for complex regex patterns with keyword fallback and semantic scoring
- 🤖 **LLM-Enhanced Rule Management** – Rules automatically enhanced by LLM to identify edge cases, improve patterns, and suggest severity levels. Includes intelligent fallback explanations when LLM is unavailable - uses keyword extraction to generate useful explanations, examples, and pattern suggestions even during timeouts
- 📄 **File Upload & Drag-and-Drop** – Upload rules from files (TXT, PDF, DOC, DOCX) with intuitive drag-and-drop interface
- **Chunk-Wise Processing** – Large rule sets processed in chunks to prevent timeouts and ensure reliable processing
- 📈 **Real-Time Analytics Dashboard** – Per-tenant analytics with configurable time windows (7, 30, 90 days)
- 🛠️ **Admin API Endpoints** – `/admin/violations`, `/admin/tools/logs`, `/admin/tenants` for comprehensive governance
- 🧠 **Agent Debug & Planning** – `/agent/debug` and `/agent/plan` endpoints for observability and tool selection inspection
- 💾 **Persistent Analytics Storage** – Supabase-backed analytics store (with automatic SQLite fallback) for fast, multi-tenant queries
- 🗄️ **Supabase Integration** – Production-ready Supabase support for admin rules with automatic table creation
- 📈 **Real-Time Visualization Components** – Interactive visualizations for agent reasoning, tool execution, and tenant activity:
  - **Reasoning Path Visualizer**: Step-by-step visualization of agent decision-making with animated progression
  - **Tool Invocation Timeline**: Visual timeline showing tool execution order, latency, and result counts
  - **Tenant Activity Heatmap**: Query activity heatmap and per-tool usage trends over time
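
A minimal sketch of regex-based red-flag detection with a keyword fallback, assuming each rule carries an optional regex `pattern` and a `severity`. The `Rule` class, the helper names, and the keyword heuristic (any word of six or more letters from the rule text) are illustrative assumptions; the semantic scoring step mentioned above is omitted.

```python
import re
from dataclasses import dataclass
from typing import Optional

SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass
class Rule:
    text: str                      # human-readable rule
    severity: str = "medium"       # one of SEVERITY_ORDER
    pattern: Optional[str] = None  # optional regex


def matches(rule: Rule, message: str) -> bool:
    """Try the regex first; fall back to keyword matching."""
    if rule.pattern:
        try:
            return re.search(rule.pattern, message, re.IGNORECASE) is not None
        except re.error:
            pass  # invalid regex: fall through to the keyword check
    # Assumed heuristic: any long word from the rule text appears in the message.
    keywords = re.findall(r"[a-z]{6,}", rule.text.lower())
    lowered = message.lower()
    return any(word in lowered for word in keywords)


def find_violations(rules: list[Rule], message: str) -> list[Rule]:
    """Return matching rules, most severe first."""
    hits = [rule for rule in rules if matches(rule, message)]
    return sorted(hits, key=lambda r: SEVERITY_ORDER.index(r.severity), reverse=True)
```

Sorting by severity lets a caller inspect only the first hit to decide whether a request should be blocked outright.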
### Conversation Memory System
IntegraChat includes a **short-term conversation memory** system that enhances multi-step workflows by maintaining context across tool calls:
- **Automatic Storage**: Every tool output is automatically stored in memory for the session
- **Bounded Size**: Keeps only the last N tool outputs (configurable via `MCP_MEMORY_MAX_ITEMS`, default: 10)
- **Auto-Expiration**: Entries automatically expire after a configurable TTL (via `MCP_MEMORY_TTL_SECONDS`, default: 900 seconds / 15 minutes)
- **Session-Based**: Memory is keyed by `session_id` (not `tenant_id`) for safety and isolation
- **Automatic Injection**: Recent memory is automatically injected into tool payloads as a `memory` field for multi-step workflows
- **Session Clearing**: Memory can be explicitly cleared by sending `end_session: true` or `endSession: true` in the payload
**Usage Example:**
```json
{
  "tenant_id": "acme",
  "session_id": "chat-abc-123",
  "query": "Search for X"
}
```
Subsequent tool calls with the same `session_id` will receive a `memory` field containing recent tool outputs, enabling tools to make context-aware decisions in multi-step workflows.
**Configuration:**
- `MCP_MEMORY_MAX_ITEMS`: Maximum number of tool outputs to keep per session (default: 10)
- `MCP_MEMORY_TTL_SECONDS`: Time-to-live for memory entries in seconds (default: 900)
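
The behavior described above (bounded size, TTL expiry, keying by `session_id`) can be sketched with a small in-memory store. `SessionMemory` is a hypothetical class written for illustration; only the two environment variable names and their defaults come from this README.

```python
import os
import time
from collections import defaultdict, deque


class SessionMemory:
    """Keeps the last N tool outputs per session and hides expired ones."""

    def __init__(self, max_items=10, ttl_seconds=900.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        # deque(maxlen=N) silently drops the oldest entry once N is exceeded.
        self._store = defaultdict(lambda: deque(maxlen=max_items))

    def add(self, session_id, tool_output):
        self._store[session_id].append((self._clock(), tool_output))

    def recent(self, session_id):
        """Return unexpired outputs, oldest first (the injected `memory` field)."""
        cutoff = self._clock() - self.ttl_seconds
        entries = self._store.get(session_id, ())
        return [output for stamp, output in entries if stamp >= cutoff]

    def clear(self, session_id):
        """Drop everything for a session, e.g. on `end_session: true`."""
        self._store.pop(session_id, None)


# Configure from the documented environment variables and defaults.
memory = SessionMemory(
    max_items=int(os.getenv("MCP_MEMORY_MAX_ITEMS", "10")),
    ttl_seconds=float(os.getenv("MCP_MEMORY_TTL_SECONDS", "900")),
)
```

The injectable `clock` argument is a design convenience for testing expiry without sleeping.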
---
## Role-Based Access Control (RBAC)
IntegraChat implements fine-grained role-based access control (RBAC) for backend API endpoints. This ensures that users can only access features appropriate for their role level.
### Roles
The system supports four roles with increasing privileges:
1. **viewer** (default) - Basic read-only access
   - ✅ Can use chat functionality
   - ❌ Cannot ingest documents
   - ❌ Cannot delete documents
   - ❌ Cannot view analytics
   - ❌ Cannot manage admin rules
2. **editor** - Content management access
   - ✅ Can use chat functionality
   - ✅ Can ingest documents (upload, paste, URLs, files)
   - ❌ Cannot delete documents
   - ❌ Cannot view analytics
   - ❌ Cannot manage admin rules
3. **admin** - Administrative access
   - ✅ Can use chat functionality
   - ✅ Can ingest documents
   - ✅ Can delete documents
   - ✅ Can view analytics
   - ✅ Can manage admin rules
4. **owner** - Full system access
   - Same permissions as admin (highest privilege level)
### Permission Matrix
| Action | viewer | editor | admin | owner |
|--------|--------|--------|-------|-------|
| Chat Bot | ✅ | ✅ | ✅ | ✅ |
| Ingest Documents | ❌ | ✅ | ✅ | ✅ |
| Delete Documents | ❌ | ❌ | ✅ | ✅ |
| View Analytics | ✅ | ✅ | ✅ | ✅ |
| Manage Rules | ❌ | ❌ | ✅ | ✅ |
### Backend RBAC
Backend API endpoints enforce RBAC through the `x-user-role` header:
```python
# Permission matrix in backend/mcp_server/common/access_control.py
PERMISSIONS = {
    "manage_rules": {"owner", "admin"},
    "ingest_documents": {"owner", "admin", "editor"},
    "delete_documents": {"owner", "admin"},
    "view_analytics": {"owner", "admin"},
}
```
**Protected Endpoints:**
- `/admin/rules` - Requires `admin` or `owner` role
- `/rag/ingest*` - Requires `editor`, `admin`, or `owner` role
- `/rag/delete*` - Requires `admin` or `owner` role
- `/analytics/*` - All roles can view (viewer, editor, admin, owner)
**Role Propagation:**
The user role is automatically propagated through the entire request pipeline:
1. Client sends `x-user-role` header
2. Backend API route receives and validates role
3. Role is passed to service layer (`process_ingestion()`, etc.)
4. Service layer passes role to MCP clients
5. MCP clients include role in payload to MCP server
6. MCP server extracts role and enforces permissions
**Example Request:**
```bash
curl -X POST "http://localhost:8000/admin/rules" \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: tenant123" \
  -H "x-user-role: admin" \
  -d '{"rule": "Do not share passwords"}'
```
If the role lacks permission, the API returns `403 Forbidden` with a descriptive error message that includes:
- Which role was used
- Which roles are allowed for the action
- Instructions to change role in the UI
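
A permission check along the lines described above might look like the following sketch. The `PERMISSIONS` mapping is copied from the snippet earlier in this section; `PermissionDenied` and `require_permission` are hypothetical names, and the exact error wording is an assumption.

```python
PERMISSIONS = {
    "manage_rules": {"owner", "admin"},
    "ingest_documents": {"owner", "admin", "editor"},
    "delete_documents": {"owner", "admin"},
    "view_analytics": {"owner", "admin"},
}


class PermissionDenied(Exception):
    """Raised when a role may not perform an action (maps to HTTP 403)."""
    status_code = 403


def require_permission(role, action):
    """Return the normalized role, or raise PermissionDenied."""
    # A missing header falls back to the default role, viewer.
    normalized = (role or "viewer").strip().lower()
    allowed = PERMISSIONS[action]
    if normalized not in allowed:
        raise PermissionDenied(
            f"Role '{normalized}' cannot perform '{action}'. "
            f"Allowed roles: {', '.join(sorted(allowed))}. "
            "Change your role in the UI and retry."
        )
    return normalized
```

In a FastAPI route, the value of the `x-user-role` header would be passed as `role`, and the exception translated into a `403 Forbidden` response.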
### Using RBAC
1. **Set Role**: Include `x-user-role` header in API requests with one of: `viewer`, `editor`, `admin`, or `owner`
2. **Verify Permissions**: Backend enforces role-based access automatically
3. **Error Handling**: API returns `403 Forbidden` with clear error messages when role lacks required permissions
---
## Real-Time Visualization Features
IntegraChat includes three powerful visualization components that provide real-time insights into agent behavior and system activity:
### 1. Reasoning Path Visualizer
- **What it shows**: Step-by-step visualization of how the agent makes decisions
- **Features**:
  - Animated progression through reasoning steps
  - Status indicators (pending, running, completed, error)
  - Detailed metrics per step (latency, hit counts, token estimates)
  - Visual icons for each step type
- **Where to find it**:
  - Gradio app: Debug & Reasoning tab
- **Data source**: `reasoning_trace` from agent responses
### 2. Tool Invocation Timeline
- **What it shows**: Visual timeline of all tool executions during an agent interaction
- **Features**:
  - Color-coded bars showing tool status (success/error)
  - Latency visualization per tool
  - Result count badges
  - Summary statistics (total tools, total time, average latency)
- **Where to find it**:
  - Gradio app: Debug & Reasoning tab
- **Data source**: `tool_traces` from agent responses
### 3. Tenant Activity Heatmap
- **What it shows**: Query activity patterns and tool usage trends over time
- **Features**:
  - Hour-by-hour, day-by-day activity heatmap
  - Color intensity based on activity level
  - Per-tool usage trends with bar charts
  - Trend indicators (up/down/stable)
- **Where to find it**:
  - Gradio app: Admin Analytics tab
  - Configurable time window (default: 7 days)
- **Data source**: `/analytics/activity` and `/analytics/tool-usage` endpoints
**Access**: All visualization features are available to all roles (viewer, editor, admin, owner).
---
## Installation & Setup
### Prerequisites
- **Python 3.10+** with pip
- **PostgreSQL** (with pgvector extension) or **Supabase** for RAG storage
- **Supabase** (recommended) or SQLite for admin rules and analytics
- **Ollama** (local) or **Groq API** credentials for LLM
- **Google Custom Search API** (optional, for web search):
  - Enable Custom Search API in [Google Cloud Console](https://console.cloud.google.com/)
  - Create API key → set as `GOOGLE_SEARCH_API_KEY` in `.env`
  - Create Programmable Search Engine → set ID as `GOOGLE_SEARCH_CX_ID` in `.env`
### Step-by-Step Installation
1. **Clone and navigate to the project**:
```bash
cd IntegraChat
```
2. **Create and activate virtual environment** (recommended):
```bash
# Windows
python -m venv venv
venv\Scripts\activate
# Linux/Mac
python3 -m venv venv
source venv/bin/activate
```
3. **Install Python dependencies**:
```bash
pip install -r requirements.txt
```
4. **Configure environment variables**:
```bash
cp env.example .env
# Edit .env with your credentials:
# - SUPABASE_URL and SUPABASE_SERVICE_KEY (for production storage)
# - POSTGRESQL_URL (for RAG vector database)
# - OLLAMA_URL/OLLAMA_MODEL or GROQ_API_KEY (for LLM)
# - GOOGLE_SEARCH_API_KEY and GOOGLE_SEARCH_CX_ID (optional, for web search)
```
5. **Set up Supabase** (recommended for production):
   - Create a Supabase project at [supabase.com](https://supabase.com)
   - Run `supabase_admin_rules_table.sql` in Supabase SQL Editor
   - Run `supabase_analytics_tables.sql` in Supabase SQL Editor
   - Copy your project URL and service role key to `.env`
   - Verify setup: `python verify_supabase_setup.py`
6. **Start the services**:
**Option A: Windows Quick Start** (recommended for Windows):
```bash
start.bat
```
This automatically starts:
- FastAPI backend on port 8000
- Unified MCP server on port 8900
**Option B: Manual Start**:
```bash
# Terminal 1: FastAPI backend
uvicorn backend.api.main:app --port 8000 --reload
# Terminal 2: Unified MCP server
python backend/mcp_server/server.py
```
7. **Launch the UI**:
**Gradio Interface** (full-featured):
```bash
python app.py
```
Access at `http://localhost:7860`
## Usage
### Gradio Interface (`app.py`)
The Gradio UI provides a comprehensive interface with five main tabs:
#### 1. **Chat** 💬
- Enter your Tenant ID and start chatting with the MCP-powered agent
- Real-time streaming responses (word-by-word using SSE)
- Autonomous tool orchestration (RAG, Web, Admin, LLM)
- Multi-step planning with memory of previous tool outputs
#### 2. **Document Ingestion** 📚
- **Raw Text**: Paste text directly
- **URL**: Ingest content from web URLs
- **File Upload**: Upload PDF, DOCX, TXT, or Markdown files
- Rich metadata support (filename, URL, document ID, custom JSON)
- View and manage ingested documents
#### 3. **Knowledge Base Library** 📖
- **Statistics Dashboard**: Visual cards showing document counts by type
- **Interactive Charts**: Plotly pie chart for document type distribution
- **Semantic Search**: Search knowledge base with relevance scoring
- **Type Filtering**: Filter by document type (text, PDF, FAQ, link)
- **Document Management**: View, preview, and delete documents
- **Auto-refresh**: Lists update automatically after operations
#### 4. **Admin Analytics** 📊
- **Statistics Cards**: Total queries, active users, red flags, RAG searches
- **Interactive Bar Charts**:
- Tool Usage Count (RAG, Web, Admin, LLM)
- Average Tool Latency (performance metrics)
- RAG Quality Metrics (hits, scores, recall indicators)
- **Tool Usage Table**: Detailed performance breakdown
- **Formatted Summary**: Key metrics in easy-to-read format
- Click "🔄 Fetch Analytics Snapshot" to load latest data
#### 5. **Admin Rules & Compliance** 🛡️
- **Text Input**: Paste rules one per line (comments starting with # are ignored)
- **File Upload**: Upload rules from TXT, PDF, DOC, or DOCX files
- **LLM Enhancement**: Automatic rule enhancement (edge cases, pattern improvements, severity suggestions)
- **Chunk Processing**: Large rule sets processed in chunks (5 at a time)
- **Rule-Based Behavior**: Rules checked FIRST - brief responses or blocking based on severity
- **Streaming Responses**: Real-time word-by-word streaming
- **Refresh Button**: Update rules table directly
> **💡 Tip:** Every action requires a Tenant ID. The Tenant ID persists across page refreshes and is managed centrally.
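
The rule input handling described for the Admin Rules tab (one rule per line, `#` comments ignored, chunks of 5) can be sketched as below. `parse_rules` and `chunked` are hypothetical helper names, not the project's actual functions.

```python
def parse_rules(raw_text):
    """Split pasted text into rules, skipping blank lines and # comments."""
    rules = []
    for line in raw_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            rules.append(line)
    return rules


def chunked(items, size=5):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


raw = """
# Security rules
Do not share passwords
Do not reveal internal hostnames
"""
for chunk in chunked(parse_rules(raw)):
    print(f"processing {len(chunk)} rule(s)")
```

Processing each chunk separately means a slow LLM enhancement call can only time out one small batch instead of the whole upload.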
---
## API Endpoints
All endpoints are served by the FastAPI backend at `http://localhost:8000`. Most endpoints require the `x-tenant-id` header for tenant isolation.
> **📖 API Documentation**: Interactive Swagger docs available at `http://localhost:8000/docs` when the backend is running.
### Agent Endpoints
| Method | Endpoint | Description |
| --- | --- | --- |
| `POST` | `/agent/message` | Main chat endpoint with `tenant_id`, `message`, optional history |
| `POST` | `/agent/message/stream` | Streaming chat endpoint using Server-Sent Events (SSE). Returns tokens word-by-word |
| `POST` | `/agent/debug` | Detailed debugging info: reasoning trace, tool selection, intent classification |
| `POST` | `/agent/plan` | Tool selection plan without execution (intent, tool scores, planned steps) |
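
For clients consuming `/agent/message/stream`, a generic Server-Sent Events reader is enough to reassemble the response. The sketch below follows the standard SSE framing (`data:` lines, blank line between events); what each event's payload contains for this endpoint is not specified here, so treating it as a plain text fragment is an assumption.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event in a line stream."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The format allows one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
```

Any HTTP client that can iterate over decoded response lines can feed this generator; an event that is not terminated by a blank line before the stream ends is dropped, as the SSE format prescribes.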
### RAG Endpoints
| Method | Endpoint | Description |
| --- | --- | --- |
| `POST` | `/rag/ingest-document` | Ingest document with `source_type`, `content`, metadata. Supports raw text, URLs, PDFs, DOCX, TXT, Markdown |
| `POST` | `/rag/ingest-file` | Multipart file upload (PDF/DOCX/TXT/MD) with `x-tenant-id` header |
| `GET` | `/rag/list?tenant_id={id}&limit={n}&offset={n}` | List all documents for a tenant with pagination |
| `DELETE` | `/rag/delete/{document_id}?tenant_id={id}` | Delete a specific document by ID |
| `DELETE` | `/rag/delete-all?tenant_id={id}` | Delete all documents for a tenant |
**Note:** RAG endpoints support both `x-tenant-id` header and `tenant_id` query parameter.
### Admin & Governance Endpoints
| Method | Endpoint | Description |
| --- | --- | --- |
| `GET` | `/admin/rules?detailed=true` | Get all rules (use `detailed=true` for regex/severity metadata) |
| `POST` | `/admin/rules?enhance=true` | Add single rule with optional `pattern` (regex), `severity`, `description`. Set `enhance=true` for LLM enhancement |
| `POST` | `/admin/rules/bulk?enhance=true` | Add multiple rules at once (processed in chunks of 5). LLM enhancement applied automatically |
| `POST` | `/admin/rules/upload-file?enhance=true` | Upload rules from file (TXT, PDF, DOC, DOCX). Text extracted server-side |
| `DELETE` | `/admin/rules/{rule}` | Delete a specific rule |
| `GET` | `/admin/violations?days=30&limit=50` | Get red-flag violations with timestamps and confidence scores |
| `GET` | `/admin/tools/logs?tool_name=rag&days=7` | Get detailed tool usage logs with latency and token counts |
| `GET/POST/DELETE` | `/admin/tenants` | Tenant management endpoints |
| `POST` | `/admin/setup/table` | Create admin_rules table in Supabase if it doesn't exist |
### Analytics Endpoints
| Method | Endpoint | Description |
| --- | --- | --- |
| `GET` | `/analytics/overview?days=30` | Comprehensive analytics: total queries, tool usage, red-flag count, RAG quality |
| `GET` | `/analytics/tool-usage?days=30` | Detailed tool usage stats: counts, latency, tokens, success/error rates |
| `GET` | `/analytics/redflags?limit=50&days=30` | Recent red-flag violations for tenant |
| `GET` | `/analytics/activity?days=30` | Tenant activity summary: queries, active users, last query timestamp |
| `GET` | `/analytics/rag-quality?days=30` | RAG quality metrics: avg hits, scores, latency (recall/precision indicators) |
### Visualization Features
IntegraChat includes three powerful visualization components that provide real-time insights into agent behavior and system activity:
#### 1. Real-Time Reasoning Visualizer
- **Location**: Debug tab (Gradio app)
- **Features**:
  - Step-by-step visualization of agent reasoning path
  - Animated progression through reasoning steps
  - Status indicators (pending, running, completed, error)
  - Detailed metrics per step (latency, hit counts, token estimates)
  - Visual icons for each step type (admin rules check, RAG prefetch, tool selection, etc.)
- **Data Source**: `reasoning_trace` from `/agent/message` or `/agent/debug` endpoints
- **Usage**: Automatically appears in chat panel when agent responses include reasoning traces
#### 2. Tool Invocation Timeline
- **Location**: Debug tab (Gradio app)
- **Features**:
  - Visual timeline showing tool execution order
  - Color-coded bars indicating tool status (success/error)
  - Latency visualization per tool
  - Result count badges
  - Summary statistics (total tools, total time, average latency)
- **Data Source**: `tool_traces` from `/agent/message` or `/agent/debug` endpoints
- **Usage**: Automatically appears in chat panel when agent responses include tool traces
#### 3. Live Tenant Heatmap
- **Location**: Analytics page (`/analytics`)
- **Features**:
  - Query activity heatmap (hour-by-hour, day-by-day visualization)
  - Color intensity based on activity level
  - Per-tool usage trends with bar charts
  - Trend indicators (up/down/stable)
  - Configurable time window (default: 7 days)
- **Data Source**: `/analytics/activity` and `/analytics/tool-usage` endpoints
- **Usage**: Navigate to Analytics page to view tenant activity patterns
**Access**: All visualization features are available to all roles (viewer, editor, admin, owner).
### Request Headers
Most endpoints require:
- `x-tenant-id`: Tenant identifier for multi-tenant isolation
- `x-user-role`: Caller role for RBAC enforcement (`viewer`, `editor`, `admin`, or `owner`)
  - **Important**: The role must be passed through the entire pipeline (UI → API → RAG Client → MCP Server)
  - The backend API forwards the role from the incoming request to the RAG client, which passes it to the MCP server for permission checks
  - If ingestion fails with permission errors, verify the role is set correctly in the UI and check backend logs for role propagation debug messages
- `Content-Type: application/json`: For POST requests with JSON payloads
### Example Request
```bash
curl -X POST http://localhost:8000/agent/message \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: tenant123" \
  -d '{
    "message": "What is our refund policy?",
    "tenant_id": "tenant123"
  }'
```
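
The same request can be built from Python using only the standard library. This is a sketch: `build_agent_request` is a hypothetical helper, and the shape of the JSON response is not documented here, so the commented-out call simply prints whatever comes back.

```python
import json
import urllib.request


def build_agent_request(tenant_id, message, role="viewer",
                        base_url="http://localhost:8000"):
    """Build a POST to /agent/message carrying the tenant and role headers."""
    body = json.dumps({"tenant_id": tenant_id, "message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/agent/message",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-tenant-id": tenant_id,
            "x-user-role": role,
        },
    )


# With the backend running, send the request like this:
# request = build_agent_request("tenant123", "What is our refund policy?")
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```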
---
## Architecture
### System Overview
IntegraChat follows a modular architecture with clear separation of concerns:
```
┌─────────────────┐
│   Frontend UI   │ (Gradio)
│   Port 7860     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ FastAPI Backend │ (API Gateway)
│   Port 8000     │
└────────┬────────┘
         │
         ├──► Unified MCP Server (Port 8900)
         │    ├── RAG Tools (search, ingest, list, delete)
         │    ├── Web Tools (search)
         │    └── Admin Tools (rules, violations)
         │
         ├──► PostgreSQL/Supabase (RAG Vector Store)
         ├──► Supabase/SQLite (Rules & Analytics)
         └──► LLM Backend (Ollama/Groq)
```
### Enterprise-Grade Features
1. **Autonomous Multi-Step Planning**: LLM-powered planning determines optimal tool sequences with short-term conversation memory that stores and injects previous tool outputs into subsequent tool calls for better context awareness.
2. **Regex-Based Governance**: Admin rules support regex patterns with fallback to keyword matching and semantic similarity scoring for flexible policy enforcement.
3. **Comprehensive Analytics**: All tool usage, RAG searches, LLM calls, and red-flag violations are logged with indexed queries for fast analytics retrieval.
4. **Enhanced RAG Pipeline**: Documents chunked optimally (400-600 tokens) and enriched with metadata (source URL, timestamp, document type) for better retrieval.
5. **Structured Error Handling**: All errors logged with context, with graceful fallbacks (e.g., RAG fails → LLM-only, web fails → skip web).
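
The graceful-fallback behavior in item 5 can be illustrated with a small orchestration sketch. The function and parameter names are hypothetical; the tools are passed in as plain callables so the example stays self-contained.

```python
def answer_with_fallbacks(query, rag_search, web_search, llm_generate, log=print):
    """Gather context from RAG and web, tolerating failures of either tool.

    If RAG fails, the answer proceeds without retrieved documents; if web
    search fails, it is skipped. With no context at all, this degrades to
    an LLM-only answer.
    """
    context = []
    for name, tool in (("rag", rag_search), ("web", web_search)):
        try:
            context.extend(tool(query))
        except Exception as exc:  # log with context, then continue
            log(f"{name} failed, continuing without it: {exc}")
    return llm_generate(query, context)
```

Catching failures per tool rather than around the whole pipeline is what lets one broken dependency reduce answer quality instead of failing the request.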
### Data Storage Architecture
IntegraChat uses **dual-backend storage** with automatic fallback for production flexibility:
#### Supabase (Production/Preferred)
**When to use:** Production deployments, multi-user environments, scalable applications
**Storage:**
- `admin_rules` - Admin rules with regex patterns and severity levels
- `tool_usage_events` - Tool invocation logs with latency and token tracking
- `redflag_violations` - Red-flag violation events with timestamps
- `rag_search_events` - RAG search metrics and quality indicators
- `agent_query_events` - Agent query logs and analytics
**Features:**
- Row Level Security (RLS) for multi-tenant isolation
- Automatic backups and scaling
- Real-time capabilities
- Production-ready infrastructure
**Setup:** Configure `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` in `.env`
#### SQLite (Development Fallback)
**When to use:** Local development, testing, single-user scenarios
**Storage:**
- `data/admin_rules.db` - Admin rules (local file)
- `data/analytics.db` - Analytics events (local file)
**Features:**
- Zero configuration required
- Perfect for local development
- Automatic fallback when Supabase not configured
**Migration:** To migrate existing SQLite data to Supabase, refer to Supabase documentation for data migration strategies.
---
## Supabase Setup & Migration
IntegraChat supports Supabase for production-ready storage of admin rules and analytics. Both `RulesStore` and `AnalyticsStore` automatically detect and use Supabase when credentials are available, falling back to SQLite for local development.
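
The detection logic amounts to checking whether both credentials are present. The sketch below is an assumption about how such a check could look, not the code in `RulesStore` or `AnalyticsStore`; `select_storage_backend` is a hypothetical name.

```python
import os


def select_storage_backend(env=None):
    """Return "supabase" when both credentials are set, else "sqlite"."""
    env = os.environ if env is None else env
    url = (env.get("SUPABASE_URL") or "").strip()
    key = (env.get("SUPABASE_SERVICE_KEY") or "").strip()
    return "supabase" if url and key else "sqlite"


print(f"Storage backend: {select_storage_backend()}")
```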
### Quick Setup
1. **Create Supabase tables**:
- Run `supabase_admin_rules_table.sql` in Supabase SQL Editor
- Run `supabase_analytics_tables.sql` in Supabase SQL Editor
2. **Configure environment variables** in `.env`:
```env
SUPABASE_URL=https://your-project-id.supabase.co
SUPABASE_SERVICE_KEY=your_service_role_key_here
```
3. **Verify setup**: Check that your Supabase project is accessible and tables are created correctly.
---
## Troubleshooting
### Common Issues
#### Backend Not Starting
- **Issue**: FastAPI backend fails to start
- **Solution**:
  - Check if port 8000 is already in use: `netstat -ano | findstr :8000` (Windows) or `lsof -i :8000` (Linux/Mac)
  - Verify Python virtual environment is activated
  - Check `.env` file exists and has required variables
  - Review error logs for missing dependencies
#### MCP Server Connection Errors
- **Issue**: "Could not connect to MCP server" errors
- **Solution**:
  - Ensure unified MCP server is running: `python backend/mcp_server/server.py`
  - Check MCP server is on port 8900 (default)
  - Verify `MCP_SERVER_ID` in `.env` matches server configuration
  - Check firewall settings if running on different machines
#### RAG Search Not Returning Results
- **Issue**: RAG searches return no results despite ingested documents
- **Solution**:
  - Check similarity threshold (default 0.3) - try lowering to 0.2 or 0.1
  - Verify documents exist: `GET /rag/list?tenant_id={id}`
  - Ensure tenant_id matches between ingestion and search
  - Check PostgreSQL/pgvector connection and vector extension
  - Review MCP server logs for search metrics
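
To see why lowering the similarity threshold helps, consider this small sketch. The hit format (`id` and `score` keys) and the `filter_hits` helper are assumptions made for illustration.

```python
def filter_hits(hits, threshold=0.3):
    """Keep hits at or above the similarity threshold, best first."""
    kept = [hit for hit in hits if hit["score"] >= threshold]
    return sorted(kept, key=lambda hit: hit["score"], reverse=True)


hits = [
    {"id": 1, "score": 0.42},
    {"id": 2, "score": 0.25},
    {"id": 3, "score": 0.12},
]
for threshold in (0.3, 0.2, 0.1):
    ids = [hit["id"] for hit in filter_hits(hits, threshold)]
    print(f"threshold={threshold}: documents {ids}")
```

A lower threshold trades precision for recall, so it is worth checking the returned scores before settling on a value.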
#### Supabase Configuration Issues
- **Issue**: Data still going to SQLite instead of Supabase
- **Solution**:
  - Verify `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` in `.env` (no quotes, no spaces)
  - Use **service_role** key (not anon key) from Supabase Dashboard
  - Ensure tables exist: run SQL scripts in Supabase SQL Editor
  - Check FastAPI startup logs for backend detection messages
#### LLM Connection Errors
- **Issue**: Agent responses fail with LLM errors
- **Solution**:
  - For Ollama: Ensure Ollama is running (`ollama serve`)
  - Check `OLLAMA_URL` and `OLLAMA_MODEL` in `.env`
  - For Groq: Verify `GROQ_API_KEY` is set correctly
  - Check `LLM_BACKEND` setting (ollama or groq)
  - Test LLM connection: `curl http://localhost:11434/api/tags` (Ollama)
#### Document Ingestion Failures
- **Issue**: File uploads or document ingestion fails
- **Solution**:
- Check file size limits (default may be 10MB)
- Verify file format is supported (PDF, DOCX, TXT, MD)
- Ensure tenant_id is provided in request
- **Check user role**: Ingestion requires `editor`, `admin`, or `owner` role. If you see "Permission Denied (403)", change your role in the UI dropdown (top right) from "viewer" to "editor", "admin", or "owner"
- Verify `x-user-role` header is being sent correctly (check backend logs for debug messages)
- Check backend logs for specific error messages
- Verify PostgreSQL connection for RAG storage
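
To make the role requirement concrete, here is a minimal sketch of the kind of check that produces the 403. The role names and the `x-user-role` header come from this README; the helper names and the trimming/lowercasing of the role are assumptions for illustration, not the actual backend code.

```python
from typing import Dict, Optional

# Roles this README lists as allowed to ingest documents
INGEST_ROLES = {"editor", "admin", "owner"}


def can_ingest(role: Optional[str]) -> bool:
    """Return True if the given role may ingest documents."""
    # Assumption: roles are compared case-insensitively after trimming
    return isinstance(role, str) and role.strip().lower() in INGEST_ROLES


def build_headers(role: str) -> Dict[str, str]:
    """Build the request header that carries the role to the backend."""
    return {"x-user-role": role.strip().lower()}


if __name__ == "__main__":
    for role in ("viewer", "editor", "admin", "owner"):
        print(role, "->", "allowed" if can_ingest(role) else "403 Permission Denied")
```
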
#### Document Display Issues
- **Issue**: Document list shows `[object Object]` instead of document details
- **Solution**: This has been fixed. Documents now display properly with:
- Document ID (number)
- Document Type (text, pdf, faq, link)
- Preview (first 200 characters)
- Length (character count)
- Created date
- **If still seeing issues**: Refresh the Knowledge Base Library tab
#### Rule Addition Timeouts
- **Issue**: "Chunk 1/1 timed out after 45s" when adding rules
- **Solution**:
- **Quick Fix**: Uncheck the "Enable LLM Enhancement" checkbox before adding rules - rules will be added immediately without LLM processing
- **With Enhancement**: Keep checkbox checked but be patient - enhancement can take up to 180s for a chunk of 5 rules (about 36s per rule)
- **Best Practice**: Add rules in smaller batches (1-3 rules at a time) when using enhancement
- **Note**: Enhancement is optional - you can always add rules quickly without it, then enhance them later if needed
#### Rule Deletion Issues
- **Issue**: "404 Not Found" when trying to delete a rule
- **Solution**: You can now delete rules in two ways:
- **By Number**: Enter the rule number (e.g., "1", "2", "3") as shown in the rules table
- **By Text**: Enter the exact rule text as displayed in the rules table
- **If rule not found**: Make sure you're entering the exact text or a valid rule number. Refresh the rules table to see current rules.
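
The two lookup modes can be sketched as follows. This is an illustration under stated assumptions: `resolve_rule` is a hypothetical helper, and rule numbers are assumed to be 1-based as displayed in the rules table.

```python
from typing import List, Optional


def resolve_rule(user_input: str, rules: List[str]) -> Optional[str]:
    """Map a rule number (1-based) or exact rule text to a rule, else None."""
    text = user_input.strip()
    if text.isascii() and text.isdigit():
        index = int(text) - 1
        return rules[index] if 0 <= index < len(rules) else None
    # Otherwise require an exact text match
    return text if text in rules else None


if __name__ == "__main__":
    rules = ["Block password sharing", "Block SSN disclosure"]
    print(resolve_rule("2", rules))
    print(resolve_rule("Block password sharing", rules))
    print(resolve_rule("7", rules))  # None would correspond to a 404
```
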
#### Tenant Isolation Issues
- **Issue**: Documents or data leaking between tenants
- **Solution**:
- Check database queries include `WHERE tenant_id = ...` filters
- Verify tenant ID normalization is working correctly
- Review database logs for queries that are missing the tenant filter
### Getting Help
1. **Check Logs**: Review FastAPI and MCP server logs for detailed error messages
2. **Run Diagnostics**: Use helper scripts in the Testing & Diagnostics section
3. **Verify Configuration**: Check `.env` file and Supabase connection
4. **Review Documentation**: See `backend/README.md` for backend-specific issues
---
## Testing & Diagnostics
You can test the system by:
- **API Testing**: Use the FastAPI interactive docs at `http://localhost:8000/docs` to test endpoints
- **Database Inspection**: Connect directly to your PostgreSQL/Supabase instance to verify tenant isolation
- **Log Monitoring**: Check FastAPI and MCP server logs for detailed error messages and debugging information
The repository also includes helper scripts for deeper checks. The first is an end-to-end verification run:
- **Prerequisites:** FastAPI backend plus all MCP servers (RAG/Web/Admin) running locally.
- **What it checks:**
1. Direct database writes via the analytics and rules stores
2. CRUD over the `/admin/*` and `/analytics/*` endpoints
3. RAG ingestion and isolation by issuing queries as multiple tenants and ensuring secrets never leak across IDs
- **Pass criteria:** At least 80% of the sub-tests succeed (the RAG isolation test must pass for overall success).
- `python check_rag_database.py`
Provides a low-level inspection of the RAG datastore. It connects straight to the pgvector/Postgres instance, lists all tenant IDs, prints sample chunks, and runs `search_vectors()` directly to ensure the SQL `WHERE tenant_id = …` filter is behaving as expected. Use this script when diagnosing suspected cross-tenant leakage or when seeding demo data.
- `python verify_supabase_setup.py`
Verifies Supabase configuration and shows which backend (Supabase or SQLite) each store is using. Displays any missing configuration and provides a summary of where data will be saved.
- `python check_supabase_rules.py`
Checks Supabase admin rules configuration and RLS policies. Validates that rules can be read/written correctly.
- `python migrate_sqlite_to_supabase.py`
One-shot migration script that copies existing SQLite data (admin rules + analytics) to Supabase. Supports both PostgreSQL direct connection and Supabase REST API methods.
- `python test_manual.py`
The existing manual test runner remains useful for smoke-testing analytics logging, admin rule CRUD, and API response codes. Run it whenever you adjust schemas or update MCP endpoints.
> **Tip:** Ensure the Python virtual environment is active (`source venv/bin/activate` or `.\venv\Scripts\activate`) and that `.env` contains the MCP server URLs/LLM settings.
---
## Demo Video
🎥 **[Demo Video Placeholder]** - Coming soon!
Watch how IntegraChat uses MCP to power autonomous agents with multi-tool selection, RAG retrieval, and enterprise governance.
---
## Social Media
📱 **[Social Media Post Placeholder]** - Coming soon!
Follow us for updates and demos of IntegraChat in action!
---
## Team Member(s)
- **Your Name Here** - Developer & MCP Enthusiast
---
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
---
## Technical Stack
### Backend
- **Framework**: FastAPI with async/await for high-performance MCP orchestration
- **MCP Server**: Unified MCP server (port 8900) exposing all tools via namespaces
- **API**: RESTful API with Server-Sent Events (SSE) for streaming responses
- **LLM Integration**:
- Ollama (local, default) - `http://localhost:11434`
- Groq (cloud) - via API key
- Configurable backend with streaming support
### Frontend
- **Gradio UI**: Full-featured interface with Plotly visualizations (`app.py`)
- **UI Libraries**:
- Plotly for interactive charts and visualizations
### Data Storage
- **RAG Vector Store**: PostgreSQL with pgvector extension (via Supabase or direct connection)
- **Analytics**: Supabase (production) or SQLite (development) with indexed queries
- **Rules Storage**: Supabase (production) or SQLite (development) with automatic fallback
- **Database**: PostgreSQL for RAG embeddings, Supabase/SQLite for analytics and rules
### File Processing
- **Supported Formats**: TXT, PDF, DOC, DOCX, Markdown
- **Libraries**: PyPDF2, python-docx for server-side text extraction
- **Metadata**: Rich metadata support (source URL, timestamp, document type)
### Communication
- **Streaming**: Server-Sent Events (SSE) for real-time word-by-word response streaming
- **Protocol**: Model Context Protocol (MCP) for tool communication
- **HTTP**: RESTful endpoints with JSON payloads
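
As an illustration of word-by-word SSE streaming, the sketch below formats each word as one `data:` event. The generator name and the final `done` event are assumptions for the example, not IntegraChat's actual wire format.

```python
from typing import Iterator


def sse_word_stream(text: str) -> Iterator[str]:
    """Yield one Server-Sent Event per word, then a final 'done' event."""
    for word in text.split():
        # Each SSE message is a 'data:' line terminated by a blank line
        yield f"data: {word}\n\n"
    # Assumed end-of-stream marker so clients know when to stop listening
    yield "event: done\ndata: [DONE]\n\n"


if __name__ == "__main__":
    for event in sse_word_stream("Hello from IntegraChat"):
        print(repr(event))
```

In a FastAPI route, a generator like this could be wrapped in `StreamingResponse(..., media_type="text/event-stream")`.
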
## Recent Enhancements
### UI & UX Improvements (Latest)
- **Document Display Fix**: Fixed document list showing `[object Object]` - now properly displays document ID, type, preview, length, and creation date in a formatted table
- **Rule Deletion Enhancement**: Can now delete rules by entering either:
- Rule number (e.g., "1", "2", "3") - automatically finds the corresponding rule
- Full rule text - deletes the exact matching rule
- **LLM Enhancement Toggle**: Added checkbox to enable/disable LLM enhancement when adding rules:
- **Quick Add**: Uncheck to add rules immediately without LLM processing (no timeout issues)
- **Enhanced Add**: Check to get better patterns, explanations, and examples (takes longer but higher quality)
- **Improved Timeouts**: Increased timeout for rule enhancement from 45s to 180s to handle multiple rules properly
- **Better Error Messages**: Clearer error messages for rule deletion, document operations, and permission errors
### Role Propagation & Permission Handling (Latest)
- **Fixed Role Propagation**: User role (`viewer`, `editor`, `admin`, `owner`) is now properly passed through the entire ingestion pipeline:
- UI sends role in `x-user-role` header
- Backend API route receives and validates role
- Role is passed to `process_ingestion()` service
- RAG client includes role in payload to MCP server
- MCP server uses role for permission checks
- **Improved Error Handling**: Permission errors (403 Forbidden) now return clear, actionable error messages:
- Clear indication when role lacks required permissions
- Guidance on which roles can perform specific actions
- Instructions to change role in UI dropdown
- **Debug Logging**: Added comprehensive debug logging to trace role values through the pipeline for troubleshooting
- **Admin Question Handling**: Fixed "who is the admin" type questions to use RAG from knowledge base instead of generic LLM responses
### Admin Rules System (Latest)
- **File Upload Support**: Upload rules from TXT, PDF, DOC, DOCX files with drag-and-drop interface
- **LLM Enhancement Toggle**: Optional LLM enhancement with checkbox control:
- **Quick Add Mode**: Uncheck to add rules immediately without LLM processing (no timeouts)
- **Enhanced Mode**: Check to get better patterns, explanations, examples, and edge case detection
- **LLM Enhancement**: When enabled, automatic rule enhancement identifies edge cases, improves regex patterns, and suggests severity levels
- **Intelligent Fallback Explanations**: When LLM enhancement times out or fails, the system automatically generates basic explanations using keyword extraction, providing useful examples and pattern suggestions without requiring LLM availability
- **Chunk Processing**: Large rule sets processed in chunks of 5 to prevent timeouts (handles 100+ rules efficiently)
- **Enhanced Timeouts**: Increased timeout from 45s to 180s per chunk to accommodate LLM processing
- **Flexible Rule Deletion**: Delete rules by entering either rule number (e.g., "1") or full rule text
- **Comment Filtering**: Comment lines (starting with #) automatically ignored when uploading rules
- **Rule-First Processing**: Admin rules checked before intent classification - enables behavior control (brief responses vs blocking)
- **Supabase Integration**: Production-ready Supabase support with automatic table creation
- **Streaming Responses**: Word-by-word streaming for chat responses using Server-Sent Events (SSE)
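
Comment filtering and chunking as described above can be sketched in a few lines. The function names are hypothetical; only the `#` comment rule and the chunk size of 5 come from this README.

```python
from typing import Iterator, List


def parse_rule_lines(text: str) -> List[str]:
    """Return non-empty rule lines, ignoring lines that start with '#'."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            rules.append(line)
    return rules


def chunked(items: List[str], size: int = 5) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    uploaded = "# security rules\nNo passwords in chat\n\nNo SSNs in chat\n"
    for number, chunk in enumerate(chunked(parse_rule_lines(uploaded)), start=1):
        print(f"chunk {number}: {chunk}")
```
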
### Conversation Memory System (Latest)
- **Short-Term Memory**: Automatic storage of tool outputs per session with configurable size limits and TTL
- **Session-Based Isolation**: Memory keyed by session_id (not tenant_id) for safety
- **Automatic Injection**: Recent memory automatically injected into tool payloads for multi-step workflows
- **Auto-Expiration**: Memory entries expire after configurable TTL (default: 15 minutes)
- **Session Management**: Memory can be explicitly cleared via `end_session` flag
- **Comprehensive Testing**: Full test suite covering memory storage, retrieval, expiration, and multi-step workflows
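
The behaviour described above (per-session storage, a size limit, TTL expiry, explicit clearing) can be sketched as a small class. This is not the project's implementation: `SessionMemory`, the default of 10 entries, and the injectable clock are assumptions made for the example; the 15-minute TTL is the default stated above.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class SessionMemory:
    """Short-term memory keyed by session_id, with a size limit and TTL."""

    def __init__(self, ttl_seconds: float = 15 * 60, max_entries: int = 10,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.max_entries = max_entries
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Deque[Tuple[float, dict]]] = {}

    def add(self, session_id: str, tool_output: dict) -> None:
        entries = self._store.setdefault(session_id, deque(maxlen=self.max_entries))
        entries.append((self.clock(), tool_output))  # oldest entry drops when full

    def recent(self, session_id: str) -> List[dict]:
        cutoff = self.clock() - self.ttl
        entries = self._store.get(session_id, deque())
        while entries and entries[0][0] < cutoff:
            entries.popleft()  # discard expired entries, oldest first
        return [output for _, output in entries]

    def end_session(self, session_id: str) -> None:
        self._store.pop(session_id, None)
```
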
### AI-Generated KB Metadata & Advanced RAG (Latest)
- **Automatic Metadata Extraction**: When ingesting documents, system auto-extracts:
- **Title**: From filename, URL, or content structure (with intelligent fallback)
- **Summary**: 2-3 sentence summary via LLM (with keyword-based fallback)
- **Tags**: 5-8 relevant tags extracted from content
- **Topics**: 3-5 main themes identified via LLM
- **Date Detection**: Multiple date formats automatically detected
- **Quality Score**: 0.0-1.0 score based on structure and completeness
- **Intelligent Fallback**: When LLM is unavailable or times out, uses keyword extraction and pattern matching to provide useful metadata
- **Database Integration**: Metadata stored in JSONB column for flexible querying and enhanced RAG search
- **Migration Script**: Safe, idempotent database migration script included
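
A keyword-based tag fallback of the kind mentioned above might look like the following. This is only a sketch: the stopword list, the token pattern, and the name `fallback_tags` are assumptions, and the project's actual extraction logic may differ.

```python
import re
from collections import Counter
from typing import List

# Small illustrative stopword list; a real one would be larger
STOPWORDS = {"the", "and", "for", "are", "with", "this", "that", "from", "not", "you"}


def fallback_tags(text: str, max_tags: int = 8) -> List[str]:
    """Return the most frequent non-stopword terms (3+ characters) as tags."""
    words = re.findall(r"[a-z][a-z0-9-]{2,}", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [word for word, _ in counts.most_common(max_tags)]


if __name__ == "__main__":
    sample = ("Vacation policy: vacation requests need manager approval. "
              "Policy updates are posted by the HR team.")
    print(fallback_tags(sample))
```
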
### Per-Tool Latency Prediction & Context-Aware Routing (Latest)
- **Latency Prediction**: Agent estimates expected latency before tool selection:
- RAG: 60-120ms (depends on result count)
- Web: 400-1800ms (network-dependent)
- Admin: <20ms (local regex matching)
- LLM: Variable based on model and token count
- **Path Optimization**: Agent chooses fastest tool sequence based on latency estimates
- **Context-Aware Routing**: Intelligent tool skipping based on previous outputs:
- High RAG score (≥0.8) → Skip web search
- Critical admin violation → Skip agent reasoning, immediate block
- Relevant memory available → Skip RAG, use memory instead
- **Routing Hints**: Context hints included in reasoning trace for transparency
- **Performance Impact**: Skipping redundant tool calls shortens the tool sequence and lowers end-to-end latency for queries that memory or RAG can already answer
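
The routing rules above can be expressed as a small planning function. This is a sketch, not the agent's actual code: `plan_tools`, `worst_case_ms`, and the "cheapest first" ordering policy are assumptions; the latency bounds and the 0.8 score cut-off are the figures listed above.

```python
from typing import List, Optional

# Upper-bound latency estimates in ms, taken from the ranges listed above
WORST_CASE_MS = {"admin": 20, "rag": 120, "web": 1800}


def plan_tools(candidates: List[str], rag_top_score: Optional[float] = None,
               critical_violation: bool = False, memory_hit: bool = False) -> List[str]:
    """Return the tools still worth calling, lowest worst-case latency first."""
    if critical_violation:
        return []  # immediate block, no further tool calls
    plan = list(candidates)
    if memory_hit and "rag" in plan:
        plan.remove("rag")  # relevant memory replaces the RAG lookup
    if rag_top_score is not None and rag_top_score >= 0.8 and "web" in plan:
        plan.remove("web")  # a strong RAG hit makes web search unnecessary
    # Assumed ordering policy: cheapest tool first; unknown tools go last
    return sorted(plan, key=lambda tool: WORST_CASE_MS.get(tool, float("inf")))


def worst_case_ms(plan: List[str]) -> int:
    """Sum the upper-bound latency of every tool in the plan."""
    return sum(WORST_CASE_MS[tool] for tool in plan)
```
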
### Tool Output Schemas (Latest)
- **Strict JSON Schemas**: Every tool returns validated JSON with consistent structure:
- **RAG**: `{results: [...], top_score: float, latency_ms: int}`
- **Web**: `{results: [...], latency_ms: int}`
- **Admin**: `{violations: [...], severity: str, latency_ms: int}`
- **LLM**: `{text: str, tokens_used: int, latency_ms: int}`
- **Automatic Validation**: All tool outputs validated and formatted before use
- **Easier Debugging**: Consistent structure makes debugging and monitoring simpler
- **Consistent Responses**: Schema-validated outputs give response formatting a predictable structure to work from
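
A lightweight way to enforce the shapes above is a field-and-type check per tool. The sketch below is illustrative only: `TOOL_SCHEMAS` mirrors the field lists above, while `validate_tool_output` and its error strings are hypothetical.

```python
from typing import Any, Dict, List

TOOL_SCHEMAS: Dict[str, Dict[str, type]] = {
    "rag": {"results": list, "top_score": float, "latency_ms": int},
    "web": {"results": list, "latency_ms": int},
    "admin": {"violations": list, "severity": str, "latency_ms": int},
    "llm": {"text": str, "tokens_used": int, "latency_ms": int},
}


def _matches(value: Any, expected: type) -> bool:
    if isinstance(value, bool):
        return expected is bool  # bool subclasses int, so reject it for numbers
    if expected is float:
        return isinstance(value, (int, float))  # accept 1 as well as 1.0
    return isinstance(value, expected)


def validate_tool_output(tool: str, payload: Dict[str, Any]) -> List[str]:
    """Return a list of schema errors for one tool output (empty if valid)."""
    errors = []
    for field, expected in TOOL_SCHEMAS[tool].items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not _matches(payload[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return errors
```
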
### Cross-Encoder Re-ranking (Latest)
- **Two-Stage RAG Process**:
- Initial vector search retrieves candidates
- Cross-encoder re-ranks top 10 results for accuracy
- Final filtering by threshold and limit
- **Model**: Uses `cross-encoder/ms-marco-MiniLM-L-6-v2` (very fast, production-ready)
- **Accuracy Improvement**: The cross-encoder scores each query-document pair jointly, which improves the relevance ordering of the retrieved candidates
- **Seamless Integration**: Works transparently with existing RAG search API
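
The second stage can be sketched with a pluggable scoring function so the example runs without model downloads. Everything here is illustrative: `rerank` and `overlap_score` are hypothetical names, the toy scorer is a word-overlap stand-in, and the default threshold and limit are assumptions.

```python
from typing import Callable, List, Sequence, Tuple

ScoreFn = Callable[[List[Tuple[str, str]]], Sequence[float]]


def overlap_score(pairs: List[Tuple[str, str]]) -> List[float]:
    """Toy scorer: fraction of query words that also appear in the document."""
    scores = []
    for query, doc in pairs:
        q, d = set(query.lower().split()), set(doc.lower().split())
        scores.append(len(q & d) / len(q) if q else 0.0)
    return scores


def rerank(query: str, candidates: List[str], score_fn: ScoreFn, top_n: int = 10,
           threshold: float = 0.0, limit: int = 3) -> List[Tuple[str, float]]:
    """Re-score the first top_n candidates, then filter by threshold and limit."""
    head = candidates[:top_n]
    if not head:
        return []
    scores = score_fn([(query, doc) for doc in head])
    ranked = sorted(zip(head, scores), key=lambda pair: pair[1], reverse=True)
    return [(doc, float(s)) for doc, s in ranked if s >= threshold][:limit]
```

With `sentence-transformers` installed, `CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2").predict` could be passed as `score_fn` in place of the toy scorer; note that its raw scores are not confined to the 0-1 range, so the threshold would need adjusting.
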
### Context Engineering (Latest)
- **Anthropic-Inspired Strategies**: Implements best practices from Anthropic's context engineering research:
- **Compaction**: High-fidelity summarization preserving architectural decisions, unresolved issues, and implementation details
- **Tool Result Clearing**: Safest form of compaction - removes large tool outputs once processed
- **Structured Note-Taking**: Tracks objectives (like Claude playing Pokémon), architectural decisions, and unresolved issues
- **XML-Structured Prompts**: All prompts use clear XML sections (`<system>`, `<background_information>`, `<instructions>`) for better model understanding
- **Automatic Compression**: Conversations compressed at 80% token threshold, targeting 60% after compression
- **Just-in-Time Context**: Selects only relevant memories and tools for each query
- **Progressive Disclosure**: Agents discover context incrementally through exploration
- **Benefits**:
- Reduced token usage and costs
- Longer conversation support
- Better agent coherence across extended interactions
- Improved performance through structured context
- **Documentation**: Context engineering features are integrated throughout the agent orchestrator and MCP server
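
The 80% trigger and 60% target can be illustrated with a simple compaction loop. This is a sketch under loud assumptions: tokens are approximated by whitespace-separated words, the summary is a placeholder string rather than an LLM summary, and the summary's own tokens are not counted against the target.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    return len(text.split())  # crude stand-in for a real tokenizer


def default_summary(dropped: List[str]) -> str:
    return f"[summary of {len(dropped)} earlier messages]"  # placeholder, not an LLM call


def compact(messages: List[str], budget: int, trigger: float = 0.8, target: float = 0.6,
            summarize: Callable[[List[str]], str] = default_summary) -> List[str]:
    """Fold the oldest messages into a summary once usage passes trigger * budget."""
    total = sum(count_tokens(m) for m in messages)
    if total <= trigger * budget:
        return list(messages)
    kept, dropped = list(messages), []
    # Remove oldest messages until the remainder fits target * budget
    while len(kept) > 1 and total > target * budget:
        oldest = kept.pop(0)
        dropped.append(oldest)
        total -= count_tokens(oldest)
    return ([summarize(dropped)] if dropped else []) + kept
```
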
### UI Improvements
- **Modern Drag-and-Drop**: Intuitive file upload with visual feedback
- **Enhanced Status Messages**: Clear success/error messages with icons
- **Refresh Button in Table**: Quick refresh directly from the Rule Set section
- **Better Visual Hierarchy**: Improved spacing, colors, and layout
- **Gradio UI Enhancements**:
- AI metadata displayed after document ingestion
- Latency predictions shown in reasoning trace
- Context-aware routing hints visualized
- Tool output schemas displayed in debug view
## Key Technical Features
### Tenant Isolation & Normalization
- **Strict tenant isolation** enforced at database level with `WHERE tenant_id = ...` filters
- **Automatic tenant ID normalization** handles whitespace and formatting differences
- Documents can be listed and deleted consistently across different tenant_id formats
- All operations validate tenant ownership before execution
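
The combination of normalization and a tenant filter can be sketched as below. The example uses an in-memory SQLite table purely so it is self-contained (the real RAG store is PostgreSQL/pgvector), and the normalization rule (trim plus lowercase), table layout, and function names are all assumptions.

```python
import sqlite3
from typing import List, Tuple


def normalize_tenant_id(raw: str) -> str:
    # Assumed normalization: trim whitespace and lowercase
    return raw.strip().lower()


def list_documents(conn: sqlite3.Connection, tenant_id: str) -> List[Tuple[int, str]]:
    """Return only the rows that belong to the normalized tenant."""
    cursor = conn.execute(
        "SELECT id, content FROM documents WHERE tenant_id = ? ORDER BY id",
        (normalize_tenant_id(tenant_id),),  # parameterized, never string-formatted
    )
    return cursor.fetchall()


def make_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, tenant_id TEXT, content TEXT)")
    rows = [(" Acme ", "acme handbook"), ("globex", "globex secrets")]
    conn.executemany(
        "INSERT INTO documents (tenant_id, content) VALUES (?, ?)",
        [(normalize_tenant_id(tenant), content) for tenant, content in rows],
    )
    return conn
```
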
### RAG Search & Retrieval
- **Cross-Encoder Re-ranking**: Two-stage retrieval process for more accurate ranking:
- First: Vector search retrieves top candidates using embeddings
- Then: Cross-encoder model (`cross-encoder/ms-marco-MiniLM-L-6-v2`) re-ranks top 10 results
- Finally: Results are filtered by the similarity threshold and truncated to the requested limit
- **Optimized similarity threshold** (default 0.3) for better recall of relevant documents
- **Intelligent fallback** returns top result even if below threshold to ensure knowledge base content is accessible
- **Pattern-based tool selection** automatically triggers RAG for admin questions, fact lookups, and internal knowledge queries
- **Response unwrapping** ensures seamless integration between MCP server and orchestrator
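
The threshold-plus-fallback behaviour can be sketched as follows. The 0.3 default comes from this README; the `score` key, the default limit of 5, and the function name are assumptions for illustration.

```python
from typing import Dict, List


def filter_results(results: List[Dict], threshold: float = 0.3, limit: int = 5) -> List[Dict]:
    """Keep results at or above the threshold; fall back to the single best hit."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    passing = [r for r in ranked if r["score"] >= threshold][:limit]
    if not passing and ranked:
        return ranked[:1]  # fallback: return the top result even if below threshold
    return passing


if __name__ == "__main__":
    hits = [{"id": 1, "score": 0.25}, {"id": 2, "score": 0.10}]
    print(filter_results(hits))
```
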
### MCP Server Architecture
- **Unified server** running on a single port (default 8900) for all namespaced tools
- **Dual protocol support**: Both MCP protocol (POST with JSON) and RESTful HTTP (GET/DELETE)
- **Response wrapping**: Standardized response format with automatic unwrapping in clients
- **Error handling**: Comprehensive error responses with detailed messages for debugging
## UI Features
### Knowledge Base Library
- **Visual Statistics**: Real-time document counts and type distribution
- **Interactive Charts**: Plotly pie charts for document type visualization
- **Advanced Search**: Semantic search across all ingested documents with relevance scoring
- **Smart Filtering**: Filter by document type (text, PDF, FAQ, link)
- **Bulk Operations**: Delete individual documents or all documents at once
- **Auto-refresh**: Lists automatically update after operations
### Admin Analytics Dashboard
- **Statistics Cards**: Key metrics displayed in visually appealing cards with icons
- **Tool Usage Visualization**: Bar charts showing tool invocation counts and performance
- **Latency Metrics**: Visual representation of tool response times
- **RAG Quality Analysis**: Charts displaying search quality metrics (hits, scores, recall)
- **Detailed Tables**: Comprehensive tool usage breakdown with success/error rates
- **Dark Theme**: Modern UI with dark background and white text for better readability
- **Real-time Updates**: Fetch latest analytics data with a single click
## Acknowledgments
- Built with [Model Context Protocol (MCP)](https://modelcontextprotocol.io/)
- Powered by [Gradio](https://gradio.app/) for the interface
- Visualizations created with [Plotly](https://plotly.com/python/)
- Backend built with [FastAPI](https://fastapi.tiangolo.com/)
- Analytics and governance features inspired by enterprise AI platform requirements
---
<div align="center">
**Made with ❤️ for the MCP Hackathon**
**IntegraChat: Enterprise-Grade MCP Autonomous Agent Platform**
[⬆ Back to Top](#integrachat--enterprise-mcp-autonomous-agent-platform)
</div>