IntegraChat — Enterprise MCP Autonomous Agent Platform

Track: MCP in Action
Category: Enterprise
Tag: mcp-in-action-track-enterprise

Overview

IntegraChat is an enterprise-grade, multi-tenant AI platform that demonstrates the full capabilities of the Model Context Protocol (MCP) in a production-style environment. Built with enterprise governance and observability in mind, IntegraChat combines autonomous tool-using agents, RAG retrieval, live web search, and admin compliance under strict tenant isolation.

This platform showcases how MCP can power intelligent, governed, multi-tenant AI systems with real-time analytics, regex-based red-flag detection, and comprehensive tool orchestration.

🚀 Quick Start

Windows Users

# 1. Install dependencies
pip install -r requirements.txt

# 2. Configure environment (copy and edit .env)
copy env.example .env
# Edit .env with your credentials (Supabase, LLM, etc.)

# 3. Start all services
start.bat

Manual Setup

# 1. Install dependencies
pip install -r requirements.txt

# 2. Configure environment
cp env.example .env
# Edit .env with your credentials

# 3. Start FastAPI backend (Terminal 1)
uvicorn backend.api.main:app --port 8000 --reload

# 4. Start unified MCP server (Terminal 2)
python backend/mcp_server/server.py

# 5. Start Gradio UI (Terminal 3)
python app.py

Then access:

Gradio UI: http://localhost:7860
FastAPI Docs: http://localhost:8000/docs

Security Note: REST requests that hit protected endpoints must include both x-tenant-id and x-user-role headers. Roles (viewer, editor, admin, owner) determine which actions—such as document ingestion, rule uploads, or analytics access—the caller may perform.

Features

Core Capabilities

🤖 Autonomous Multi-Step MCP Agents – Intelligent tool-aware agent that plans and executes multi-step workflows across RAG, Web, Admin, and LLM tools with short-term conversation memory
💭 Short-Term Conversation Memory – Automatic memory system that stores the last N tool outputs per session with configurable expiration (default: 10 outputs, 15 minutes TTL). Memory is keyed by session_id (not tenant_id) for safety, enabling better context awareness in multi-step workflows. Memory is automatically injected into tool payloads and cleared on session end.
📚 Enhanced Knowledge Base Management – Upload raw text, URLs, or documents (PDF/DOCX/TXT/MD) with rich metadata (source URL, timestamp, document type) and optimized chunking (400-600 tokens)
🤖 AI-Generated KB Metadata – Automatic extraction of title, summary, tags, topics, date, and quality score during document ingestion. LLM-powered with intelligent fallback when unavailable - uses keyword extraction and pattern matching to provide useful metadata even during timeouts
🔍 Optimized RAG Search with Cross-Encoder Re-ranking – Two-stage retrieval: an initial vector search, followed by cross-encoder re-ranking of the top candidates with cross-encoder/ms-marco-MiniLM-L-6-v2 to improve ranking accuracy. Semantic search uses a configurable similarity threshold (default 0.3) for better recall
⚡ Per-Tool Latency Prediction – Agent estimates expected latency before choosing tools (RAG: 60-120ms, Web: 400-1800ms, Admin: <20ms) to optimize tool selection and choose the fastest path
🧠 Context-Aware MCP Routing – Intelligent tool selection based on previous outputs: skip web search if RAG returns a high score (≥0.8), skip agent reasoning for critical admin violations, and skip RAG if relevant memory is already available. This avoids redundant tool calls
📋 Tool Output Schemas – Every tool returns strict JSON type schemas for easier debugging, cleaner reasoning, and more polished responses. Automatic schema validation and formatting
🗑️ Document Management – Delete individual documents or bulk delete all documents for a tenant with confirmation dialogs
🛡️ Enterprise Admin Governance – Advanced rule management system with:
- Regex-based red-flag pattern matching with severity levels (low/medium/high/critical)
- Automatic admin alerts for violations
- LLM-Enhanced Rules: Rules are automatically analyzed and enhanced to identify edge cases, improve regex patterns, and suggest appropriate severity levels
- LLM-Guided Rule Explanations: Automatic generation of human-readable explanations, concrete examples, and missing pattern suggestions. Includes intelligent fallback when LLM is unavailable - uses keyword extraction to provide useful explanations even during timeouts
- File Upload Support: Upload rules from TXT, PDF, DOC, or DOCX files with drag-and-drop interface
- Chunk Processing: Large rule sets processed in manageable chunks (5 rules at a time) to prevent timeouts
- Rule-Based Behavior Control: Rules checked FIRST - brief response rules return quick answers, blocking rules prevent requests
- Comment Filtering: Comment lines (starting with #) automatically ignored when uploading rules
- Supabase Integration: Rules stored in Supabase for production scalability (with SQLite fallback)
📊 Comprehensive Analytics & Observability – Full tenant-level analytics logging with Supabase backend (SQLite fallback for local dev):
- Tool usage breakdown (RAG, Web, Admin, LLM) with latency and token tracking
- RAG recall/precision indicators (average hits, scores, top scores)
- Per-tenant query volume and active users
- Red-flag violations with timestamps and confidence scores
- LLM token logs and latency metrics
- Real-Time Visualizations: Reasoning path visualizer, tool invocation timeline, and tenant activity heatmap
🌐 Live Web Search – Google Programmable Search (Custom Search API) with tenant-aware MCP tooling
🏢 Multi-Tenant Isolation – Complete tenant isolation with centralized tenant ID management; backend enforces strict isolation for chat, ingestion, and admin ops
🔐 Fine-Grained Role-Based Access Control (RBAC) – Four-tier role system (viewer, editor, admin, owner) with backend permission enforcement
🔄 Intelligent Multi-Tool Orchestration – MCP agent orchestrator autonomously selects optimal tool chains (RAG + Web + LLM, etc.) based on query intent, context, latency predictions, and previous tool outputs. Context-aware routing enables sophisticated tool skipping for efficiency
⚡ Robust Error Handling – Structured error responses, retry mechanisms, and graceful fallbacks (e.g., if RAG fails → fallback to LLM-only)
📡 Streaming Responses – Chat responses stream character-by-character using Server-Sent Events (SSE) for real-time user experience
🎯 Rule-First Processing – Admin rules checked before intent classification - rules can trigger brief responses or block requests entirely
🧠 Advanced Context Engineering – Implements Anthropic's context engineering strategies:
- High-Fidelity Compaction: Automatically compresses conversations at 80% token threshold, preserving architectural decisions and unresolved issues
- Tool Result Clearing: Safest form of compaction - removes large tool outputs while keeping metadata
- Structured Note-Taking: Tracks objectives, architectural decisions, and unresolved issues outside context window
- XML-Structured Prompts: All prompts use clear XML sections for better model understanding
- Just-in-Time Context Loading: Selects only relevant memories and tools for each query
- Progressive Disclosure: Agents discover context incrementally through exploration
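
The skip rules listed under Context-Aware MCP Routing can be pictured with a small sketch. This is not IntegraChat's orchestrator: the function name `plan_tools`, its arguments, and the returned list are hypothetical. Only the 0.8 RAG score threshold and the three skip rules come from the feature list above.

```python
from typing import List, Optional

RAG_HIGH_SCORE = 0.8  # threshold quoted in the feature list


def plan_tools(
    rag_top_score: Optional[float],
    critical_violation: bool,
    memory_is_relevant: bool,
) -> List[str]:
    """Return an illustrative tool chain given what is already known.

    rag_top_score is the best score from an earlier RAG lookup, or None
    if no lookup has happened yet. All names here are hypothetical.
    """
    # A critical admin violation short-circuits agent reasoning entirely.
    if critical_violation:
        return ["admin"]

    tools: List[str] = []
    # Skip RAG when relevant memory is already available for the session.
    if not memory_is_relevant:
        tools.append("rag")
    # Skip web search when RAG already returned a high-confidence hit.
    if rag_top_score is None or rag_top_score < RAG_HIGH_SCORE:
        tools.append("web")
    tools.append("llm")
    return tools
```

Keeping the rules in one pure function makes each skip decision easy to unit-test in isolation from the tools themselves.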

Enterprise Features

🔍 Regex-Based Red-Flag Detection – Support for complex regex patterns with keyword fallback and semantic scoring
🤖 LLM-Enhanced Rule Management – Rules automatically enhanced by LLM to identify edge cases, improve patterns, and suggest severity levels. Includes intelligent fallback explanations when LLM is unavailable - uses keyword extraction to generate useful explanations, examples, and pattern suggestions even during timeouts
📄 File Upload & Drag-and-Drop – Upload rules from files (TXT, PDF, DOC, DOCX) with intuitive drag-and-drop interface
⚡ Chunk-Wise Processing – Large rule sets processed in chunks to prevent timeouts and ensure reliable processing
📈 Real-Time Analytics Dashboard – Per-tenant analytics with configurable time windows (7, 30, 90 days)
🛠️ Admin API Endpoints – /admin/violations, /admin/tools/logs, /admin/tenants for comprehensive governance
🧠 Agent Debug & Planning – /agent/debug and /agent/plan endpoints for observability and tool selection inspection
💾 Persistent Analytics Storage – Supabase-backed analytics store (with automatic SQLite fallback) for fast, multi-tenant queries
🗄️ Supabase Integration – Production-ready Supabase support for admin rules with automatic table creation
📈 Real-Time Visualization Components – Interactive visualizations for agent reasoning, tool execution, and tenant activity:
- Reasoning Path Visualizer: Step-by-step visualization of agent decision-making with animated progression
- Tool Invocation Timeline: Visual timeline showing tool execution order, latency, and result counts
- Tenant Activity Heatmap: Query activity heatmap and per-tool usage trends over time
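
As a rough illustration of regex matching with a keyword fallback, the sketch below checks a message against a list of rules. The `Rule` dataclass and `find_violations` function are hypothetical names, the keyword heuristic (all words longer than three letters must appear) is an assumption, and semantic scoring is left out entirely.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    text: str                      # human-readable rule text
    pattern: Optional[str] = None  # optional regex pattern
    severity: str = "medium"       # low / medium / high / critical


def find_violations(message: str, rules: List[Rule]) -> List[Rule]:
    """Return the rules a message violates: regex first, keywords as fallback."""
    hits = []
    lowered = message.lower()
    for rule in rules:
        if rule.pattern:
            try:
                if re.search(rule.pattern, message, re.IGNORECASE):
                    hits.append(rule)
                continue  # a valid regex is authoritative for this rule
            except re.error:
                pass  # invalid regex: fall through to keyword matching
        # Keyword fallback (assumed heuristic): every word longer than
        # three letters in the rule text must appear in the message.
        keywords = [w for w in re.findall(r"[a-z]+", rule.text.lower()) if len(w) > 3]
        if keywords and all(k in lowered for k in keywords):
            hits.append(rule)
    return hits
```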

Conversation Memory System

IntegraChat includes a short-term conversation memory system that enhances multi-step workflows by maintaining context across tool calls:

Automatic Storage: Every tool output is automatically stored in memory for the session
Bounded Size: Keeps only the last N tool outputs (configurable via MCP_MEMORY_MAX_ITEMS, default: 10)
Auto-Expiration: Entries automatically expire after a configurable TTL (via MCP_MEMORY_TTL_SECONDS, default: 900 seconds / 15 minutes)
Session-Based: Memory is keyed by session_id (not tenant_id) for safety and isolation
Automatic Injection: Recent memory is automatically injected into tool payloads as a memory field for multi-step workflows
Session Clearing: Memory can be explicitly cleared by sending end_session: true or endSession: true in the payload

Usage Example:

{
  "tenant_id": "acme",
  "session_id": "chat-abc-123",
  "query": "Search for X"
}

Subsequent tool calls with the same session_id will receive a memory field containing recent tool outputs, enabling tools to make context-aware decisions in multi-step workflows.

Configuration:

MCP_MEMORY_MAX_ITEMS: Maximum number of tool outputs to keep per session (default: 10)
MCP_MEMORY_TTL_SECONDS: Time-to-live for memory entries in seconds (default: 900)
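
The behaviour described above (bounded size, TTL expiry, keyed by session) can be sketched in a few lines. This is a minimal in-process illustration, not the project's implementation: the `SessionMemory` class and its method names are hypothetical, and only the two defaults mirror the documented settings.

```python
import time
from collections import deque
from typing import Any, Deque, Dict, List, Optional, Tuple

MAX_ITEMS = 10     # mirrors the MCP_MEMORY_MAX_ITEMS default
TTL_SECONDS = 900  # mirrors the MCP_MEMORY_TTL_SECONDS default


class SessionMemory:
    """Hypothetical short-term store of tool outputs, keyed by session_id."""

    def __init__(self, max_items: int = MAX_ITEMS, ttl: float = TTL_SECONDS):
        self.max_items = max_items
        self.ttl = ttl
        self._store: Dict[str, Deque[Tuple[float, Any]]] = {}

    def add(self, session_id: str, output: Any, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        # deque(maxlen=...) drops the oldest entry once the limit is reached.
        entries = self._store.setdefault(session_id, deque(maxlen=self.max_items))
        entries.append((now, output))

    def recent(self, session_id: str, now: Optional[float] = None) -> List[Any]:
        now = time.time() if now is None else now
        entries = self._store.get(session_id, ())
        # Expired entries are filtered out when memory is read.
        return [out for ts, out in entries if now - ts <= self.ttl]

    def clear(self, session_id: str) -> None:
        # Corresponds to an end_session request in the description above.
        self._store.pop(session_id, None)
```

The optional `now` argument exists only so the expiry logic can be tested without sleeping.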

Role-Based Access Control (RBAC)

IntegraChat implements fine-grained role-based access control (RBAC) for backend API endpoints. This ensures that users can only access features appropriate for their role level.

Roles

The system supports four roles with increasing privileges:

viewer (default) - Basic read-only access
- ✅ Can use chat functionality
- ❌ Cannot ingest documents
- ❌ Cannot delete documents
- ❌ Cannot view analytics
- ❌ Cannot manage admin rules
editor - Content management access
- ✅ Can use chat functionality
- ✅ Can ingest documents (upload, paste, URLs, files)
- ❌ Cannot delete documents
- ❌ Cannot view analytics
- ❌ Cannot manage admin rules
admin - Administrative access
- ✅ Can use chat functionality
- ✅ Can ingest documents
- ✅ Can delete documents
- ✅ Can view analytics
- ✅ Can manage admin rules
owner - Full system access
- Same permissions as admin (highest privilege level)

Permission Matrix

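| Action | viewer | editor | admin | owner |
| --- | --- | --- | --- | --- |
| Chat Bot | ✅ | ✅ | ✅ | ✅ |
| Ingest Documents | ❌ | ✅ | ✅ | ✅ |
| Delete Documents | ❌ | ❌ | ✅ | ✅ |
| View Analytics | ✅ | ✅ | ✅ | ✅ |
| Manage Rules | ❌ | ❌ | ✅ | ✅ |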

Backend RBAC

Backend API endpoints enforce RBAC through the x-user-role header:

# Permission matrix in backend/mcp_server/common/access_control.py
PERMISSIONS = {
    "manage_rules": {"owner", "admin"},
    "ingest_documents": {"owner", "admin", "editor"},
    "delete_documents": {"owner", "admin"},
    "view_analytics": {"owner", "admin"},
}
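
A check against a matrix like this might look as follows. This is a sketch under assumptions: the function name `permission_error` is hypothetical, treating actions that are absent from the matrix as unrestricted is an assumption, and the message wording only approximates the 403 details described in this section.

```python
from typing import Optional

PERMISSIONS = {
    "manage_rules": {"owner", "admin"},
    "ingest_documents": {"owner", "admin", "editor"},
    "delete_documents": {"owner", "admin"},
    "view_analytics": {"owner", "admin"},
}


def permission_error(role: Optional[str], action: str) -> Optional[str]:
    """Return None when the action is allowed, else a message for a 403 body."""
    # viewer is the documented default role when none is supplied.
    role = (role or "viewer").strip().lower()
    allowed = PERMISSIONS.get(action)
    if allowed is None or role in allowed:
        # Assumption: actions missing from the matrix are open to every role.
        return None
    return (
        f"Role '{role}' cannot perform '{action}'. "
        f"Allowed roles: {', '.join(sorted(allowed))}."
    )
```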

Protected Endpoints:

/admin/rules - Requires admin or owner role
/rag/ingest* - Requires editor, admin, or owner role
/rag/delete* - Requires admin or owner role
/analytics/* - All roles can view (viewer, editor, admin, owner)

Role Propagation: The user role is automatically propagated through the entire request pipeline:

1. Client sends x-user-role header
2. Backend API route receives and validates role
3. Role is passed to service layer (process_ingestion(), etc.)
4. Service layer passes role to MCP clients
5. MCP clients include role in payload to MCP server
6. MCP server extracts role and enforces permissions

Example Request:

curl -X POST "http://localhost:8000/admin/rules" \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: tenant123" \
  -H "x-user-role: admin" \
  -d '{"rule": "Do not share passwords"}'

If the role lacks permission, the API returns 403 Forbidden with a descriptive error message that includes:

Which role was used
Which roles are allowed for the action
Instructions to change role in the UI

Using RBAC

Set Role: Include x-user-role header in API requests with one of: viewer, editor, admin, or owner
Verify Permissions: Backend enforces role-based access automatically
Error Handling: API returns 403 Forbidden with clear error messages when role lacks required permissions

Real-Time Visualization Features

IntegraChat includes three powerful visualization components that provide real-time insights into agent behavior and system activity:

1. Reasoning Path Visualizer

What it shows: Step-by-step visualization of how the agent makes decisions
Features:
- Animated progression through reasoning steps
- Status indicators (pending, running, completed, error)
- Detailed metrics per step (latency, hit counts, token estimates)
- Visual icons for each step type
Where to find it:
- Gradio app: Debug & Reasoning tab
Data source: reasoning_trace from agent responses

2. Tool Invocation Timeline

What it shows: Visual timeline of all tool executions during an agent interaction
Features:
- Color-coded bars showing tool status (success/error)
- Latency visualization per tool
- Result count badges
- Summary statistics (total tools, total time, average latency)
Where to find it:
- Gradio app: Debug & Reasoning tab
Data source: tool_traces from agent responses

3. Tenant Activity Heatmap

What it shows: Query activity patterns and tool usage trends over time
Features:
- Hour-by-hour, day-by-day activity heatmap
- Color intensity based on activity level
- Per-tool usage trends with bar charts
- Trend indicators (up/down/stable)
Where to find it:
- Gradio app: Admin Analytics tab
- Configurable time window (default: 7 days)
Data source: /analytics/activity and /analytics/tool-usage endpoints

Access: All visualization features are available to all roles (viewer, editor, admin, owner).

Installation & Setup

Prerequisites

Python 3.10+ with pip
PostgreSQL (with pgvector extension) or Supabase for RAG storage
Supabase (recommended) or SQLite for admin rules and analytics
Ollama (local) or Groq API credentials for LLM
Google Custom Search API (optional, for web search):
- Enable Custom Search API in Google Cloud Console
- Create API key → set as GOOGLE_SEARCH_API_KEY in .env
- Create Programmable Search Engine → set ID as GOOGLE_SEARCH_CX_ID in .env

Step-by-Step Installation

Clone and navigate to the project:
```
cd IntegraChat
```

Create and activate virtual environment (recommended):

# Windows
python -m venv venv
venv\Scripts\activate

# Linux/Mac
python3 -m venv venv
source venv/bin/activate

Install Python dependencies:
```
pip install -r requirements.txt
```

Configure environment variables:

cp env.example .env
# Edit .env with your credentials:
# - SUPABASE_URL and SUPABASE_SERVICE_KEY (for production storage)
# - POSTGRESQL_URL (for RAG vector database)
# - OLLAMA_URL/OLLAMA_MODEL or GROQ_API_KEY (for LLM)
# - GOOGLE_SEARCH_API_KEY and GOOGLE_SEARCH_CX_ID (optional, for web search)

Set up Supabase (recommended for production):
- Create a Supabase project at supabase.com
- Run supabase_admin_rules_table.sql in Supabase SQL Editor
- Run supabase_analytics_tables.sql in Supabase SQL Editor
- Copy your project URL and service role key to .env
- Verify setup: python verify_supabase_setup.py

Start the services:

Option A: Windows Quick Start (recommended for Windows):
```
start.bat
```
This automatically starts:
- FastAPI backend on port 8000
- Unified MCP server on port 8900

Option B: Manual Start:
```
# Terminal 1: FastAPI backend
uvicorn backend.api.main:app --port 8000 --reload

# Terminal 2: Unified MCP server
python backend/mcp_server/server.py
```
Launch the UI:

Gradio Interface (full-featured):
```
python app.py
```
Access at http://localhost:7860

Usage

Gradio Interface (`app.py`)

The Gradio UI provides a comprehensive interface with five main tabs:

1. Chat 💬

Enter your Tenant ID and start chatting with the MCP-powered agent
Real-time streaming responses (word-by-word using SSE)
Autonomous tool orchestration (RAG, Web, Admin, LLM)
Multi-step planning with memory of previous tool outputs

2. Document Ingestion 📚

Raw Text: Paste text directly
URL: Ingest content from web URLs
File Upload: Upload PDF, DOCX, TXT, or Markdown files
Rich metadata support (filename, URL, document ID, custom JSON)
View and manage ingested documents

3. Knowledge Base Library 📖

Statistics Dashboard: Visual cards showing document counts by type
Interactive Charts: Plotly pie chart for document type distribution
Semantic Search: Search knowledge base with relevance scoring
Type Filtering: Filter by document type (text, PDF, FAQ, link)
Document Management: View, preview, and delete documents
Auto-refresh: Lists update automatically after operations

4. Admin Analytics 📊

Statistics Cards: Total queries, active users, red flags, RAG searches
Interactive Bar Charts:
- Tool Usage Count (RAG, Web, Admin, LLM)
- Average Tool Latency (performance metrics)
- RAG Quality Metrics (hits, scores, recall indicators)
Tool Usage Table: Detailed performance breakdown
Formatted Summary: Key metrics in easy-to-read format
Click "🔄 Fetch Analytics Snapshot" to load latest data

5. Admin Rules & Compliance 🛡️

Text Input: Paste rules one per line (comments starting with # are ignored)
File Upload: Upload rules from TXT, PDF, DOC, or DOCX files
LLM Enhancement: Automatic rule enhancement (edge cases, pattern improvements, severity suggestions)
Chunk Processing: Large rule sets processed in chunks (5 at a time)
Rule-Based Behavior: Rules checked FIRST - brief responses or blocking based on severity
Streaming Responses: Real-time word-by-word streaming
Refresh Button: Update rules table directly
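
The comment filtering and chunked processing described in this tab can be sketched as two small helpers. The function names `parse_rules` and `chunked` are hypothetical; the chunk size of 5 and the `#` comment convention come from the text above.

```python
from typing import Iterator, List

CHUNK_SIZE = 5  # batch size quoted in this section


def parse_rules(raw: str) -> List[str]:
    """Split pasted text into rules, dropping blank lines and # comments."""
    rules = []
    for line in raw.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            rules.append(line)
    return rules


def chunked(rules: List[str], size: int = CHUNK_SIZE) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` rules."""
    for start in range(0, len(rules), size):
        yield rules[start:start + size]
```

For example, twelve rules would be processed as batches of 5, 5, and 2, so a slow enhancement call only ever delays one small batch.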

💡 Tip: Every action requires a Tenant ID. The Tenant ID persists across page refreshes and is managed centrally.

API Endpoints

All endpoints are served by the FastAPI backend at http://localhost:8000. Most endpoints require the x-tenant-id header for tenant isolation.

📖 API Documentation: Interactive Swagger docs available at http://localhost:8000/docs when the backend is running.

Agent Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| `POST` | `/agent/message` | Main chat endpoint with `tenant_id`, `message`, optional history |
| `POST` | `/agent/message/stream` | Streaming chat endpoint using Server-Sent Events (SSE). Returns tokens word-by-word |
| `POST` | `/agent/debug` | Detailed debugging info: reasoning trace, tool selection, intent classification |
| `POST` | `/agent/plan` | Tool selection plan without execution (intent, tool scores, planned steps) |
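
For clients consuming `/agent/message/stream`, the generic SSE framing can be parsed with a few lines of Python. The exact payload this endpoint puts in each event is not specified here, so the sketch assumes plain-text `data:` fields; the helper name `iter_sse_data` is hypothetical.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # In the SSE format, one leading space after the colon is dropped.
            value = line[5:]
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            # A blank line ends the event; multiple data lines join with \n.
            yield "\n".join(buffer)
            buffer = []
        # Other lines (comments starting with ':', event/id fields) are ignored.
    if buffer:
        # Lenient choice: flush a trailing event that lacks a final blank line.
        yield "\n".join(buffer)
```

The function accepts any iterable of text lines, so it can be fed from a decoded HTTP response stream or from a list in tests.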

RAG Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| `POST` | `/rag/ingest-document` | Ingest document with `source_type`, `content`, metadata. Supports raw text, URLs, PDFs, DOCX, TXT, Markdown |
| `POST` | `/rag/ingest-file` | Multipart file upload (PDF/DOCX/TXT/MD) with `x-tenant-id` header |
| `GET` | `/rag/list?tenant_id={id}&limit={n}&offset={n}` | List all documents for a tenant with pagination |
| `DELETE` | `/rag/delete/{document_id}?tenant_id={id}` | Delete a specific document by ID |
| `DELETE` | `/rag/delete-all?tenant_id={id}` | Delete all documents for a tenant |

Note: RAG endpoints support both x-tenant-id header and tenant_id query parameter.

Admin & Governance Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| `GET` | `/admin/rules?detailed=true` | Get all rules (use `detailed=true` for regex/severity metadata) |
| `POST` | `/admin/rules?enhance=true` | Add single rule with optional `pattern` (regex), `severity`, `description`. Set `enhance=true` for LLM enhancement |
| `POST` | `/admin/rules/bulk?enhance=true` | Add multiple rules at once (processed in chunks of 5). LLM enhancement applied automatically |
| `POST` | `/admin/rules/upload-file?enhance=true` | Upload rules from file (TXT, PDF, DOC, DOCX). Text extracted server-side |
| `DELETE` | `/admin/rules/{rule}` | Delete a specific rule |
| `GET` | `/admin/violations?days=30&limit=50` | Get red-flag violations with timestamps and confidence scores |
| `GET` | `/admin/tools/logs?tool_name=rag&days=7` | Get detailed tool usage logs with latency and token counts |
| `GET/POST/DELETE` | `/admin/tenants` | Tenant management endpoints |
| `POST` | `/admin/setup/table` | Create admin_rules table in Supabase if it doesn't exist |

Analytics Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| `GET` | `/analytics/overview?days=30` | Comprehensive analytics: total queries, tool usage, red-flag count, RAG quality |
| `GET` | `/analytics/tool-usage?days=30` | Detailed tool usage stats: counts, latency, tokens, success/error rates |
| `GET` | `/analytics/redflags?limit=50&days=30` | Recent red-flag violations for tenant |
| `GET` | `/analytics/activity?days=30` | Tenant activity summary: queries, active users, last query timestamp |
| `GET` | `/analytics/rag-quality?days=30` | RAG quality metrics: avg hits, scores, latency (recall/precision indicators) |

Visualization Features

IntegraChat includes three powerful visualization components that provide real-time insights into agent behavior and system activity:

1. Real-Time Reasoning Visualizer

Location: Debug tab (Gradio app)
Features:
- Step-by-step visualization of agent reasoning path
- Animated progression through reasoning steps
- Status indicators (pending, running, completed, error)
- Detailed metrics per step (latency, hit counts, token estimates)
- Visual icons for each step type (admin rules check, RAG prefetch, tool selection, etc.)
Data Source: reasoning_trace from /agent/message or /agent/debug endpoints
Usage: Automatically appears in chat panel when agent responses include reasoning traces

2. Tool Invocation Timeline

Location: Debug tab (Gradio app)
Features:
- Visual timeline showing tool execution order
- Color-coded bars indicating tool status (success/error)
- Latency visualization per tool
- Result count badges
- Summary statistics (total tools, total time, average latency)
Data Source: tool_traces from /agent/message or /agent/debug endpoints
Usage: Automatically appears in chat panel when agent responses include tool traces

3. Live Tenant Heatmap

Location: Analytics page (/analytics)
Features:
- Query activity heatmap (hour-by-hour, day-by-day visualization)
- Color intensity based on activity level
- Per-tool usage trends with bar charts
- Trend indicators (up/down/stable)
- Configurable time window (default: 7 days)
Data Source: /analytics/activity and /analytics/tool-usage endpoints
Usage: Navigate to Analytics page to view tenant activity patterns

Access: All visualization features are available to all roles (viewer, editor, admin, owner).

Request Headers

Most endpoints require:

x-tenant-id: Tenant identifier for multi-tenant isolation
x-user-role: Caller role for RBAC enforcement (viewer, editor, admin, or owner)
- Important: The role is propagated automatically through the entire pipeline (UI → API → RAG client → MCP server), and the MCP server uses it for permission checks
- If ingestion fails with permission errors, verify the role is set correctly in the UI and check backend logs for role propagation debug messages
Content-Type: application/json: For POST requests with JSON payloads

Example Request

curl -X POST http://localhost:8000/agent/message \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: tenant123" \
  -d '{
    "message": "What is our refund policy?",
    "tenant_id": "tenant123"
  }'
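
The same request can be built from Python with only the standard library. This is a sketch: the helper name `build_agent_request` is hypothetical, the `x-user-role` header is added because protected endpoints expect it, and the response format is not assumed, so the call that actually sends the request is left as a comment.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default backend address from this README


def build_agent_request(
    tenant_id: str, message: str, role: str = "viewer"
) -> urllib.request.Request:
    """Build (but do not send) a POST to /agent/message."""
    body = json.dumps({"message": message, "tenant_id": tenant_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/agent/message",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-tenant-id": tenant_id,
            "x-user-role": role,
        },
    )


req = build_agent_request("tenant123", "What is our refund policy?")
# With the backend running, send it like this:
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(resp.read().decode("utf-8"))
```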

Architecture

System Overview

IntegraChat follows a modular architecture with clear separation of concerns:

┌─────────────────┐
│   Frontend UI   │  (Gradio)
│    Port 7860    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  FastAPI Backend│  (API Gateway)
│    Port 8000    │
└────────┬────────┘
         │
         ├──► Unified MCP Server (Port 8900)
         │    ├── RAG Tools (search, ingest, list, delete)
         │    ├── Web Tools (search)
         │    └── Admin Tools (rules, violations)
         │
         ├──► PostgreSQL/Supabase (RAG Vector Store)
         ├──► Supabase/SQLite (Rules & Analytics)
         └──► LLM Backend (Ollama/Groq)

Enterprise-Grade Features

Autonomous Multi-Step Planning: LLM-powered planning determines optimal tool sequences with short-term conversation memory that stores and injects previous tool outputs into subsequent tool calls for better context awareness.
Regex-Based Governance: Admin rules support regex patterns with fallback to keyword matching and semantic similarity scoring for flexible policy enforcement.
Comprehensive Analytics: All tool usage, RAG searches, LLM calls, and red-flag violations are logged with indexed queries for fast analytics retrieval.
Enhanced RAG Pipeline: Documents chunked optimally (400-600 tokens) and enriched with metadata (source URL, timestamp, document type) for better retrieval.
Structured Error Handling: All errors logged with context, with graceful fallbacks (e.g., RAG fails → LLM-only, web fails → skip web).
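
To make the chunking and metadata enrichment concrete, here is a minimal sketch. It is an approximation under stated assumptions: whitespace-separated words stand in for tokens (a real tokenizer counts differently), the function name `build_chunks` and the dictionary layout are hypothetical, and the final chunk may be shorter than the 400-600 token target.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def build_chunks(
    text: str, source_url: str, document_type: str, target_tokens: int = 500
) -> List[Dict[str, Any]]:
    """Greedily split text into chunks and attach basic metadata to each."""
    if target_tokens <= 0:
        raise ValueError("target_tokens must be positive")
    words = text.split()  # stand-in for real tokenization
    timestamp = datetime.now(timezone.utc).isoformat()
    chunks = []
    for index, start in enumerate(range(0, len(words), target_tokens)):
        chunks.append(
            {
                "content": " ".join(words[start:start + target_tokens]),
                "metadata": {
                    "source_url": source_url,
                    "timestamp": timestamp,
                    "document_type": document_type,
                    "chunk_index": index,
                },
            }
        )
    return chunks
```

A target of 500 sits in the middle of the documented range; splitting on sentence or paragraph boundaries would be a natural refinement.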

Data Storage Architecture

IntegraChat uses dual-backend storage with automatic fallback for production flexibility:

Supabase (Production/Preferred)

When to use: Production deployments, multi-user environments, scalable applications

Storage:

admin_rules - Admin rules with regex patterns and severity levels
tool_usage_events - Tool invocation logs with latency and token tracking
redflag_violations - Red-flag violation events with timestamps
rag_search_events - RAG search metrics and quality indicators
agent_query_events - Agent query logs and analytics

Features:

Row Level Security (RLS) for multi-tenant isolation
Automatic backups and scaling
Real-time capabilities
Production-ready infrastructure

Setup: Configure SUPABASE_URL and SUPABASE_SERVICE_KEY in .env

SQLite (Development Fallback)

When to use: Local development, testing, single-user scenarios

Storage:

data/admin_rules.db - Admin rules (local file)
data/analytics.db - Analytics events (local file)

Features:

Zero configuration required
Perfect for local development
Automatic fallback when Supabase not configured

Migration: To migrate existing SQLite data to Supabase, refer to Supabase documentation for data migration strategies.

Supabase Setup & Migration

IntegraChat supports Supabase for production-ready storage of admin rules and analytics. Both RulesStore and AnalyticsStore automatically detect and use Supabase when credentials are available, falling back to SQLite for local development.
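
The detection logic inside RulesStore and AnalyticsStore is not shown in this README, but the documented behaviour amounts to something like the sketch below. The function name `select_backend` is hypothetical, and treating whitespace-only values as missing is an assumption.

```python
import os
from typing import Mapping, Optional


def select_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "supabase" when both credentials are set, else "sqlite"."""
    env = os.environ if env is None else env
    url = env.get("SUPABASE_URL", "").strip()
    key = env.get("SUPABASE_SERVICE_KEY", "").strip()
    return "supabase" if url and key else "sqlite"
```

Passing the environment as a parameter keeps the decision testable without modifying the real process environment.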

Quick Setup

Create Supabase tables:
- Run supabase_admin_rules_table.sql in Supabase SQL Editor
- Run supabase_analytics_tables.sql in Supabase SQL Editor

Configure environment variables in .env:

SUPABASE_URL=https://your-project-id.supabase.co
SUPABASE_SERVICE_KEY=your_service_role_key_here

Verify setup: Check that your Supabase project is accessible and tables are created correctly.

Troubleshooting

Common Issues

Backend Not Starting

Issue: FastAPI backend fails to start
Solution:
- Check if port 8000 is already in use: netstat -ano | findstr :8000 (Windows) or lsof -i :8000 (Linux/Mac)
- Verify Python virtual environment is activated
- Check .env file exists and has required variables
- Review error logs for missing dependencies
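
As a cross-platform alternative to netstat or lsof, a short Python check can tell whether something is already listening on a port. The helper name `port_in_use` is hypothetical, and the sketch only tests TCP on the loopback address by default.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Calling `port_in_use(8000)` before starting uvicorn, or `port_in_use(8900)` before starting the MCP server, shows whether the default ports are free.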

MCP Server Connection Errors

Issue: "Could not connect to MCP server" errors
Solution:
- Ensure unified MCP server is running: python backend/mcp_server/server.py
- Check MCP server is on port 8900 (default)
- Verify MCP_SERVER_ID in .env matches server configuration
- Check firewall settings if running on different machines

RAG Search Not Returning Results

Issue: RAG searches return no results despite ingested documents
Solution:
- Check similarity threshold (default 0.3) - try lowering to 0.2 or 0.1
- Verify documents exist: GET /rag/list?tenant_id={id}
- Ensure tenant_id matches between ingestion and search
- Check PostgreSQL/pgvector connection and vector extension
- Review MCP server logs for search metrics

Supabase Configuration Issues

Issue: Data still going to SQLite instead of Supabase
Solution:
- Verify SUPABASE_URL and SUPABASE_SERVICE_KEY in .env (no quotes, no spaces)
- Use service_role key (not anon key) from Supabase Dashboard
- Ensure tables exist: run SQL scripts in Supabase SQL Editor
- Check FastAPI startup logs for backend detection messages

LLM Connection Errors

Issue: Agent responses fail with LLM errors
Solution:
- For Ollama: Ensure Ollama is running (ollama serve)
- Check OLLAMA_URL and OLLAMA_MODEL in .env
- For Groq: Verify GROQ_API_KEY is set correctly
- Check LLM_BACKEND setting (ollama or groq)
- Test LLM connection: curl http://localhost:11434/api/tags (Ollama)

Document Ingestion Failures

Issue: File uploads or document ingestion fails
Solution:
- Check file size limits (default may be 10MB)
- Verify file format is supported (PDF, DOCX, TXT, MD)
- Ensure tenant_id is provided in request
- Check user role: Ingestion requires editor, admin, or owner role. If you see "Permission Denied (403)", change your role in the UI dropdown (top right) from "viewer" to "editor", "admin", or "owner"
- Verify x-user-role header is being sent correctly (check backend logs for debug messages)
- Check backend logs for specific error messages
- Verify PostgreSQL connection for RAG storage

Document Display Issues

Issue: Document list shows [object Object] instead of document details
Solution: This has been fixed. Documents now display properly with:
- Document ID (number)
- Document Type (text, pdf, faq, link)
- Preview (first 200 characters)
- Length (character count)
- Created date
If still seeing issues: Refresh the Knowledge Base Library tab

Rule Addition Timeouts

Issue: "Chunk 1/1 timed out after 45s" when adding rules
Solution:
- Quick Fix: Uncheck the "Enable LLM Enhancement" checkbox before adding rules - rules will be added immediately without LLM processing
- With Enhancement: Keep the checkbox checked but be patient - enhancement can take up to 180s for a chunk of 5 rules (roughly 30s per rule)
- Best Practice: Add rules in smaller batches (1-3 rules at a time) when using enhancement
Note: Enhancement is optional - you can always add rules quickly without it, then enhance them later if needed

Rule Deletion Issues

Issue: "404 Not Found" when trying to delete a rule
Solution: You can now delete rules in two ways:
- By Number: Enter the rule number (e.g., "1", "2", "3") as shown in the rules table
- By Text: Enter the exact rule text as displayed in the rules table
If rule not found: Make sure you're entering the exact text or a valid rule number. Refresh the rules table to see current rules.
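
The two lookup modes can be sketched as a single resolver. This is illustrative only: the function name `resolve_rule` is hypothetical, and 1-based numbering is assumed from the example numbers above. A result of `None` corresponds to the 404 case.

```python
from typing import List, Optional


def resolve_rule(identifier: str, rules: List[str]) -> Optional[str]:
    """Map a rule number or exact rule text to the stored rule, else None."""
    identifier = identifier.strip()
    if identifier.isdigit():
        index = int(identifier) - 1  # assumed: the rules table counts from 1
        return rules[index] if 0 <= index < len(rules) else None
    # Text lookups must match the stored rule exactly.
    return identifier if identifier in rules else None
```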

Tenant Isolation Issues

Issue: Documents or data leaking between tenants
Solution:
- Check database queries include WHERE tenant_id = ... filters
- Verify tenant ID normalization is working correctly
- Review database logs for tenant isolation
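
The kind of tenant-scoped query to look for is sketched below. For runnability it uses the standard-library sqlite3 module rather than PostgreSQL; the `documents` table, the `fetch_documents` helper, and the trim-and-lowercase normalization are all hypothetical. With a PostgreSQL driver the placeholder syntax would differ, but the parameterized `WHERE tenant_id = ...` filter is the same idea.

```python
import sqlite3
from typing import List


def fetch_documents(conn: sqlite3.Connection, tenant_id: str) -> List[str]:
    """Return document contents for exactly one tenant."""
    # Assumed normalization: the same form must be used at ingestion time.
    tenant = tenant_id.strip().lower()
    rows = conn.execute(
        # The tenant filter is a bound parameter, never string-formatted.
        "SELECT content FROM documents WHERE tenant_id = ? ORDER BY id",
        (tenant,),
    ).fetchall()
    return [row[0] for row in rows]
```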

Getting Help

Check Logs: Review FastAPI and MCP server logs for detailed error messages
Run Diagnostics: Use helper scripts in the Testing & Diagnostics section
Verify Configuration: Check .env file and Supabase connection
Review Documentation: See backend/README.md for backend-specific issues

Testing & Diagnostics

You can test the system by:

API Testing: Use the FastAPI interactive docs at http://localhost:8000/docs to test endpoints
Database Inspection: Connect directly to your PostgreSQL/Supabase instance to verify tenant isolation
Log Monitoring: Check FastAPI and MCP server logs for detailed error messages and debugging information

Tip: Ensure the Python virtual environment is active (source venv/bin/activate or .\venv\Scripts\activate) and that .env contains the MCP server URLs/LLM settings.

Helper Scripts

✅ Prerequisites: FastAPI backend plus all MCP servers (RAG/Web/Admin) running locally.
✅ What it checks:
1. Direct database writes via the analytics and rules stores
2. CRUD over the /admin/* and /analytics/* endpoints
3. RAG ingestion and isolation by issuing queries as multiple tenants and ensuring secrets never leak across IDs
✅ Pass criteria: At least 80% of the sub-tests succeed (the RAG isolation test must pass for overall success).
python check_rag_database.py
Provides a low-level inspection of the RAG datastore. It connects straight to the pgvector/Postgres instance, lists all tenant IDs, prints sample chunks, and runs search_vectors() directly to ensure the SQL WHERE tenant_id = … filter is behaving as expected. Use this script when diagnosing suspected cross-tenant leakage or when seeding demo data.
python verify_supabase_setup.py
Verifies Supabase configuration and shows which backend (Supabase or SQLite) each store is using. Displays any missing configuration and provides a summary of where data will be saved.
python check_supabase_rules.py
Checks Supabase admin rules configuration and RLS policies. Validates that rules can be read/written correctly.
python migrate_sqlite_to_supabase.py
One-shot migration script that copies existing SQLite data (admin rules + analytics) to Supabase. Supports both PostgreSQL direct connection and Supabase REST API methods.
python test_manual.py
The existing manual test runner remains useful for smoke-testing analytics logging, admin rule CRUD, and API response codes. Run it whenever you adjust schemas or update MCP endpoints.

Demo Video

🎥 [Demo Video Placeholder] - Coming soon!

Watch how IntegraChat uses MCP to power autonomous agents with multi-tool selection, RAG retrieval, and enterprise governance.

Social Media

📱 [Social Media Post Placeholder] - Coming soon!

Team Member(s)

Your Name Here - Developer & MCP Enthusiast

License

This project is licensed under the MIT License - see the LICENSE file for details.

Technical Stack

Backend

Framework: FastAPI with async/await for high-performance MCP orchestration
MCP Server: Unified MCP server (port 8900) exposing all tools via namespaces
API: RESTful API with Server-Sent Events (SSE) for streaming responses
LLM Integration:
- Ollama (local, default) - http://localhost:11434
- Groq (cloud) - via API key
- Configurable backend with streaming support

Frontend

Gradio UI: Full-featured interface with Plotly visualizations (app.py)
UI Libraries:
- Plotly for interactive charts and visualizations

Data Storage

RAG Vector Store: PostgreSQL with pgvector extension (via Supabase or direct connection)
Analytics: Supabase (production) or SQLite (development) with indexed queries
Rules Storage: Supabase (production) or SQLite (development) with automatic fallback
Database: PostgreSQL for RAG embeddings, Supabase/SQLite for analytics and rules

File Processing

Supported Formats: TXT, PDF, DOC, DOCX, Markdown
Libraries: PyPDF2, python-docx for server-side text extraction
Metadata: Rich metadata support (source URL, timestamp, document type)

Communication

Streaming: Server-Sent Events (SSE) for real-time word-by-word response streaming
Protocol: Model Context Protocol (MCP) for tool communication
HTTP: RESTful endpoints with JSON payloads
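
Word-by-word streaming over SSE can be sketched as a generator that frames each word as one event. This is a minimal illustration: sse_word_stream is a hypothetical name, and the closing "done" event is an assumption of the sketch rather than a documented part of the IntegraChat protocol.

```python
from typing import Iterator


def sse_word_stream(text: str) -> Iterator[str]:
    """Yield one Server-Sent Event per word, then a final marker event."""
    for word in text.split():
        # Each SSE message is a "data:" field terminated by a blank line.
        yield f"data: {word}\n\n"
    # Hypothetical end-of-stream marker so clients know when to stop reading.
    yield "event: done\ndata: [DONE]\n\n"


for event in sse_word_stream("Tenant isolation is enforced"):
    print(repr(event))
```

In a FastAPI route, a generator like this would typically be wrapped in a StreamingResponse with media_type="text/event-stream".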

Recent Enhancements

UI & UX Improvements (Latest)

Document Display Fix: Fixed document list showing [object Object] - now properly displays document ID, type, preview, length, and creation date in a formatted table
Rule Deletion Enhancement: Can now delete rules by entering either:
- Rule number (e.g., "1", "2", "3") - automatically finds the corresponding rule
- Full rule text - deletes the exact matching rule
LLM Enhancement Toggle: Added checkbox to enable/disable LLM enhancement when adding rules:
- Quick Add: Uncheck to add rules immediately without LLM processing (no timeout issues)
- Enhanced Add: Check to get better patterns, explanations, and examples (takes longer but higher quality)
Improved Timeouts: Increased timeout for rule enhancement from 45s to 180s to handle multiple rules properly
Better Error Messages: Clearer error messages for rule deletion, document operations, and permission errors

Role Propagation & Permission Handling (Latest)

Fixed Role Propagation: User role (viewer, editor, admin, owner) is now properly passed through the entire ingestion pipeline:
- UI sends role in x-user-role header
- Backend API route receives and validates role
- Role is passed to process_ingestion() service
- RAG client includes role in payload to MCP server
- MCP server uses role for permission checks
Improved Error Handling: Permission errors (403 Forbidden) now return clear, actionable error messages:
- Clear indication when role lacks required permissions
- Guidance on which roles can perform specific actions
- Instructions to change role in UI dropdown
Debug Logging: Added comprehensive debug logging to trace role values through the pipeline for troubleshooting
Admin Question Handling: Fixed "who is the admin" type questions to use RAG from knowledge base instead of generic LLM responses
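
The role flow and the actionable 403 messages described above can be sketched as follows. The header names come from this README; the role ordering, the MIN_ROLE table, and both helper functions are assumptions for illustration, and the real permission matrix may differ.

```python
from typing import Dict, Tuple

# Assumed ordering from least to most privileged.
ROLE_ORDER = ["viewer", "editor", "admin", "owner"]

# Hypothetical minimum role per action; not the project's real matrix.
MIN_ROLE = {
    "ingest_document": "editor",
    "upload_rules": "admin",
    "view_analytics": "admin",
}


def build_headers(tenant_id: str, role: str) -> Dict[str, str]:
    """Headers the UI would attach to every protected request."""
    return {"x-tenant-id": tenant_id, "x-user-role": role}


def check_permission(headers: Dict[str, str], action: str) -> Tuple[int, str]:
    """Return (status_code, message) for the requested action."""
    role = headers.get("x-user-role", "").strip().lower()
    if role not in ROLE_ORDER:
        return 403, f"Unknown role '{role}'. Choose one of: {', '.join(ROLE_ORDER)}."
    required = MIN_ROLE[action]
    if ROLE_ORDER.index(role) < ROLE_ORDER.index(required):
        allowed = ", ".join(ROLE_ORDER[ROLE_ORDER.index(required):])
        return 403, (
            f"Role '{role}' cannot perform '{action}'. "
            f"Allowed roles: {allowed}. Change the role in the UI dropdown."
        )
    return 200, "ok"


print(check_permission(build_headers("acme", "viewer"), "ingest_document"))
print(check_permission(build_headers("acme", "editor"), "ingest_document"))
```

The point of the sketch is the message shape: it names the current role, the roles that would succeed, and the next step for the user.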

Admin Rules System (Latest)

File Upload Support: Upload rules from TXT, PDF, DOC, DOCX files with drag-and-drop interface
LLM Enhancement Toggle: Optional LLM enhancement with checkbox control:
- Quick Add Mode: Uncheck to add rules immediately without LLM processing (no timeouts)
- Enhanced Mode: Check to get better patterns, explanations, examples, and edge case detection
LLM Enhancement: When enabled, automatic rule enhancement identifies edge cases, improves regex patterns, and suggests severity levels
Intelligent Fallback Explanations: When LLM enhancement times out or fails, the system automatically generates basic explanations using keyword extraction, providing useful examples and pattern suggestions without requiring LLM availability
Chunk Processing: Large rule sets processed in chunks of 5 to prevent timeouts (handles 100+ rules efficiently)
Enhanced Timeouts: Increased timeout from 45s to 180s per chunk to accommodate LLM processing
Flexible Rule Deletion: Delete rules by entering either rule number (e.g., "1") or full rule text
Comment Filtering: Comment lines (starting with #) automatically ignored when uploading rules
Rule-First Processing: Admin rules checked before intent classification - enables behavior control (brief responses vs blocking)
Supabase Integration: Production-ready Supabase support with automatic table creation
Streaming Responses: Word-by-word streaming for chat responses using Server-Sent Events (SSE)
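
Comment filtering and chunked processing, as described above, can be sketched in a few lines. parse_rules and chunk_rules are hypothetical helper names; the chunk size of 5 is taken from this section.

```python
from typing import Iterator, List

CHUNK_SIZE = 5  # chunk size stated in this section


def parse_rules(raw: str) -> List[str]:
    """Split uploaded text into rules, skipping blank lines and # comments."""
    rules = []
    for line in raw.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            rules.append(stripped)
    return rules


def chunk_rules(rules: List[str], size: int = CHUNK_SIZE) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` rules."""
    for start in range(0, len(rules), size):
        yield rules[start:start + size]


uploaded = "# compliance rules\n" + "\n".join(f"Rule number {i}" for i in range(1, 13))
chunks = list(chunk_rules(parse_rules(uploaded)))
print([len(chunk) for chunk in chunks])  # → [5, 5, 2]
```

Each chunk would then be sent for enhancement under its own timeout, so one slow chunk does not fail the whole upload.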

Conversation Memory System (Latest)

Short-Term Memory: Automatic storage of tool outputs per session with configurable size limits and TTL
Session-Based Isolation: Memory keyed by session_id (not tenant_id) for safety
Automatic Injection: Recent memory automatically injected into tool payloads for multi-step workflows
Auto-Expiration: Memory entries expire after configurable TTL (default: 15 minutes)
Session Management: Memory can be explicitly cleared via end_session flag
Comprehensive Testing: Full test suite covering memory storage, retrieval, expiration, and multi-step workflows
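
A minimal sketch of session-keyed memory with a size limit, TTL, and explicit clearing is shown below. SessionMemory and its method names are hypothetical, and the default of 10 entries is an assumption; only the 15-minute TTL comes from this section.

```python
import time
from collections import defaultdict, deque
from typing import Any, Callable, List, Tuple


class SessionMemory:
    """Short-term store of tool outputs, keyed by session_id."""

    def __init__(
        self,
        max_items: int = 10,           # assumed default, not from the README
        ttl_seconds: float = 15 * 60,  # 15-minute default stated above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # A bounded deque silently drops the oldest entry when full.
        self._store = defaultdict(lambda: deque(maxlen=max_items))

    def add(self, session_id: str, tool: str, output: Any) -> None:
        self._store[session_id].append((self.clock(), tool, output))

    def recent(self, session_id: str) -> List[Tuple[str, Any]]:
        """Return unexpired entries, oldest first."""
        cutoff = self.clock() - self.ttl_seconds
        entries = self._store.get(session_id, ())
        return [(tool, output) for ts, tool, output in entries if ts >= cutoff]

    def end_session(self, session_id: str) -> None:
        """Drop everything stored for a session (the end_session flag)."""
        self._store.pop(session_id, None)


memory = SessionMemory()
memory.add("session-1", "rag", {"top_score": 0.91})
print(memory.recent("session-1"))
memory.end_session("session-1")
print(memory.recent("session-1"))  # → []
```

Keying by session_id means two users of the same tenant never see each other's intermediate tool outputs. The injectable clock makes expiry easy to test without sleeping.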

AI-Generated KB Metadata & Advanced RAG (Latest)

Automatic Metadata Extraction: When ingesting documents, system auto-extracts:
- Title: From filename, URL, or content structure (with intelligent fallback)
- Summary: 2-3 sentence summary via LLM (with keyword-based fallback)
- Tags: 5-8 relevant tags extracted from content
- Topics: 3-5 main themes identified via LLM
- Date Detection: Multiple date formats automatically detected
- Quality Score: 0.0-1.0 score based on structure and completeness
Intelligent Fallback: When LLM is unavailable or times out, uses keyword extraction and pattern matching to provide useful metadata
Database Integration: Metadata stored in JSONB column for flexible querying and enhanced RAG search
Migration Script: Safe, idempotent database migration script included
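
The keyword-based fallback for tags can be sketched with a simple frequency count. This is an illustrative approach, not the project's extractor: fallback_tags, the stopword list, and the token pattern are all assumptions.

```python
import re
from collections import Counter
from typing import List

# Small illustrative stopword list; a real extractor would use a fuller one.
STOPWORDS = {
    "the", "and", "are", "for", "all", "with", "within", "that", "this",
    "from", "has", "have", "was", "were", "will", "not", "you", "your",
}


def fallback_tags(text: str, max_tags: int = 8) -> List[str]:
    """Pick the most frequent non-stopword terms as tags."""
    # Words of 3+ characters starting with a letter; numbers are ignored.
    words = re.findall(r"[a-z][a-z0-9-]{2,}", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [word for word, _ in counts.most_common(max_tags)]


sample = (
    "Refund policy: refund requests are processed within 14 days. "
    "Refund policy applies to all orders."
)
print(fallback_tags(sample, max_tags=2))  # → ['refund', 'policy']
```

Because it needs no model call, a function like this can run whenever the LLM is unavailable or times out, so ingested documents still receive usable tags.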

Per-Tool Latency Prediction & Context-Aware Routing (Latest)

Latency Prediction: Agent estimates expected latency before tool selection:
- RAG: 60-120ms (depends on result count)
- Web: 400-1800ms (network-dependent)
- Admin: <20ms (local regex matching)
- LLM: Variable based on model and token count
Path Optimization: Agent chooses fastest tool sequence based on latency estimates
Context-Aware Routing: Intelligent tool skipping based on previous outputs:
- High RAG score (≥0.8) → Skip web search
- Critical admin violation → Skip agent reasoning, immediate block
- Relevant memory available → Skip RAG, use memory instead
Routing Hints: Context hints included in reasoning trace for transparency
Performance Impact: Skipping redundant tool calls shortens the tool sequence and avoids unnecessary latency
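
The skip rules and latency ranges listed above can be sketched as a small planner. The thresholds and millisecond ranges are taken from this section; plan_tools, worst_case_latency_ms, and the fixed rag, web, llm ordering are assumptions of the sketch.

```python
from typing import List, Optional

# (min_ms, max_ms) estimates listed in this section; LLM latency is variable.
LATENCY_MS = {"rag": (60, 120), "web": (400, 1800), "admin": (0, 20)}

RAG_SKIP_WEB_SCORE = 0.8


def plan_tools(
    admin_severity: Optional[str] = None,
    rag_top_score: Optional[float] = None,
    memory_hit: bool = False,
) -> List[str]:
    """Return the tools still worth running; an empty plan means block."""
    if admin_severity == "critical":
        return []  # immediate block, no agent reasoning
    skip = set()
    if memory_hit:
        skip.add("rag")  # relevant memory replaces retrieval
    if rag_top_score is not None and rag_top_score >= RAG_SKIP_WEB_SCORE:
        skip.add("web")  # knowledge base already answered well
    return [tool for tool in ["rag", "web", "llm"] if tool not in skip]


def worst_case_latency_ms(plan: List[str]) -> int:
    """Sum the upper latency bounds of tools with a known estimate."""
    return sum(LATENCY_MS[tool][1] for tool in plan if tool in LATENCY_MS)


print(plan_tools(rag_top_score=0.85))                         # → ['rag', 'llm']
print(worst_case_latency_ms(plan_tools()))                    # → 1920
print(worst_case_latency_ms(plan_tools(rag_top_score=0.85)))  # → 120
```

Skipping web search after a strong RAG hit removes the largest latency term from the plan, which is where most of the saving comes from.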

Tool Output Schemas (Latest)

Strict JSON Schemas: Every tool returns validated JSON with consistent structure:
- RAG: {results: [...], top_score: float, latency_ms: int}
- Web: {results: [...], latency_ms: int}
- Admin: {violations: [...], severity: str, latency_ms: int}
- LLM: {text: str, tokens_used: int, latency_ms: int}
Automatic Validation: All tool outputs validated and formatted before use
Easier Debugging: Consistent structure makes debugging and monitoring simpler
Consistent Responses: Schema-validated outputs keep the response format uniform across tools
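
The four output shapes above can be checked with a small validator. This is a sketch using plain type checks; the project may well use a schema library instead, and validate_output is a hypothetical name.

```python
from typing import Any, Dict

# Field names and types as listed in this section.
SCHEMAS = {
    "rag": {"results": list, "top_score": float, "latency_ms": int},
    "web": {"results": list, "latency_ms": int},
    "admin": {"violations": list, "severity": str, "latency_ms": int},
    "llm": {"text": str, "tokens_used": int, "latency_ms": int},
}


def validate_output(tool: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Check required fields and types, returning only the schema fields."""
    schema = SCHEMAS[tool]
    missing = sorted(schema.keys() - payload.keys())
    if missing:
        raise ValueError(f"{tool} output is missing fields: {missing}")
    for field, expected in schema.items():
        value = payload[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"{tool}.{field} should be {expected.__name__}")
    return {field: payload[field] for field in schema}


print(validate_output("web", {"results": [], "latency_ms": 412, "debug": True}))
```

Dropping unknown keys (such as debug above) keeps downstream consumers from depending on fields that are not part of the contract.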

Cross-Encoder Re-ranking (Latest)

Two-Stage RAG Process:
- Initial vector search retrieves candidates
- Cross-encoder re-ranks top 10 results for accuracy
- Final filtering by threshold and limit
Model: Uses cross-encoder/ms-marco-MiniLM-L-6-v2 (very fast, production-ready)
Improved Relevance: The cross-encoder scores each query-document pair jointly, which typically orders results more accurately than vector similarity alone
Seamless Integration: Works transparently with existing RAG search API
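
The two-stage flow can be sketched with the scoring model passed in as a function, so the example runs without downloading a model. rerank and overlap_scorer are hypothetical names, and the word-overlap scorer is only a toy stand-in for a real cross-encoder.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def rerank(
    query: str,
    candidates: Sequence[str],
    score_pairs: Callable[[List[Pair]], Sequence[float]],
    top_k: int = 10,
    threshold: float = 0.0,
    limit: int = 3,
) -> List[Tuple[str, float]]:
    """Re-score the top_k vector-search candidates and keep the best ones."""
    head = list(candidates[:top_k])
    scores = score_pairs([(query, doc) for doc in head])
    ranked = sorted(zip(head, scores), key=lambda pair: pair[1], reverse=True)
    return [(doc, float(score)) for doc, score in ranked if score >= threshold][:limit]


def overlap_scorer(pairs: List[Pair]) -> List[float]:
    """Toy scorer: fraction of query words that appear in the document."""
    scores = []
    for query, doc in pairs:
        query_words = set(query.lower().split())
        doc_words = set(doc.lower().split())
        scores.append(len(query_words & doc_words) / max(len(query_words), 1))
    return scores


candidates = [
    "reset your password in settings",
    "holiday schedule",
    "password policy and reset rules",
]
for doc, score in rerank("password reset rules", candidates, overlap_scorer, threshold=0.5):
    print(f"{score:.2f}  {doc}")
```

With the sentence-transformers package installed, the predict method of CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2") accepts the same list of (query, passage) pairs and could be passed as score_pairs in place of the toy scorer.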

Context Engineering (Latest)

Anthropic-Inspired Strategies: Implements best practices from Anthropic's context engineering research:
- Compaction: High-fidelity summarization preserving architectural decisions, unresolved issues, and implementation details
- Tool Result Clearing: Safest form of compaction - removes large tool outputs once processed
- Structured Note-Taking: Tracks objectives (like Claude playing Pokémon), architectural decisions, and unresolved issues
- XML-Structured Prompts: All prompts use clear XML sections (<system>, <background_information>, <instructions>) for better model understanding
- Automatic Compression: Conversations compressed at 80% token threshold, targeting 60% after compression
- Just-in-Time Context: Selects only relevant memories and tools for each query
- Progressive Disclosure: Agents discover context incrementally through exploration
Benefits:
- Reduced token usage and costs
- Longer conversation support
- Better agent coherence across extended interactions
- Improved performance through structured context
Documentation: Context engineering features are integrated throughout the agent orchestrator and MCP server
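
The 80% trigger and 60% target can be sketched as below. This is a simplified illustration: the four-characters-per-token estimate is a rough assumption, compact is a hypothetical name, and the summary line is a placeholder where the real system would insert an LLM-written summary.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token); an assumption of this sketch."""
    return max(1, len(text) // 4)


def compact(
    messages: List[str],
    max_tokens: int,
    trigger: float = 0.8,
    target: float = 0.6,
) -> List[str]:
    """Drop the oldest messages once usage crosses the trigger ratio."""
    total = sum(estimate_tokens(m) for m in messages)
    if total < trigger * max_tokens:
        return list(messages)  # below the 80% threshold, nothing to do
    kept = list(messages)
    dropped = 0
    # Remove oldest messages until the remainder fits within the 60% target.
    while len(kept) > 1 and sum(estimate_tokens(m) for m in kept) > target * max_tokens:
        kept.pop(0)
        dropped += 1
    if dropped == 0:
        return kept
    # Placeholder for a high-fidelity summary of the removed messages.
    return [f"[summary of {dropped} earlier messages]"] + kept


history = ["x" * 400 for _ in range(10)]  # about 100 tokens each
compacted = compact(history, max_tokens=1000)
print(len(compacted), compacted[0])
```

Compressing to a target well below the trigger leaves headroom, so the next few turns do not immediately trigger another compaction.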

UI Improvements

Modern Drag-and-Drop: Intuitive file upload with visual feedback
Enhanced Status Messages: Clear success/error messages with icons
Refresh Button in Table: Quick refresh directly from the Rule Set section
Better Visual Hierarchy: Improved spacing, colors, and layout
Gradio UI Enhancements:
- AI metadata displayed after document ingestion
- Latency predictions shown in reasoning trace
- Context-aware routing hints visualized
- Tool output schemas displayed in debug view

Key Technical Features

Tenant Isolation & Normalization

Strict tenant isolation enforced at database level with WHERE tenant_id = ... filters
Automatic tenant ID normalization handles whitespace and formatting differences
Documents can be listed and deleted consistently across different tenant_id formats
All operations validate tenant ownership before execution
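
Tenant ID normalization could look like the sketch below. normalize_tenant_id is a hypothetical helper; trimming and collapsing whitespace follows the description above, while case folding is an assumption that the real implementation may not share.

```python
def normalize_tenant_id(raw: str) -> str:
    """Return a canonical tenant ID for storage and lookups."""
    # Trim the ends and collapse internal runs of whitespace.
    cleaned = " ".join(raw.split())
    if not cleaned:
        raise ValueError("tenant_id must not be empty")
    # Assumption: IDs are compared case-insensitively.
    return cleaned.lower()


print(normalize_tenant_id("  Acme-Corp \n"))  # → acme-corp
```

Applying the same function on both the write path and the read path is what keeps listing and deletion consistent across differently formatted IDs.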

RAG Search & Retrieval

Cross-Encoder Re-ranking: Two-stage retrieval process that improves result relevance:
- First: Vector search retrieves top candidates using embeddings
- Then: Cross-encoder model (cross-encoder/ms-marco-MiniLM-L-6-v2) re-ranks top 10 results
- Final: Results filtered by threshold and limit applied
Optimized similarity threshold (default 0.3) for better recall of relevant documents
Intelligent fallback returns top result even if below threshold to ensure knowledge base content is accessible
Pattern-based tool selection automatically triggers RAG for admin questions, fact lookups, and internal knowledge queries
Response unwrapping ensures seamless integration between MCP server and orchestrator
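
The similarity threshold and the top-result fallback described above can be sketched as a single filter. filter_hits and the hit dictionary shape are assumptions for illustration; the 0.3 default comes from this section.

```python
from typing import Dict, List


def filter_hits(hits: List[Dict], threshold: float = 0.3, limit: int = 5) -> List[Dict]:
    """Keep hits at or above the threshold, falling back to the single best hit."""
    ranked = sorted(hits, key=lambda hit: hit["score"], reverse=True)
    kept = [hit for hit in ranked if hit["score"] >= threshold][:limit]
    if not kept and ranked:
        # Fallback: surface the best match even though it is below the threshold.
        kept = [ranked[0]]
    return kept


weak_hits = [{"text": "office hours", "score": 0.12}, {"text": "admin contact", "score": 0.27}]
print(filter_hits(weak_hits))  # → [{'text': 'admin contact', 'score': 0.27}]
```

The fallback trades some precision for recall: a caller always gets the closest knowledge base entry when one exists, and can inspect the score to decide how much to trust it.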

MCP Server Architecture

Unified server running on a single port (default 8900) for all namespaced tools
Dual protocol support: Both MCP protocol (POST with JSON) and RESTful HTTP (GET/DELETE)
Response wrapping: Standardized response format with automatic unwrapping in clients
Error handling: Comprehensive error responses with detailed messages for debugging

UI Features

Knowledge Base Library

Visual Statistics: Real-time document counts and type distribution
Interactive Charts: Plotly pie charts for document type visualization
Advanced Search: Semantic search across all ingested documents with relevance scoring
Smart Filtering: Filter by document type (text, PDF, FAQ, link)
Bulk Operations: Delete individual documents or all documents at once
Auto-refresh: Lists automatically update after operations

Admin Analytics Dashboard

Statistics Cards: Key metrics displayed in visually appealing cards with icons
Tool Usage Visualization: Bar charts showing tool invocation counts and performance
Latency Metrics: Visual representation of tool response times
RAG Quality Analysis: Charts displaying search quality metrics (hits, scores, recall)
Detailed Tables: Comprehensive tool usage breakdown with success/error rates
Dark Theme: Modern UI with dark background and white text for better readability
Real-time Updates: Fetch latest analytics data with a single click

Acknowledgments

Built with Model Context Protocol (MCP)
Powered by Gradio for the interface
Visualizations created with Plotly
Backend built with FastAPI
Analytics and governance features inspired by enterprise AI platform requirements

Made with ❤️ for the MCP Hackathon

IntegraChat: Enterprise-Grade MCP Autonomous Agent Platform
