| title: IntegraChat | |
| emoji: 🤖 | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: gradio | |
| sdk_version: "4.20.0" | |
| app_file: app.py | |
| pinned: false | |
| # IntegraChat — Enterprise MCP Autonomous Agent Platform | |
| **Track:** MCP in Action | |
| **Category:** Enterprise | |
| **Tag:** `mcp-in-action-track-enterprise` | |
| --- | |
| ## 📋 Table of Contents | |
| - [Overview](#overview) | |
| - [Quick Start](#quick-start) | |
| - [Features](#features) | |
| - [Conversation Memory System](#conversation-memory-system) | |
| - [Role-Based Access Control (RBAC)](#role-based-access-control-rbac) | |
| - [Installation & Setup](#installation--setup) | |
| - [Usage](#usage) | |
| - [API Endpoints](#api-endpoints) | |
| - [Architecture](#architecture) | |
| - [Supabase Setup & Migration](#supabase-setup--migration) | |
| - [Troubleshooting](#troubleshooting) | |
| - [Testing & Diagnostics](#testing--diagnostics) | |
| - [Technical Stack](#technical-stack) | |
| - [License](#license) | |
| --- | |
| ## Overview | |
| **IntegraChat** is an enterprise-grade, multi-tenant AI platform that demonstrates the full capabilities of the **Model Context Protocol (MCP)** in a production-style environment. Built with enterprise governance and observability in mind, IntegraChat combines autonomous tool-using agents, RAG retrieval, live web search, and admin compliance under strict tenant isolation. | |
| This platform showcases how MCP can power intelligent, governed, multi-tenant AI systems with real-time analytics, regex-based red-flag detection, and comprehensive tool orchestration. | |
| --- | |
| ## 🚀 Quick Start | |
| ### Windows Users | |
| ```bash | |
| # 1. Install dependencies | |
| pip install -r requirements.txt | |
| # 2. Configure environment (copy and edit .env) | |
| copy env.example .env | |
| # Edit .env with your credentials (Supabase, LLM, etc.) | |
| # 3. Start all services | |
| start.bat | |
| ``` | |
| ### Manual Setup | |
| ```bash | |
| # 1. Install dependencies | |
| pip install -r requirements.txt | |
| # 2. Configure environment | |
| cp env.example .env | |
| # Edit .env with your credentials | |
| # 3. Start FastAPI backend (Terminal 1) | |
| uvicorn backend.api.main:app --port 8000 --reload | |
| # 4. Start unified MCP server (Terminal 2) | |
| python backend/mcp_server/server.py | |
| # 5. Start Gradio UI (Terminal 3) | |
| python app.py | |
| ``` | |
| Then access: | |
| - **Gradio UI**: `http://localhost:7860` | |
| - **FastAPI Docs**: `http://localhost:8000/docs` | |
| > **Security Note:** REST requests that hit protected endpoints must include both `x-tenant-id` and `x-user-role` headers. Roles (`viewer`, `editor`, `admin`, `owner`) determine which actions—such as document ingestion, rule uploads, or analytics access—the caller may perform. | |
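As a rough illustration of the note above, the following Python sketch builds the two required headers and attaches them to a request object without sending it. The helper names `build_headers` and `build_rule_request` are hypothetical and not part of IntegraChat; the endpoint and payload shape are borrowed from the `curl` example in the RBAC section.

```python
import json
import urllib.request

# Roles listed in the security note above.
VALID_ROLES = ("viewer", "editor", "admin", "owner")


def build_headers(tenant_id: str, role: str = "viewer") -> dict:
    """Return the headers a protected endpoint expects (hypothetical helper)."""
    if not tenant_id:
        raise ValueError("tenant_id must be a non-empty string")
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role {role!r}; expected one of {VALID_ROLES}")
    return {
        "Content-Type": "application/json",
        "x-tenant-id": tenant_id,
        "x-user-role": role,
    }


def build_rule_request(base_url: str, tenant_id: str, role: str, rule: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /admin/rules."""
    body = json.dumps({"rule": rule}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/admin/rules",
        data=body,
        headers=build_headers(tenant_id, role),
        method="POST",
    )


req = build_rule_request("http://localhost:8000", "tenant123", "admin", "Do not share passwords")
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it once the backend is running. Validating the role on the client side only catches typos early; the backend remains the place where permissions are enforced.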
| --- | |
| ## Features | |
| ### Core Capabilities | |
| - 🤖 **Autonomous Multi-Step MCP Agents** – Intelligent tool-aware agent that plans and executes multi-step workflows across RAG, Web, Admin, and LLM tools with short-term conversation memory | |
| - 💭 **Short-Term Conversation Memory** – Automatic memory system that stores the last N tool outputs per session with configurable expiration (defaults: 10 outputs, 15-minute TTL). Memory is keyed by `session_id` (not `tenant_id`) for safety, which improves context awareness in multi-step workflows. It is injected automatically into tool payloads and cleared when the session ends. | |
| - 📚 **Enhanced Knowledge Base Management** – Upload raw text, URLs, or documents (PDF/DOCX/TXT/MD) with rich metadata (source URL, timestamp, document type) and optimized chunking (400-600 tokens) | |
| - 🤖 **AI-Generated KB Metadata** – Automatic extraction of title, summary, tags, topics, date, and quality score during document ingestion. Extraction is LLM-powered; when the LLM is unavailable or times out, a fallback based on keyword extraction and pattern matching still produces usable metadata | |
| - 🔍 **Optimized RAG Search with Cross-Encoder Re-ranking** – Two-stage retrieval: an initial vector search, then cross-encoder re-ranking of the top candidates with `cross-encoder/ms-marco-MiniLM-L-6-v2` to improve ranking accuracy. The similarity threshold is configurable (default 0.3); lower values favor recall | |
| - ⚡ **Per-Tool Latency Prediction** – Agent estimates expected latency before choosing tools (RAG: 60-120ms, Web: 400-1800ms, Admin: <20ms) to optimize tool selection and choose the fastest path | |
| - 🧠 **Context-Aware MCP Routing** – Tool selection depends on previous outputs: the agent skips web search when RAG returns a high score (≥0.8), skips agent reasoning for critical admin violations, and skips RAG when relevant memory is already available, which avoids redundant tool calls | |
| - 📋 **Tool Output Schemas** – Every tool returns output that conforms to a strict JSON schema, which simplifies debugging and downstream reasoning. Outputs are validated and formatted automatically | |
| - 🗑️ **Document Management** – Delete individual documents or bulk delete all documents for a tenant with confirmation dialogs | |
| - 🛡️ **Enterprise Admin Governance** – Advanced rule management system with: | |
| - Regex-based red-flag pattern matching with severity levels (low/medium/high/critical) | |
| - Automatic admin alerts for violations | |
| - **LLM-Enhanced Rules**: Rules are automatically analyzed and enhanced to identify edge cases, improve regex patterns, and suggest appropriate severity levels | |
| - **LLM-Guided Rule Explanations**: Automatic generation of human-readable explanations, concrete examples, and missing pattern suggestions. Includes intelligent fallback when LLM is unavailable - uses keyword extraction to provide useful explanations even during timeouts | |
| - **File Upload Support**: Upload rules from TXT, PDF, DOC, or DOCX files with drag-and-drop interface | |
| - **Chunk Processing**: Large rule sets processed in manageable chunks (5 rules at a time) to prevent timeouts | |
| - **Rule-Based Behavior Control**: Rules checked FIRST - brief response rules return quick answers, blocking rules prevent requests | |
| - **Comment Filtering**: Comment lines (starting with #) automatically ignored when uploading rules | |
| - **Supabase Integration**: Rules stored in Supabase for production scalability (with SQLite fallback) | |
| - 📊 **Comprehensive Analytics & Observability** – Full tenant-level analytics logging with Supabase backend (SQLite fallback for local dev): | |
| - Tool usage breakdown (RAG, Web, Admin, LLM) with latency and token tracking | |
| - RAG recall/precision indicators (average hits, scores, top scores) | |
| - Per-tenant query volume and active users | |
| - Red-flag violations with timestamps and confidence scores | |
| - LLM token logs and latency metrics | |
| - **Real-Time Visualizations**: Reasoning path visualizer, tool invocation timeline, and tenant activity heatmap | |
| - 🌐 **Live Web Search** – Google Programmable Search (Custom Search API) with tenant-aware MCP tooling | |
| - 🏢 **Multi-Tenant Isolation** – Complete tenant isolation with centralized tenant ID management; backend enforces strict isolation for chat, ingestion, and admin ops | |
| - 🔐 **Fine-Grained Role-Based Access Control (RBAC)** – Four-tier role system (viewer, editor, admin, owner) with backend permission enforcement | |
| - 🔄 **Intelligent Multi-Tool Orchestration** – MCP agent orchestrator autonomously selects optimal tool chains (RAG + Web + LLM, etc.) based on query intent, context, latency predictions, and previous tool outputs. Context-aware routing enables sophisticated tool skipping for efficiency | |
| - ⚡ **Robust Error Handling** – Structured error responses, retry mechanisms, and graceful fallbacks (e.g., if RAG fails → fallback to LLM-only) | |
| - 📡 **Streaming Responses** – Chat responses stream word-by-word using Server-Sent Events (SSE) for a real-time user experience | |
| - 🎯 **Rule-First Processing** – Admin rules checked before intent classification - rules can trigger brief responses or block requests entirely | |
| - 🧠 **Advanced Context Engineering** – Implements Anthropic's context engineering strategies: | |
| - **High-Fidelity Compaction**: Automatically compresses conversations at 80% token threshold, preserving architectural decisions and unresolved issues | |
| - **Tool Result Clearing**: Safest form of compaction - removes large tool outputs while keeping metadata | |
| - **Structured Note-Taking**: Tracks objectives, architectural decisions, and unresolved issues outside context window | |
| - **XML-Structured Prompts**: All prompts use clear XML sections for better model understanding | |
| - **Just-in-Time Context Loading**: Selects only relevant memories and tools for each query | |
| - **Progressive Disclosure**: Agents discover context incrementally through exploration | |
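The context-aware routing rules in the list above can be sketched in a few lines of Python. This is only an illustration of the three skip conditions as they are described there; the `RoutingContext` structure, the `plan_tools` function, and the fixed tool order are assumptions, not the project's actual orchestrator.

```python
from dataclasses import dataclass
from typing import List, Optional

RAG_SKIP_WEB_SCORE = 0.8  # threshold quoted in the feature list


@dataclass
class RoutingContext:
    """What the agent knows so far (hypothetical structure)."""
    admin_severity: Optional[str] = None   # e.g. "critical" after the rules check
    memory_is_relevant: bool = False       # session memory already covers the query
    rag_top_score: Optional[float] = None  # best RAG score, if RAG has run


def plan_tools(ctx: RoutingContext) -> List[str]:
    """Return the tools still worth calling, in order."""
    if ctx.admin_severity == "critical":
        return ["admin"]  # critical violation: skip agent reasoning entirely
    tools = ["admin"]
    if not ctx.memory_is_relevant:
        tools.append("rag")  # no usable memory, so retrieval is needed
    if ctx.rag_top_score is None or ctx.rag_top_score < RAG_SKIP_WEB_SCORE:
        tools.append("web")  # RAG was weak or has not run yet
    tools.append("llm")
    return tools


print(plan_tools(RoutingContext(rag_top_score=0.85)))
```

Keeping the rules in one pure function makes each skip decision easy to unit-test without starting any tool servers.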
| ### Enterprise Features | |
| - 🔍 **Regex-Based Red-Flag Detection** – Support for complex regex patterns with keyword fallback and semantic scoring | |
| - 🤖 **LLM-Enhanced Rule Management** – Rules automatically enhanced by LLM to identify edge cases, improve patterns, and suggest severity levels. Includes intelligent fallback explanations when LLM is unavailable - uses keyword extraction to generate useful explanations, examples, and pattern suggestions even during timeouts | |
| - 📄 **File Upload & Drag-and-Drop** – Upload rules from files (TXT, PDF, DOC, DOCX) with intuitive drag-and-drop interface | |
| - ⚡ **Chunk-Wise Processing** – Large rule sets processed in chunks to prevent timeouts and ensure reliable processing | |
| - 📈 **Real-Time Analytics Dashboard** – Per-tenant analytics with configurable time windows (7, 30, 90 days) | |
| - 🛠️ **Admin API Endpoints** – `/admin/violations`, `/admin/tools/logs`, `/admin/tenants` for comprehensive governance | |
| - 🧠 **Agent Debug & Planning** – `/agent/debug` and `/agent/plan` endpoints for observability and tool selection inspection | |
| - 💾 **Persistent Analytics Storage** – Supabase-backed analytics store (with automatic SQLite fallback) for fast, multi-tenant queries | |
| - 🗄️ **Supabase Integration** – Production-ready Supabase support for admin rules with automatic table creation | |
| - 📈 **Real-Time Visualization Components** – Interactive visualizations for agent reasoning, tool execution, and tenant activity: | |
| - **Reasoning Path Visualizer**: Step-by-step visualization of agent decision-making with animated progression | |
| - **Tool Invocation Timeline**: Visual timeline showing tool execution order, latency, and result counts | |
| - **Tenant Activity Heatmap**: Query activity heatmap and per-tool usage trends over time | |
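To make the red-flag detection idea more concrete, here is a minimal Python sketch of regex matching with a keyword fallback and severity ordering. The `Rule` structure and `check_message` function are hypothetical names chosen for illustration, and the semantic scoring step mentioned above is left out.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Severity levels named in the feature list, lowest to highest.
SEVERITY_ORDER = ("low", "medium", "high", "critical")


@dataclass
class Rule:
    description: str
    pattern: Optional[str] = None   # regex, if the rule has one
    keywords: Tuple[str, ...] = ()  # plain-text fallback terms
    severity: str = "medium"


def check_message(message: str, rules: List[Rule]) -> List[Rule]:
    """Return the rules a message violates, most severe first."""
    lowered = message.lower()
    hits = []
    for rule in rules:
        matched = False
        if rule.pattern:
            try:
                matched = re.search(rule.pattern, message, re.IGNORECASE) is not None
            except re.error:
                matched = False  # invalid regex: fall through to keywords
        if not matched and rule.keywords:
            matched = any(k.lower() in lowered for k in rule.keywords)
        if matched:
            hits.append(rule)
    return sorted(hits, key=lambda r: SEVERITY_ORDER.index(r.severity), reverse=True)


rules = [
    Rule("No passwords", pattern=r"\bpassword\s*[:=]", severity="high"),
    Rule("No SSNs", pattern="(", keywords=("ssn",), severity="critical"),
]
for hit in check_message("my SSN and password: hunter2", rules):
    print(hit.severity, hit.description)
```

The second rule deliberately carries a malformed pattern to show the keyword fallback taking over when the regex cannot be compiled.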
| ### Conversation Memory System | |
| IntegraChat includes a **short-term conversation memory** system that enhances multi-step workflows by maintaining context across tool calls: | |
| - **Automatic Storage**: Every tool output is automatically stored in memory for the session | |
| - **Bounded Size**: Keeps only the last N tool outputs (configurable via `MCP_MEMORY_MAX_ITEMS`, default: 10) | |
| - **Auto-Expiration**: Entries automatically expire after a configurable TTL (via `MCP_MEMORY_TTL_SECONDS`, default: 900 seconds / 15 minutes) | |
| - **Session-Based**: Memory is keyed by `session_id` (not `tenant_id`) for safety and isolation | |
| - **Automatic Injection**: Recent memory is automatically injected into tool payloads as a `memory` field for multi-step workflows | |
| - **Session Clearing**: Memory can be explicitly cleared by sending `end_session: true` or `endSession: true` in the payload | |
| **Usage Example:** | |
| ```json | |
| { | |
| "tenant_id": "acme", | |
| "session_id": "chat-abc-123", | |
| "query": "Search for X" | |
| } | |
| ``` | |
| Subsequent tool calls with the same `session_id` will receive a `memory` field containing recent tool outputs, enabling tools to make context-aware decisions in multi-step workflows. | |
| **Configuration:** | |
| - `MCP_MEMORY_MAX_ITEMS`: Maximum number of tool outputs to keep per session (default: 10) | |
| - `MCP_MEMORY_TTL_SECONDS`: Time-to-live for memory entries in seconds (default: 900) | |
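The behavior described in this section (bounded size, TTL expiry, per-session keys, explicit clearing) could be sketched as follows. This is an in-process illustration under stated assumptions, not the project's implementation: the `SessionMemory` class and its method names are hypothetical, and only the two environment variable names and their defaults come from the text above.

```python
import os
import time
from collections import deque
from typing import Callable, Dict, List


class SessionMemory:
    """Keeps the last N tool outputs per session and drops expired ones."""

    def __init__(self, max_items: int = None, ttl_seconds: float = None,
                 clock: Callable[[], float] = time.monotonic):
        # Defaults mirror MCP_MEMORY_MAX_ITEMS=10 and MCP_MEMORY_TTL_SECONDS=900.
        self.max_items = max_items if max_items is not None else int(os.environ.get("MCP_MEMORY_MAX_ITEMS", "10"))
        self.ttl_seconds = ttl_seconds if ttl_seconds is not None else float(os.environ.get("MCP_MEMORY_TTL_SECONDS", "900"))
        self._clock = clock
        self._sessions: Dict[str, deque] = {}

    def add(self, session_id: str, tool: str, output: str) -> None:
        # deque(maxlen=...) discards the oldest entry once the bound is reached.
        entries = self._sessions.setdefault(session_id, deque(maxlen=self.max_items))
        entries.append((self._clock(), tool, output))

    def recent(self, session_id: str) -> List[dict]:
        """Return unexpired entries, oldest first, ready to inject as `memory`."""
        cutoff = self._clock() - self.ttl_seconds
        entries = self._sessions.get(session_id, ())
        return [{"tool": t, "output": o} for ts, t, o in entries if ts >= cutoff]

    def clear(self, session_id: str) -> None:
        """Drop a session, e.g. when the payload carries end_session: true."""
        self._sessions.pop(session_id, None)


memory = SessionMemory()
memory.add("chat-abc-123", "rag", "refund policy chunk")
print(memory.recent("chat-abc-123"))
```

The injectable `clock` argument exists only so expiry can be tested without sleeping.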
| --- | |
| ## Role-Based Access Control (RBAC) | |
| IntegraChat implements fine-grained role-based access control (RBAC) for backend API endpoints. This ensures that users can only access features appropriate for their role level. | |
| ### Roles | |
| The system supports four roles with increasing privileges: | |
| 1. **viewer** (default) - Basic read-only access | |
| - ✅ Can use chat functionality | |
| - ✅ Can view analytics | |
| - ❌ Cannot ingest documents | |
| - ❌ Cannot delete documents | |
| - ❌ Cannot manage admin rules | |
| 2. **editor** - Content management access | |
| - ✅ Can use chat functionality | |
| - ✅ Can ingest documents (upload, paste, URLs, files) | |
| - ✅ Can view analytics | |
| - ❌ Cannot delete documents | |
| - ❌ Cannot manage admin rules | |
| 3. **admin** - Administrative access | |
| - ✅ Can use chat functionality | |
| - ✅ Can ingest documents | |
| - ✅ Can delete documents | |
| - ✅ Can view analytics | |
| - ✅ Can manage admin rules | |
| 4. **owner** - Full system access | |
| - Same permissions as admin (highest privilege level) | |
| ### Permission Matrix | |
| | Action | viewer | editor | admin | owner | | |
| |--------|--------|--------|-------|-------| | |
| | Chat Bot | ✅ | ✅ | ✅ | ✅ | | |
| | Ingest Documents | ❌ | ✅ | ✅ | ✅ | | |
| | Delete Documents | ❌ | ❌ | ✅ | ✅ | | |
| | View Analytics | ✅ | ✅ | ✅ | ✅ | | |
| | Manage Rules | ❌ | ❌ | ✅ | ✅ | | |
| ### Backend RBAC | |
| Backend API endpoints enforce RBAC through the `x-user-role` header: | |
| ```python | |
| # Permission matrix in backend/mcp_server/common/access_control.py | |
| PERMISSIONS = { | |
| "manage_rules": {"owner", "admin"}, | |
| "ingest_documents": {"owner", "admin", "editor"}, | |
| "delete_documents": {"owner", "admin"}, | |
| "view_analytics": {"owner", "admin"}, | |
| } | |
| ``` | |
| **Protected Endpoints:** | |
| - `/admin/rules` - Requires `admin` or `owner` role | |
| - `/rag/ingest*` - Requires `editor`, `admin`, or `owner` role | |
| - `/rag/delete*` - Requires `admin` or `owner` role | |
| - `/analytics/*` - All roles can view (viewer, editor, admin, owner) | |
| **Role Propagation:** | |
| The user role is automatically propagated through the entire request pipeline: | |
| 1. Client sends `x-user-role` header | |
| 2. Backend API route receives and validates role | |
| 3. Role is passed to service layer (`process_ingestion()`, etc.) | |
| 4. Service layer passes role to MCP clients | |
| 5. MCP clients include role in payload to MCP server | |
| 6. MCP server extracts role and enforces permissions | |
| **Example Request:** | |
| ```bash | |
| curl -X POST "http://localhost:8000/admin/rules" \ | |
| -H "Content-Type: application/json" \ | |
| -H "x-tenant-id: tenant123" \ | |
| -H "x-user-role: admin" \ | |
| -d '{"rule": "Do not share passwords"}' | |
| ``` | |
| If the role lacks permission, the API returns `403 Forbidden` with a descriptive error message that includes: | |
| - Which role was used | |
| - Which roles are allowed for the action | |
| - Instructions to change role in the UI | |
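A permission check that produces this kind of response might look like the sketch below. It reuses three entries from the `PERMISSIONS` snippet above; the `check_permission` function, its return shape, and the exact wording of the message are assumptions made for illustration.

```python
from typing import Optional, Tuple

# Subset of the permission matrix shown earlier in this section.
PERMISSIONS = {
    "manage_rules": {"owner", "admin"},
    "ingest_documents": {"owner", "admin", "editor"},
    "delete_documents": {"owner", "admin"},
}


def check_permission(role: Optional[str], action: str) -> Tuple[int, str]:
    """Return (status_code, detail) for a role attempting an action."""
    allowed = PERMISSIONS.get(action)
    if allowed is None:
        raise KeyError(f"unknown action: {action}")
    # A missing header falls back to the default role, viewer.
    role = (role or "viewer").strip().lower()
    if role in allowed:
        return 200, "ok"
    detail = (
        f"Role '{role}' is not allowed to {action}. "
        f"Allowed roles: {', '.join(sorted(allowed))}. "
        "Change your role in the UI and retry."
    )
    return 403, detail


print(check_permission("viewer", "ingest_documents"))
```

Sorting the allowed roles keeps the error text stable, which is convenient when asserting on it in tests.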
| ### Using RBAC | |
| 1. **Set Role**: Include `x-user-role` header in API requests with one of: `viewer`, `editor`, `admin`, or `owner` | |
| 2. **Verify Permissions**: Backend enforces role-based access automatically | |
| 3. **Error Handling**: API returns `403 Forbidden` with clear error messages when role lacks required permissions | |
| --- | |
| ## Real-Time Visualization Features | |
| IntegraChat includes three powerful visualization components that provide real-time insights into agent behavior and system activity: | |
| ### 1. Reasoning Path Visualizer | |
| - **What it shows**: Step-by-step visualization of how the agent makes decisions | |
| - **Features**: | |
| - Animated progression through reasoning steps | |
| - Status indicators (pending, running, completed, error) | |
| - Detailed metrics per step (latency, hit counts, token estimates) | |
| - Visual icons for each step type | |
| - **Where to find it**: | |
| - Gradio app: Debug & Reasoning tab | |
| - **Data source**: `reasoning_trace` from agent responses | |
| ### 2. Tool Invocation Timeline | |
| - **What it shows**: Visual timeline of all tool executions during an agent interaction | |
| - **Features**: | |
| - Color-coded bars showing tool status (success/error) | |
| - Latency visualization per tool | |
| - Result count badges | |
| - Summary statistics (total tools, total time, average latency) | |
| - **Where to find it**: | |
| - Gradio app: Debug & Reasoning tab | |
| - **Data source**: `tool_traces` from agent responses | |
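The summary statistics shown under the timeline can be derived from the trace list with a small reduction. The sketch below assumes each entry in `tool_traces` is a dictionary with `tool`, `latency_ms`, and `status` keys; that shape is a guess for illustration, since the trace schema is not documented here.

```python
from typing import Dict, List


def summarize_traces(tool_traces: List[dict]) -> Dict[str, float]:
    """Compute total tools, total time, average latency, and error count."""
    total = len(tool_traces)
    total_ms = sum(t.get("latency_ms", 0) for t in tool_traces)
    errors = sum(1 for t in tool_traces if t.get("status") == "error")
    return {
        "total_tools": total,
        "total_time_ms": total_ms,
        "average_latency_ms": total_ms / total if total else 0.0,
        "errors": errors,
    }


traces = [
    {"tool": "admin", "latency_ms": 10, "status": "success"},
    {"tool": "rag", "latency_ms": 90, "status": "success"},
    {"tool": "web", "latency_ms": 1100, "status": "error"},
]
print(summarize_traces(traces))
```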
| ### 3. Tenant Activity Heatmap | |
| - **What it shows**: Query activity patterns and tool usage trends over time | |
| - **Features**: | |
| - Hour-by-hour, day-by-day activity heatmap | |
| - Color intensity based on activity level | |
| - Per-tool usage trends with bar charts | |
| - Trend indicators (up/down/stable) | |
| - **Where to find it**: | |
| - Gradio app: Admin Analytics tab | |
| - Configurable time window (default: 7 days) | |
| - **Data source**: `/analytics/activity` and `/analytics/tool-usage` endpoints | |
| **Access**: All visualization features are available to all roles (viewer, editor, admin, owner). | |
| --- | |
| ## Installation & Setup | |
| ### Prerequisites | |
| - **Python 3.10+** with pip | |
| - **PostgreSQL** (with pgvector extension) or **Supabase** for RAG storage | |
| - **Supabase** (recommended) or SQLite for admin rules and analytics | |
| - **Ollama** (local) or **Groq API** credentials for LLM | |
| - **Google Custom Search API** (optional, for web search): | |
| - Enable Custom Search API in [Google Cloud Console](https://console.cloud.google.com/) | |
| - Create API key → set as `GOOGLE_SEARCH_API_KEY` in `.env` | |
| - Create Programmable Search Engine → set ID as `GOOGLE_SEARCH_CX_ID` in `.env` | |
| ### Step-by-Step Installation | |
| 1. **Clone and navigate to the project**: | |
| ```bash | |
| cd IntegraChat | |
| ``` | |
| 2. **Create and activate virtual environment** (recommended): | |
| ```bash | |
| # Windows | |
| python -m venv venv | |
| venv\Scripts\activate | |
| # Linux/Mac | |
| python3 -m venv venv | |
| source venv/bin/activate | |
| ``` | |
| 3. **Install Python dependencies**: | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| 4. **Configure environment variables**: | |
| ```bash | |
| cp env.example .env | |
| # Edit .env with your credentials: | |
| # - SUPABASE_URL and SUPABASE_SERVICE_KEY (for production storage) | |
| # - POSTGRESQL_URL (for RAG vector database) | |
| # - OLLAMA_URL/OLLAMA_MODEL or GROQ_API_KEY (for LLM) | |
| # - GOOGLE_SEARCH_API_KEY and GOOGLE_SEARCH_CX_ID (optional, for web search) | |
| ``` | |
| 5. **Set up Supabase** (recommended for production): | |
| - Create a Supabase project at [supabase.com](https://supabase.com) | |
| - Run `supabase_admin_rules_table.sql` in Supabase SQL Editor | |
| - Run `supabase_analytics_tables.sql` in Supabase SQL Editor | |
| - Copy your project URL and service role key to `.env` | |
| - Verify setup: `python verify_supabase_setup.py` | |
| 6. **Start the services**: | |
| **Option A: Windows Quick Start** (recommended for Windows): | |
| ```bash | |
| start.bat | |
| ``` | |
| This automatically starts: | |
| - FastAPI backend on port 8000 | |
| - Unified MCP server on port 8900 | |
| **Option B: Manual Start**: | |
| ```bash | |
| # Terminal 1: FastAPI backend | |
| uvicorn backend.api.main:app --port 8000 --reload | |
| # Terminal 2: Unified MCP server | |
| python backend/mcp_server/server.py | |
| ``` | |
| 7. **Launch the UI**: | |
| **Gradio Interface** (full-featured): | |
| ```bash | |
| python app.py | |
| ``` | |
| Access at `http://localhost:7860` | |
| ## Usage | |
| ### Gradio Interface (`app.py`) | |
| The Gradio UI provides a comprehensive interface with five main tabs: | |
| #### 1. **Chat** 💬 | |
| - Enter your Tenant ID and start chatting with the MCP-powered agent | |
| - Real-time streaming responses (word-by-word using SSE) | |
| - Autonomous tool orchestration (RAG, Web, Admin, LLM) | |
| - Multi-step planning with memory of previous tool outputs | |
| #### 2. **Document Ingestion** 📚 | |
| - **Raw Text**: Paste text directly | |
| - **URL**: Ingest content from web URLs | |
| - **File Upload**: Upload PDF, DOCX, TXT, or Markdown files | |
| - Rich metadata support (filename, URL, document ID, custom JSON) | |
| - View and manage ingested documents | |
| #### 3. **Knowledge Base Library** 📖 | |
| - **Statistics Dashboard**: Visual cards showing document counts by type | |
| - **Interactive Charts**: Plotly pie chart for document type distribution | |
| - **Semantic Search**: Search knowledge base with relevance scoring | |
| - **Type Filtering**: Filter by document type (text, PDF, FAQ, link) | |
| - **Document Management**: View, preview, and delete documents | |
| - **Auto-refresh**: Lists update automatically after operations | |
| #### 4. **Admin Analytics** 📊 | |
| - **Statistics Cards**: Total queries, active users, red flags, RAG searches | |
| - **Interactive Bar Charts**: | |
| - Tool Usage Count (RAG, Web, Admin, LLM) | |
| - Average Tool Latency (performance metrics) | |
| - RAG Quality Metrics (hits, scores, recall indicators) | |
| - **Tool Usage Table**: Detailed performance breakdown | |
| - **Formatted Summary**: Key metrics in easy-to-read format | |
| - Click "🔄 Fetch Analytics Snapshot" to load latest data | |
| #### 5. **Admin Rules & Compliance** 🛡️ | |
| - **Text Input**: Paste rules one per line (comments starting with # are ignored) | |
| - **File Upload**: Upload rules from TXT, PDF, DOC, or DOCX files | |
| - **LLM Enhancement**: Automatic rule enhancement (edge cases, pattern improvements, severity suggestions) | |
| - **Chunk Processing**: Large rule sets processed in chunks (5 at a time) | |
| - **Rule-Based Behavior**: Rules checked FIRST - brief responses or blocking based on severity | |
| - **Streaming Responses**: Real-time word-by-word streaming | |
| - **Refresh Button**: Update rules table directly | |
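The text-input behavior above (one rule per line, `#` comments ignored, chunks of five) can be approximated with two short helpers. The names `parse_rules` and `chunked` are hypothetical; the real upload path may normalize rules differently.

```python
from typing import Iterator, List


def parse_rules(text: str) -> List[str]:
    """Split pasted text into rules, skipping blank lines and # comments."""
    rules = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        rules.append(stripped)
    return rules


def chunked(items: List[str], size: int = 5) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


pasted = """# Security rules
Do not share passwords
Do not share API keys

# Privacy rules
Do not reveal customer emails
"""
for batch in chunked(parse_rules(pasted)):
    print(batch)
```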
| > **💡 Tip:** Every action requires a Tenant ID. The Tenant ID persists across page refreshes and is managed centrally. | |
| --- | |
| ## API Endpoints | |
| All endpoints are served by the FastAPI backend at `http://localhost:8000`. Most endpoints require the `x-tenant-id` header for tenant isolation. | |
| > **📖 API Documentation**: Interactive Swagger docs available at `http://localhost:8000/docs` when the backend is running. | |
| ### Agent Endpoints | |
| | Method | Endpoint | Description | | |
| | --- | --- | --- | | |
| | `POST` | `/agent/message` | Main chat endpoint with `tenant_id`, `message`, optional history | | |
| | `POST` | `/agent/message/stream` | Streaming chat endpoint using Server-Sent Events (SSE). Returns tokens word-by-word | | |
| | `POST` | `/agent/debug` | Detailed debugging info: reasoning trace, tool selection, intent classification | | |
| | `POST` | `/agent/plan` | Tool selection plan without execution (intent, tool scores, planned steps) | | |
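A client of `/agent/message/stream` has to decode the Server-Sent Events framing before it can display text. The payload format of each event is not specified in this README, so the sketch below handles only generic SSE framing (`data:` lines, blank-line event boundaries, `:` comments) and treats the payload as an opaque string. The function name `iter_sse_data` is hypothetical.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An event that is never terminated by a blank line is discarded.


stream = ["data: Hello\n", "\n", ": keep-alive\n", "data: world\n", "\n"]
print(" ".join(iter_sse_data(stream)))
```

Any iterable of decoded lines works as input, so the same function can be fed from an HTTP response object or from a recorded fixture in tests.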
| ### RAG Endpoints | |
| | Method | Endpoint | Description | | |
| | --- | --- | --- | | |
| | `POST` | `/rag/ingest-document` | Ingest document with `source_type`, `content`, metadata. Supports raw text, URLs, PDFs, DOCX, TXT, Markdown | | |
| | `POST` | `/rag/ingest-file` | Multipart file upload (PDF/DOCX/TXT/MD) with `x-tenant-id` header | | |
| | `GET` | `/rag/list?tenant_id={id}&limit={n}&offset={n}` | List all documents for a tenant with pagination | | |
| | `DELETE` | `/rag/delete/{document_id}?tenant_id={id}` | Delete a specific document by ID | | |
| | `DELETE` | `/rag/delete-all?tenant_id={id}` | Delete all documents for a tenant | | |
| **Note:** RAG endpoints support both `x-tenant-id` header and `tenant_id` query parameter. | |
| ### Admin & Governance Endpoints | |
| | Method | Endpoint | Description | | |
| | --- | --- | --- | | |
| | `GET` | `/admin/rules?detailed=true` | Get all rules (use `detailed=true` for regex/severity metadata) | | |
| | `POST` | `/admin/rules?enhance=true` | Add single rule with optional `pattern` (regex), `severity`, `description`. Set `enhance=true` for LLM enhancement | | |
| | `POST` | `/admin/rules/bulk?enhance=true` | Add multiple rules at once (processed in chunks of 5). LLM enhancement applied automatically | | |
| | `POST` | `/admin/rules/upload-file?enhance=true` | Upload rules from file (TXT, PDF, DOC, DOCX). Text extracted server-side | | |
| | `DELETE` | `/admin/rules/{rule}` | Delete a specific rule | | |
| | `GET` | `/admin/violations?days=30&limit=50` | Get red-flag violations with timestamps and confidence scores | | |
| | `GET` | `/admin/tools/logs?tool_name=rag&days=7` | Get detailed tool usage logs with latency and token counts | | |
| | `GET/POST/DELETE` | `/admin/tenants` | Tenant management endpoints | | |
| | `POST` | `/admin/setup/table` | Create admin_rules table in Supabase if it doesn't exist | | |
| ### Analytics Endpoints | |
| | Method | Endpoint | Description | | |
| | --- | --- | --- | | |
| | `GET` | `/analytics/overview?days=30` | Comprehensive analytics: total queries, tool usage, red-flag count, RAG quality | | |
| | `GET` | `/analytics/tool-usage?days=30` | Detailed tool usage stats: counts, latency, tokens, success/error rates | | |
| | `GET` | `/analytics/redflags?limit=50&days=30` | Recent red-flag violations for tenant | | |
| | `GET` | `/analytics/activity?days=30` | Tenant activity summary: queries, active users, last query timestamp | | |
| | `GET` | `/analytics/rag-quality?days=30` | RAG quality metrics: avg hits, scores, latency (recall/precision indicators) | | |
| ### Visualization Features | |
| IntegraChat includes three powerful visualization components that provide real-time insights into agent behavior and system activity: | |
| #### 1. Real-Time Reasoning Visualizer | |
| - **Location**: Debug tab (Gradio app) | |
| - **Features**: | |
| - Step-by-step visualization of agent reasoning path | |
| - Animated progression through reasoning steps | |
| - Status indicators (pending, running, completed, error) | |
| - Detailed metrics per step (latency, hit counts, token estimates) | |
| - Visual icons for each step type (admin rules check, RAG prefetch, tool selection, etc.) | |
| - **Data Source**: `reasoning_trace` from `/agent/message` or `/agent/debug` endpoints | |
| - **Usage**: Automatically appears in chat panel when agent responses include reasoning traces | |
| #### 2. Tool Invocation Timeline | |
| - **Location**: Debug tab (Gradio app) | |
| - **Features**: | |
| - Visual timeline showing tool execution order | |
| - Color-coded bars indicating tool status (success/error) | |
| - Latency visualization per tool | |
| - Result count badges | |
| - Summary statistics (total tools, total time, average latency) | |
| - **Data Source**: `tool_traces` from `/agent/message` or `/agent/debug` endpoints | |
| - **Usage**: Automatically appears in chat panel when agent responses include tool traces | |
| #### 3. Live Tenant Heatmap | |
| - **Location**: Analytics page (`/analytics`) | |
| - **Features**: | |
| - Query activity heatmap (hour-by-hour, day-by-day visualization) | |
| - Color intensity based on activity level | |
| - Per-tool usage trends with bar charts | |
| - Trend indicators (up/down/stable) | |
| - Configurable time window (default: 7 days) | |
| - **Data Source**: `/analytics/activity` and `/analytics/tool-usage` endpoints | |
| - **Usage**: Navigate to Analytics page to view tenant activity patterns | |
| **Access**: All visualization features are available to all roles (viewer, editor, admin, owner). | |
| ### Request Headers | |
| Most endpoints require: | |
| - `x-tenant-id`: Tenant identifier for multi-tenant isolation | |
| - `x-user-role`: Caller role for RBAC enforcement (`viewer`, `editor`, `admin`, or `owner`) | |
| - **Important**: Role must be passed through the entire pipeline (UI → API → RAG Client → MCP Server) | |
| - The role is propagated automatically from the incoming API request to the RAG client and then to the MCP server, which performs the permission check | |
| - If ingestion fails with permission errors, verify the role is set correctly in the UI and check backend logs for role propagation debug messages | |
| - `Content-Type: application/json`: For POST requests with JSON payloads | |
| ### Example Request | |
| ```bash | |
| curl -X POST http://localhost:8000/agent/message \ | |
| -H "Content-Type: application/json" \ | |
| -H "x-tenant-id: tenant123" \ | |
| -d '{ | |
| "message": "What is our refund policy?", | |
| "tenant_id": "tenant123" | |
| }' | |
| ``` | |
| --- | |
| ## Architecture | |
| ### System Overview | |
| IntegraChat follows a modular architecture with clear separation of concerns: | |
| ``` | |
| ┌─────────────────┐ | |
| │ Frontend UI │ (Gradio) | |
| │ Port 7860 │ | |
| └────────┬────────┘ | |
| │ | |
| ▼ | |
| ┌─────────────────┐ | |
| │ FastAPI Backend│ (API Gateway) | |
| │ Port 8000 │ | |
| └────────┬────────┘ | |
| │ | |
| ├──► Unified MCP Server (Port 8900) | |
| │ ├── RAG Tools (search, ingest, list, delete) | |
| │ ├── Web Tools (search) | |
| │ └── Admin Tools (rules, violations) | |
| │ | |
| ├──► PostgreSQL/Supabase (RAG Vector Store) | |
| ├──► Supabase/SQLite (Rules & Analytics) | |
| └──► LLM Backend (Ollama/Groq) | |
| ``` | |
| ### Enterprise-Grade Features | |
| 1. **Autonomous Multi-Step Planning**: LLM-powered planning determines optimal tool sequences with short-term conversation memory that stores and injects previous tool outputs into subsequent tool calls for better context awareness. | |
| 2. **Regex-Based Governance**: Admin rules support regex patterns with fallback to keyword matching and semantic similarity scoring for flexible policy enforcement. | |
| 3. **Comprehensive Analytics**: All tool usage, RAG searches, LLM calls, and red-flag violations are logged with indexed queries for fast analytics retrieval. | |
| 4. **Enhanced RAG Pipeline**: Documents chunked optimally (400-600 tokens) and enriched with metadata (source URL, timestamp, document type) for better retrieval. | |
| 5. **Structured Error Handling**: All errors logged with context, with graceful fallbacks (e.g., RAG fails → LLM-only, web fails → skip web). | |
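The 400-600 token chunking mentioned in item 4 can be illustrated with a simple word-window splitter. This sketch approximates tokens with whitespace-separated words, which a real tokenizer would count differently, and the `chunk_words` function with its `target` and `max_tokens` parameters is an assumption rather than the project's chunker.

```python
from typing import List


def chunk_words(text: str, target: int = 500, max_tokens: int = 600) -> List[str]:
    """Split text into ~target-word chunks, folding a short tail into the last one."""
    words = text.split()
    chunks = [words[i:i + target] for i in range(0, len(words), target)]
    # Merge the final chunk into its predecessor when the result stays within max_tokens.
    if len(chunks) > 1 and len(chunks[-2]) + len(chunks[-1]) <= max_tokens:
        tail = chunks.pop()
        chunks[-1].extend(tail)
    return [" ".join(chunk) for chunk in chunks]


document = " ".join(f"w{i}" for i in range(1050))
print([len(chunk.split()) for chunk in chunk_words(document)])
```

A tail too long to merge is kept as its own chunk, so the final chunk of a document can fall below the 400-word lower bound.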
| ### Data Storage Architecture | |
| IntegraChat uses **dual-backend storage** with automatic fallback for production flexibility: | |
| #### Supabase (Production/Preferred) | |
| **When to use:** Production deployments, multi-user environments, scalable applications | |
| **Storage:** | |
| - `admin_rules` - Admin rules with regex patterns and severity levels | |
| - `tool_usage_events` - Tool invocation logs with latency and token tracking | |
| - `redflag_violations` - Red-flag violation events with timestamps | |
| - `rag_search_events` - RAG search metrics and quality indicators | |
| - `agent_query_events` - Agent query logs and analytics | |
| **Features:** | |
| - Row Level Security (RLS) for multi-tenant isolation | |
| - Automatic backups and scaling | |
| - Real-time capabilities | |
| - Production-ready infrastructure | |
| **Setup:** Configure `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` in `.env` | |
| #### SQLite (Development Fallback) | |
| **When to use:** Local development, testing, single-user scenarios | |
| **Storage:** | |
| - `data/admin_rules.db` - Admin rules (local file) | |
| - `data/analytics.db` - Analytics events (local file) | |
| **Features:** | |
| - Zero configuration required | |
| - Perfect for local development | |
| - Automatic fallback when Supabase not configured | |
| **Migration:** To migrate existing SQLite data to Supabase, refer to Supabase documentation for data migration strategies. | |
| --- | |
| ## Supabase Setup & Migration | |
| IntegraChat supports Supabase for production-ready storage of admin rules and analytics. Both `RulesStore` and `AnalyticsStore` automatically detect and use Supabase when credentials are available, falling back to SQLite for local development. | |
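The detection described above could look roughly like the following. This is a minimal sketch, assuming the stores only check that both environment variables are non-empty; the function name `select_backend` is hypothetical and the real `RulesStore`/`AnalyticsStore` logic may differ.

```python
import os
from typing import Mapping, Optional


def select_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "supabase" when both credentials are set, otherwise "sqlite"."""
    env = os.environ if env is None else env
    url = env.get("SUPABASE_URL", "").strip()
    key = env.get("SUPABASE_SERVICE_KEY", "").strip()
    # Fall back to SQLite unless both values are present and non-blank.
    return "supabase" if url and key else "sqlite"
```

Calling `select_backend()` with no arguments reads the current process environment, which makes it easy to confirm which backend a given `.env` would select.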
| ### Quick Setup | |
| 1. **Create Supabase tables**: | |
| - Run `supabase_admin_rules_table.sql` in Supabase SQL Editor | |
| - Run `supabase_analytics_tables.sql` in Supabase SQL Editor | |
| 2. **Configure environment variables** in `.env`: | |
| ```env | |
| SUPABASE_URL=https://your-project-id.supabase.co | |
| SUPABASE_SERVICE_KEY=your_service_role_key_here | |
| ``` | |
| 3. **Verify setup**: Check that your Supabase project is accessible and tables are created correctly. | |
| --- | |
| ## Troubleshooting | |
| ### Common Issues | |
| #### Backend Not Starting | |
| - **Issue**: FastAPI backend fails to start | |
| - **Solution**: | |
| - Check if port 8000 is already in use: `netstat -ano | findstr :8000` (Windows) or `lsof -i :8000` (Linux/Mac) | |
| - Verify Python virtual environment is activated | |
| - Check `.env` file exists and has required variables | |
| - Review error logs for missing dependencies | |
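If `netstat` or `lsof` is not available, the same port check can be done from Python. The helper below is an illustrative sketch (the name `port_in_use` is not part of the project); it simply attempts a TCP connection to the port.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0
```

Running `port_in_use(8000)` before starting the backend shows whether another process already holds the port.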
| #### MCP Server Connection Errors | |
| - **Issue**: "Could not connect to MCP server" errors | |
| - **Solution**: | |
| - Ensure unified MCP server is running: `python backend/mcp_server/server.py` | |
| - Check MCP server is on port 8900 (default) | |
| - Verify `MCP_SERVER_ID` in `.env` matches server configuration | |
| - Check firewall settings if running on different machines | |
| #### RAG Search Not Returning Results | |
| - **Issue**: RAG searches return no results despite ingested documents | |
| - **Solution**: | |
| - Check similarity threshold (default 0.3) - try lowering to 0.2 or 0.1 | |
| - Verify documents exist: `GET /rag/list?tenant_id={id}` | |
| - Ensure tenant_id matches between ingestion and search | |
| - Check PostgreSQL/pgvector connection and vector extension | |
| - Review MCP server logs for search metrics | |
| #### Supabase Configuration Issues | |
| - **Issue**: Data still going to SQLite instead of Supabase | |
| - **Solution**: | |
| - Verify `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` are set in `.env` (no quotes, no spaces) | |
| - Use **service_role** key (not anon key) from Supabase Dashboard | |
| - Ensure tables exist: run SQL scripts in Supabase SQL Editor | |
| - Check FastAPI startup logs for backend detection messages | |
| #### LLM Connection Errors | |
| - **Issue**: Agent responses fail with LLM errors | |
| - **Solution**: | |
| - For Ollama: Ensure Ollama is running (`ollama serve`) | |
| - Check `OLLAMA_URL` and `OLLAMA_MODEL` in `.env` | |
| - For Groq: Verify `GROQ_API_KEY` is set correctly | |
| - Check `LLM_BACKEND` setting (ollama or groq) | |
| - Test LLM connection: `curl http://localhost:11434/api/tags` (Ollama) | |
| #### Document Ingestion Failures | |
| - **Issue**: File uploads or document ingestion fails | |
| - **Solution**: | |
| - Check file size limits (default may be 10MB) | |
| - Verify file format is supported (PDF, DOCX, TXT, MD) | |
| - Ensure tenant_id is provided in request | |
| - **Check user role**: Ingestion requires `editor`, `admin`, or `owner` role. If you see "Permission Denied (403)", change your role in the UI dropdown (top right) from "viewer" to "editor", "admin", or "owner" | |
| - Verify `x-user-role` header is being sent correctly (check backend logs for debug messages) | |
| - Check backend logs for specific error messages | |
| - Verify PostgreSQL connection for RAG storage | |
| #### Document Display Issues | |
| - **Issue**: Document list shows `[object Object]` instead of document details | |
| - **Solution**: This has been fixed. Documents now display properly with: | |
| - Document ID (number) | |
| - Document Type (text, pdf, faq, link) | |
| - Preview (first 200 characters) | |
| - Length (character count) | |
| - Created date | |
| - **If still seeing issues**: Refresh the Knowledge Base Library tab | |
| #### Rule Addition Timeouts | |
| - **Issue**: "Chunk 1/1 timed out after 45s" when adding rules | |
| - **Solution**: | |
| - **Quick Fix**: Uncheck the "Enable LLM Enhancement" checkbox before adding rules - rules will be added immediately without LLM processing | |
| - **With Enhancement**: Keep checkbox checked but be patient - each chunk of up to 5 rules can take up to 180s (the per-chunk timeout), at roughly 30s per rule | |
| - **Best Practice**: Add rules in smaller batches (1-3 rules at a time) when using enhancement | |
| - **Note**: Enhancement is optional - you can always add rules quickly without it, then enhance them later if needed | |
| #### Rule Deletion Issues | |
| - **Issue**: "404 Not Found" when trying to delete a rule | |
| - **Solution**: You can now delete rules in two ways: | |
| - **By Number**: Enter the rule number (e.g., "1", "2", "3") as shown in the rules table | |
| - **By Text**: Enter the exact rule text as displayed in the rules table | |
| - **If rule not found**: Make sure you're entering the exact text or a valid rule number. Refresh the rules table to see current rules. | |
| #### Tenant Isolation Issues | |
| - **Issue**: Documents or data leaking between tenants | |
| - **Solution**: | |
| - Check database queries include `WHERE tenant_id = ...` filters | |
| - Verify tenant ID normalization is working correctly | |
| - Review database logs for tenant isolation | |
| ### Getting Help | |
| 1. **Check Logs**: Review FastAPI and MCP server logs for detailed error messages | |
| 2. **Run Diagnostics**: Use helper scripts in the Testing & Diagnostics section | |
| 3. **Verify Configuration**: Check `.env` file and Supabase connection | |
| 4. **Review Documentation**: See `backend/README.md` for backend-specific issues | |
| --- | |
| ## Testing & Diagnostics | |
| You can test the system by: | |
| - **API Testing**: Use the FastAPI interactive docs at `http://localhost:8000/docs` to test endpoints | |
| - **Database Inspection**: Connect directly to your PostgreSQL/Supabase instance to verify tenant isolation | |
| - **Log Monitoring**: Check FastAPI and MCP server logs for detailed error messages and debugging information | |
| > **Tip:** Ensure the Python virtual environment is active (`source venv/bin/activate` or `.\venv\Scripts\activate`) and that `.env` contains the MCP server URLs/LLM settings. | |
| Additional helper scripts are available for deeper diagnostics: | |
| - ✅ **Prerequisites:** FastAPI backend plus all MCP servers (RAG/Web/Admin) running locally. | |
| - ✅ **What it checks:** | |
| 1. Direct database writes via the analytics and rules stores | |
| 2. CRUD over the `/admin/*` and `/analytics/*` endpoints | |
| 3. RAG ingestion and isolation by issuing queries as multiple tenants and ensuring secrets never leak across IDs | |
| - ✅ **Pass criteria:** At least 80% of the sub-tests succeed (the RAG isolation test must pass for overall success). | |
| - `python check_rag_database.py` | |
| Provides a low-level inspection of the RAG datastore. It connects straight to the pgvector/Postgres instance, lists all tenant IDs, prints sample chunks, and runs `search_vectors()` directly to ensure the SQL `WHERE tenant_id = …` filter is behaving as expected. Use this script when diagnosing suspected cross-tenant leakage or when seeding demo data. | |
| - `python verify_supabase_setup.py` | |
| Verifies Supabase configuration and shows which backend (Supabase or SQLite) each store is using. Displays any missing configuration and provides a summary of where data will be saved. | |
| - `python check_supabase_rules.py` | |
| Checks Supabase admin rules configuration and RLS policies. Validates that rules can be read/written correctly. | |
| - `python migrate_sqlite_to_supabase.py` | |
| One-shot migration script that copies existing SQLite data (admin rules + analytics) to Supabase. Supports both PostgreSQL direct connection and Supabase REST API methods. | |
| - `python test_manual.py` | |
| The existing manual test runner remains useful for smoke-testing analytics logging, admin rule CRUD, and API response codes. Run it whenever you adjust schemas or update MCP endpoints. | |
| --- | |
| ## Demo Video | |
| 🎥 **[Demo Video Placeholder]** - Coming soon! | |
| Watch how IntegraChat uses MCP to power autonomous agents with multi-tool selection, RAG retrieval, and enterprise governance. | |
| --- | |
| ## Social Media | |
| 📱 **[Social Media Post Placeholder]** - Coming soon! | |
| Follow us for updates and demos of IntegraChat in action! | |
| --- | |
| ## Team Member(s) | |
| - **Your Name Here** - Developer & MCP Enthusiast | |
| --- | |
| ## License | |
| This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. | |
| --- | |
| ## Technical Stack | |
| ### Backend | |
| - **Framework**: FastAPI with async/await for high-performance MCP orchestration | |
| - **MCP Server**: Unified MCP server (port 8900) exposing all tools via namespaces | |
| - **API**: RESTful API with Server-Sent Events (SSE) for streaming responses | |
| - **LLM Integration**: | |
| - Ollama (local, default) - `http://localhost:11434` | |
| - Groq (cloud) - via API key | |
| - Configurable backend with streaming support | |
| ### Frontend | |
| - **Gradio UI**: Full-featured interface with Plotly visualizations (`app.py`) | |
| - **UI Libraries**: | |
| - Plotly for interactive charts and visualizations | |
| ### Data Storage | |
| - **RAG Vector Store**: PostgreSQL with pgvector extension (via Supabase or direct connection) | |
| - **Analytics**: Supabase (production) or SQLite (development) with indexed queries | |
| - **Rules Storage**: Supabase (production) or SQLite (development) with automatic fallback | |
| - **Database**: PostgreSQL for RAG embeddings, Supabase/SQLite for analytics and rules | |
| ### File Processing | |
| - **Supported Formats**: TXT, PDF, DOC, DOCX, Markdown | |
| - **Libraries**: PyPDF2, python-docx for server-side text extraction | |
| - **Metadata**: Rich metadata support (source URL, timestamp, document type) | |
| ### Communication | |
| - **Streaming**: Server-Sent Events (SSE) for real-time word-by-word response streaming | |
| - **Protocol**: Model Context Protocol (MCP) for tool communication | |
| - **HTTP**: RESTful endpoints with JSON payloads | |
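To make the streaming format concrete, here is a minimal sketch of how a response could be split into one SSE frame per word. The generator name `sse_word_stream` and the `[DONE]` end marker are assumptions for illustration; the project's actual event payloads may be structured differently.

```python
from typing import Iterator


def sse_word_stream(text: str) -> Iterator[str]:
    """Yield one Server-Sent Events frame per word, then a final marker."""
    for word in text.split():
        # Each SSE event is a "data:" line terminated by a blank line.
        yield f"data: {word}\n\n"
    # "[DONE]" is an assumed end-of-stream convention, not part of SSE itself.
    yield "data: [DONE]\n\n"
```

In a FastAPI route, a generator like this could be returned via `StreamingResponse(sse_word_stream(text), media_type="text/event-stream")`.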
| ## Recent Enhancements | |
| ### UI & UX Improvements (Latest) | |
| - **Document Display Fix**: Fixed document list showing `[object Object]` - now properly displays document ID, type, preview, length, and creation date in a formatted table | |
| - **Rule Deletion Enhancement**: Can now delete rules by entering either: | |
| - Rule number (e.g., "1", "2", "3") - automatically finds the corresponding rule | |
| - Full rule text - deletes the exact matching rule | |
| - **LLM Enhancement Toggle**: Added checkbox to enable/disable LLM enhancement when adding rules: | |
| - **Quick Add**: Uncheck to add rules immediately without LLM processing (no timeout issues) | |
| - **Enhanced Add**: Check to get better patterns, explanations, and examples (takes longer but higher quality) | |
| - **Improved Timeouts**: Increased timeout for rule enhancement from 45s to 180s to handle multiple rules properly | |
| - **Better Error Messages**: Clearer error messages for rule deletion, document operations, and permission errors | |
| ### Role Propagation & Permission Handling (Latest) | |
| - **Fixed Role Propagation**: User role (`viewer`, `editor`, `admin`, `owner`) is now properly passed through the entire ingestion pipeline: | |
| - UI sends role in `x-user-role` header | |
| - Backend API route receives and validates role | |
| - Role is passed to `process_ingestion()` service | |
| - RAG client includes role in payload to MCP server | |
| - MCP server uses role for permission checks | |
| - **Improved Error Handling**: Permission errors (403 Forbidden) now return clear, actionable error messages: | |
| - Clear indication when role lacks required permissions | |
| - Guidance on which roles can perform specific actions | |
| - Instructions to change role in UI dropdown | |
| - **Debug Logging**: Added comprehensive debug logging to trace role values through the pipeline for troubleshooting | |
| - **Admin Question Handling**: Fixed "who is the admin" type questions to use RAG from knowledge base instead of generic LLM responses | |
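The permission check on the ingestion path can be sketched as follows. The allowed roles and the `x-user-role` header come from this README; the function name, the default of `viewer` for a missing header, and the message wording are assumptions.

```python
from typing import Mapping, Tuple

# Roles allowed to ingest documents, as stated in this README.
INGEST_ROLES = {"editor", "admin", "owner"}


def check_ingest_permission(headers: Mapping[str, str]) -> Tuple[int, str]:
    """Return an (HTTP status, message) pair for an ingestion request."""
    # Assumption: a missing header is treated as the least-privileged role.
    role = headers.get("x-user-role", "viewer").strip().lower()
    if role in INGEST_ROLES:
        return 200, f"role '{role}' may ingest documents"
    allowed = ", ".join(sorted(INGEST_ROLES))
    return 403, f"role '{role}' cannot ingest documents; switch to one of: {allowed}"
```

Returning the list of permitted roles in the 403 message mirrors the actionable error messages described above.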
| ### Admin Rules System (Latest) | |
| - **File Upload Support**: Upload rules from TXT, PDF, DOC, DOCX files with drag-and-drop interface | |
| - **LLM Enhancement Toggle**: Optional LLM enhancement with checkbox control: | |
| - **Quick Add Mode**: Uncheck to add rules immediately without LLM processing (no timeouts) | |
| - **Enhanced Mode**: Check to get better patterns, explanations, examples, and edge case detection | |
| - **LLM Enhancement**: When enabled, automatic rule enhancement identifies edge cases, improves regex patterns, and suggests severity levels | |
| - **Intelligent Fallback Explanations**: When LLM enhancement times out or fails, the system automatically generates basic explanations using keyword extraction, providing useful examples and pattern suggestions without requiring LLM availability | |
| - **Chunk Processing**: Large rule sets processed in chunks of 5 to prevent timeouts (handles 100+ rules efficiently) | |
| - **Enhanced Timeouts**: Increased timeout from 45s to 180s per chunk to accommodate LLM processing | |
| - **Flexible Rule Deletion**: Delete rules by entering either rule number (e.g., "1") or full rule text | |
| - **Comment Filtering**: Comment lines (starting with #) automatically ignored when uploading rules | |
| - **Rule-First Processing**: Admin rules checked before intent classification - enables behavior control (brief responses vs blocking) | |
| - **Supabase Integration**: Production-ready Supabase support with automatic table creation | |
| - **Streaming Responses**: Word-by-word streaming for chat responses using Server-Sent Events (SSE) | |
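Comment filtering and chunk processing for uploaded rule files can be approximated like this. It is a sketch under the assumption that each non-empty line is one rule; the helper names `parse_rules` and `chunked` are hypothetical.

```python
from typing import Iterator, List


def parse_rules(raw: str) -> List[str]:
    """Return non-empty rule lines, skipping comment lines that start with '#'."""
    rules = []
    for line in raw.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            rules.append(stripped)
    return rules


def chunked(rules: List[str], size: int = 5) -> Iterator[List[str]]:
    """Yield successive chunks of at most `size` rules."""
    for start in range(0, len(rules), size):
        yield rules[start:start + size]
```

Each chunk can then be sent for LLM enhancement separately, so a slow response only affects the rules in that chunk.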
| ### Conversation Memory System (Latest) | |
| - **Short-Term Memory**: Automatic storage of tool outputs per session with configurable size limits and TTL | |
| - **Session-Based Isolation**: Memory keyed by session_id (not tenant_id) for safety | |
| - **Automatic Injection**: Recent memory automatically injected into tool payloads for multi-step workflows | |
| - **Auto-Expiration**: Memory entries expire after configurable TTL (default: 15 minutes) | |
| - **Session Management**: Memory can be explicitly cleared via `end_session` flag | |
| - **Comprehensive Testing**: Full test suite covering memory storage, retrieval, expiration, and multi-step workflows | |
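The behaviour listed above (session-keyed storage, a size limit, a 15-minute TTL, and explicit clearing) can be sketched with a small in-memory class. The class and method names here are illustrative assumptions, not the project's actual memory API.

```python
import time
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Tuple


class SessionMemory:
    """Short-term, per-session memory with a size cap and TTL (illustrative)."""

    def __init__(self, ttl_seconds: float = 15 * 60, max_items: int = 10,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.max_items = max_items
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Deque[Tuple[float, Any]]] = {}

    def add(self, session_id: str, item: Any) -> None:
        """Store an item; the oldest entry is dropped once the cap is reached."""
        entries = self._store.setdefault(session_id, deque(maxlen=self.max_items))
        entries.append((self.clock(), item))

    def recent(self, session_id: str) -> List[Any]:
        """Return unexpired items for the session, oldest first."""
        cutoff = self.clock() - self.ttl
        return [item for ts, item in self._store.get(session_id, ()) if ts >= cutoff]

    def end_session(self, session_id: str) -> None:
        """Drop all memory for the session."""
        self._store.pop(session_id, None)
```

Keying the store by session rather than tenant means two users of the same tenant never see each other's intermediate tool outputs.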
| ### AI-Generated KB Metadata & Advanced RAG (Latest) | |
| - **Automatic Metadata Extraction**: When ingesting documents, system auto-extracts: | |
| - **Title**: From filename, URL, or content structure (with intelligent fallback) | |
| - **Summary**: 2-3 sentence summary via LLM (with keyword-based fallback) | |
| - **Tags**: 5-8 relevant tags extracted from content | |
| - **Topics**: 3-5 main themes identified via LLM | |
| - **Date Detection**: Multiple date formats automatically detected | |
| - **Quality Score**: 0.0-1.0 score based on structure and completeness | |
| - **Intelligent Fallback**: When LLM is unavailable or times out, uses keyword extraction and pattern matching to provide useful metadata | |
| - **Database Integration**: Metadata stored in JSONB column for flexible querying and enhanced RAG search | |
| - **Migration Script**: Safe, idempotent database migration script included | |
| ### Per-Tool Latency Prediction & Context-Aware Routing (Latest) | |
| - **Latency Prediction**: Agent estimates expected latency before tool selection: | |
| - RAG: 60-120ms (depends on result count) | |
| - Web: 400-1800ms (network-dependent) | |
| - Admin: <20ms (local regex matching) | |
| - LLM: Variable based on model and token count | |
| - **Path Optimization**: Agent chooses fastest tool sequence based on latency estimates | |
| - **Context-Aware Routing**: Intelligent tool skipping based on previous outputs: | |
| - High RAG score (≥0.8) → Skip web search | |
| - Critical admin violation → Skip agent reasoning, immediate block | |
| - Relevant memory available → Skip RAG, use memory instead | |
| - **Routing Hints**: Context hints included in reasoning trace for transparency | |
| - **Performance Impact**: Skipping redundant tool calls and preferring the fastest tool sequence reduces end-to-end response latency | |
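The routing rules above can be expressed as a small decision function. This is a sketch only: the function name, the tool labels, and the choice to keep web search when memory replaces RAG are assumptions, and the real agent makes these decisions incrementally as tool outputs arrive.

```python
from typing import List, Optional


def plan_tools(admin_severity: Optional[str] = None,
               memory_relevant: bool = False,
               rag_top_score: Optional[float] = None) -> List[str]:
    """Return an ordered tool plan from the signals known so far."""
    if admin_severity == "critical":
        return ["block"]  # critical violation: block immediately, skip reasoning
    plan: List[str] = []
    # Relevant memory replaces the RAG lookup.
    plan.append("memory" if memory_relevant else "rag")
    # Web search is skipped only when RAG ran and scored at least 0.8 (assumed rule).
    high_rag = (not memory_relevant) and rag_top_score is not None and rag_top_score >= 0.8
    if not high_rag:
        plan.append("web")
    plan.append("llm")
    return plan
```

For instance, a RAG top score of 0.85 yields `["rag", "llm"]`, while a score of 0.5 keeps the web step in the plan.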
| ### Tool Output Schemas (Latest) | |
| - **Strict JSON Schemas**: Every tool returns validated JSON with consistent structure: | |
| - **RAG**: `{results: [...], top_score: float, latency_ms: int}` | |
| - **Web**: `{results: [...], latency_ms: int}` | |
| - **Admin**: `{violations: [...], severity: str, latency_ms: int}` | |
| - **LLM**: `{text: str, tokens_used: int, latency_ms: int}` | |
| - **Automatic Validation**: All tool outputs validated and formatted before use | |
| - **Easier Debugging**: Consistent structure makes debugging and monitoring simpler | |
| - **Polished Responses**: Schema-validated outputs keep response formatting consistent across tools | |
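A lightweight way to enforce the schemas above is a required-field and type check per tool. The field names and types come from the list above; the validator itself (`validate_tool_output`) is a hypothetical sketch, and the project may use a schema library instead.

```python
from typing import Any, Dict

# Required keys and types per tool, taken from the schemas listed above.
TOOL_SCHEMAS: Dict[str, Dict[str, type]] = {
    "rag": {"results": list, "top_score": float, "latency_ms": int},
    "web": {"results": list, "latency_ms": int},
    "admin": {"violations": list, "severity": str, "latency_ms": int},
    "llm": {"text": str, "tokens_used": int, "latency_ms": int},
}


def validate_tool_output(tool: str, output: Dict[str, Any]) -> Dict[str, Any]:
    """Raise ValueError if `output` does not match the schema for `tool`."""
    schema = TOOL_SCHEMAS[tool]
    for key, expected in schema.items():
        if key not in output:
            raise ValueError(f"{tool}: missing field '{key}'")
        if not isinstance(output[key], expected):
            raise ValueError(f"{tool}: field '{key}' must be {expected.__name__}")
    return output
```

Failing fast on a missing `latency_ms` or `tokens_used` field makes malformed tool responses visible at the point where they enter the agent.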
| ### Cross-Encoder Re-ranking (Latest) | |
| - **Two-Stage RAG Process**: | |
| - Initial vector search retrieves candidates | |
| - Cross-encoder re-ranks top 10 results for accuracy | |
| - Final filtering by threshold and limit | |
| - **Model**: Uses `cross-encoder/ms-marco-MiniLM-L-6-v2` (very fast, production-ready) | |
| - **Accuracy Improvement**: The cross-encoder scores each query-document pair jointly, which typically ranks relevant results higher than vector similarity alone | |
| - **Seamless Integration**: Works transparently with existing RAG search API | |
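The two-stage process can be sketched with a pluggable scoring function. To keep the example self-contained, `overlap_score` below is a toy stand-in for the cross-encoder, and the names `rerank`, `top_n`, `threshold`, and `limit` are illustrative assumptions rather than the project's API.

```python
from typing import Callable, List, Sequence, Tuple

ScoreFn = Callable[[Sequence[Tuple[str, str]]], Sequence[float]]


def rerank(query: str, candidates: List[str], score_fn: ScoreFn,
           top_n: int = 10, threshold: float = 0.0,
           limit: int = 3) -> List[Tuple[str, float]]:
    """Re-score the first `top_n` candidates, keep the best `limit` above `threshold`."""
    head = candidates[:top_n]
    scores = score_fn([(query, doc) for doc in head])
    ranked = sorted(zip(head, (float(s) for s in scores)),
                    key=lambda pair: pair[1], reverse=True)
    return [(doc, score) for doc, score in ranked if score >= threshold][:limit]


def overlap_score(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Toy stand-in for a cross-encoder: fraction of query words found in the document."""
    scores = []
    for query, doc in pairs:
        q_words = set(query.lower().split())
        d_words = set(doc.lower().split())
        scores.append(len(q_words & d_words) / max(len(q_words), 1))
    return scores
```

With `sentence-transformers` installed, `CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2").predict` accepts a list of (query, document) pairs and could be passed as `score_fn`; the threshold would then need to match that model's score scale.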
| ### Context Engineering (Latest) | |
| - **Anthropic-Inspired Strategies**: Implements best practices from Anthropic's context engineering research: | |
| - **Compaction**: High-fidelity summarization preserving architectural decisions, unresolved issues, and implementation details | |
| - **Tool Result Clearing**: Safest form of compaction - removes large tool outputs once processed | |
| - **Structured Note-Taking**: Tracks objectives (like Claude playing Pokémon), architectural decisions, and unresolved issues | |
| - **XML-Structured Prompts**: All prompts use clear XML sections (`<system>`, `<background_information>`, `<instructions>`) for better model understanding | |
| - **Automatic Compression**: Conversations compressed at 80% token threshold, targeting 60% after compression | |
| - **Just-in-Time Context**: Selects only relevant memories and tools for each query | |
| - **Progressive Disclosure**: Agents discover context incrementally through exploration | |
| - **Benefits**: | |
| - Reduced token usage and costs | |
| - Longer conversation support | |
| - Better agent coherence across extended interactions | |
| - Improved performance through structured context | |
| - **Documentation**: Context engineering features are integrated throughout the agent orchestrator and MCP server | |
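The 80%/60% thresholds can be illustrated with the simplest compaction strategy listed above, tool result clearing. This sketch makes several assumptions: tokens are estimated by whitespace word count, messages are plain dictionaries with `role` and `content`, and cleared results are replaced by a `[cleared]` placeholder; summarization-based compaction is not shown.

```python
from typing import Dict, List

Message = Dict[str, str]


def count_tokens(message: Message) -> int:
    """Crude token estimate (whitespace word count); a real tokenizer would differ."""
    return len(message["content"].split())


def compact(messages: List[Message], max_tokens: int,
            trigger: float = 0.8, target: float = 0.6) -> List[Message]:
    """Clear old tool results once usage exceeds `trigger`, until at or below `target`."""
    total = sum(count_tokens(m) for m in messages)
    if total <= trigger * max_tokens:
        return messages  # below the trigger: nothing to do
    compacted: List[Message] = []
    for message in messages:  # oldest first
        if total > target * max_tokens and message["role"] == "tool":
            placeholder = {"role": "tool", "content": "[cleared]"}
            total -= count_tokens(message) - count_tokens(placeholder)
            compacted.append(placeholder)
        else:
            compacted.append(message)
    return compacted
```

Clearing the oldest tool outputs first keeps recent results available for multi-step workflows while bringing the conversation back under the target budget.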
| ### UI Improvements | |
| - **Modern Drag-and-Drop**: Intuitive file upload with visual feedback | |
| - **Enhanced Status Messages**: Clear success/error messages with icons | |
| - **Refresh Button in Table**: Quick refresh directly from the Rule Set section | |
| - **Better Visual Hierarchy**: Improved spacing, colors, and layout | |
| - **Gradio UI Enhancements**: | |
| - AI metadata displayed after document ingestion | |
| - Latency predictions shown in reasoning trace | |
| - Context-aware routing hints visualized | |
| - Tool output schemas displayed in debug view | |
| ## Key Technical Features | |
| ### Tenant Isolation & Normalization | |
| - **Strict tenant isolation** enforced at database level with `WHERE tenant_id = ...` filters | |
| - **Automatic tenant ID normalization** handles whitespace and formatting differences | |
| - Documents can be listed and deleted consistently across different tenant_id formats | |
| - All operations validate tenant ownership before execution | |
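A minimal sketch of normalization plus a tenant-filtered query is shown below. It uses an SQLite table for self-containment even though the RAG store is PostgreSQL; the table and column names, and the exact normalization (trim and lowercase), are assumptions for illustration.

```python
import sqlite3
from typing import List


def normalize_tenant_id(raw: str) -> str:
    """Assumed normalization: trim surrounding whitespace and lowercase."""
    return raw.strip().lower()


def list_documents(conn: sqlite3.Connection, tenant_id: str) -> List[str]:
    """Return document titles for one tenant using a parameterized filter."""
    rows = conn.execute(
        "SELECT title FROM documents WHERE tenant_id = ? ORDER BY id",
        (normalize_tenant_id(tenant_id),),
    )
    return [title for (title,) in rows]
```

Passing the tenant ID as a bound parameter, rather than formatting it into the SQL string, keeps the filter safe against injection as well as formatting differences.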
| ### RAG Search & Retrieval | |
| - **Cross-Encoder Re-ranking**: Two-stage retrieval process for improved accuracy: | |
| - First: Vector search retrieves top candidates using embeddings | |
| - Then: Cross-encoder model (`cross-encoder/ms-marco-MiniLM-L-6-v2`) re-ranks top 10 results | |
| - Final: Results filtered by threshold and limit applied | |
| - **Optimized similarity threshold** (default 0.3) for better recall of relevant documents | |
| - **Intelligent fallback** returns top result even if below threshold to ensure knowledge base content is accessible | |
| - **Pattern-based tool selection** automatically triggers RAG for admin questions, fact lookups, and internal knowledge queries | |
| - **Response unwrapping** ensures seamless integration between MCP server and orchestrator | |
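The threshold and fallback behaviour described above can be sketched in a few lines. The default of 0.3 comes from this README; the function name `filter_hits`, the `limit` default, and the tuple layout are assumptions.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # (chunk text, similarity score)


def filter_hits(hits: List[Hit], threshold: float = 0.3, limit: int = 5) -> List[Hit]:
    """Keep hits at or above `threshold`; if none qualify, return the single best hit."""
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    kept = [hit for hit in ranked if hit[1] >= threshold][:limit]
    if not kept and ranked:
        kept = [ranked[0]]  # fallback: surface the top result even below threshold
    return kept
```

The fallback means a query with only weak matches still returns its best candidate, which the agent can then weigh against other tools.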
| ### MCP Server Architecture | |
| - **Unified server** running on a single port (default 8900) for all namespaced tools | |
| - **Dual protocol support**: Both MCP protocol (POST with JSON) and RESTful HTTP (GET/DELETE) | |
| - **Response wrapping**: Standardized response format with automatic unwrapping in clients | |
| - **Error handling**: Comprehensive error responses with detailed messages for debugging | |
| ## UI Features | |
| ### Knowledge Base Library | |
| - **Visual Statistics**: Real-time document counts and type distribution | |
| - **Interactive Charts**: Plotly pie charts for document type visualization | |
| - **Advanced Search**: Semantic search across all ingested documents with relevance scoring | |
| - **Smart Filtering**: Filter by document type (text, PDF, FAQ, link) | |
| - **Bulk Operations**: Delete individual documents or all documents at once | |
| - **Auto-refresh**: Lists automatically update after operations | |
| ### Admin Analytics Dashboard | |
| - **Statistics Cards**: Key metrics displayed in visually appealing cards with icons | |
| - **Tool Usage Visualization**: Bar charts showing tool invocation counts and performance | |
| - **Latency Metrics**: Visual representation of tool response times | |
| - **RAG Quality Analysis**: Charts displaying search quality metrics (hits, scores, recall) | |
| - **Detailed Tables**: Comprehensive tool usage breakdown with success/error rates | |
| - **Dark Theme**: Modern UI with dark background and white text for better readability | |
| - **Real-time Updates**: Fetch latest analytics data with a single click | |
| ## Acknowledgments | |
| - Built with [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) | |
| - Powered by [Gradio](https://gradio.app/) for the interface | |
| - Visualizations created with [Plotly](https://plotly.com/python/) | |
| - Backend built with [FastAPI](https://fastapi.tiangolo.com/) | |
| - Analytics and governance features inspired by enterprise AI platform requirements | |
| --- | |
| <div align="center"> | |
| **Made with ❤️ for the MCP Hackathon** | |
| **IntegraChat: Enterprise-Grade MCP Autonomous Agent Platform** | |
| [⬆ Back to Top](#integrachat--enterprise-mcp-autonomous-agent-platform) | |
| </div> | |