# Backend Documentation

This folder contains the production-ready FastAPI stack plus the companion MCP servers that power IntegraChat.

## Directory Overview

- `api/` – FastAPI application (routes, services, storage helpers, MCP clients)
- `mcp_server/` – Unified MCP server exposing rag/web/admin tools via namespaces
- `workers/` – Celery workers and schedulers for async ingestion + analytics maintenance

## Prerequisites

- Python 3.10+
- PostgreSQL (with the `vector` extension) for RAG data, or Supabase with pgvector enabled
- **Supabase (recommended)** for admin rules + analytics storage, with automatic SQLite fallback in `data/`
  - Both `RulesStore` and `AnalyticsStore` automatically detect and use Supabase when configured
  - Falls back to SQLite automatically if Supabase credentials are missing
  - See `SUPABASE_SETUP.md` in the root directory for setup instructions
- Optional: Ollama running locally (default) or Groq API credentials for remote LLMs

Create a virtual environment at the repo root, then:

```bash
pip install -r requirements.txt
cp env.example .env   # update MCP URLs + LLM settings
```

## Running the Services Locally

1. **FastAPI core**  
   ```bash
   uvicorn backend.api.main:app --port 8000 --reload
   ```

2. **Unified MCP server (rag/web/admin)**
   ```bash
   python backend/mcp_server/server.py
   ```
   Or use the provided startup script:
   ```bash
   start.bat  # Windows - launches MCP server on port 8900 and FastAPI on port 8000
   ```
   
   This single server (default port 8900) exposes the following namespaced tools:
   - `rag.search` - Semantic search across tenant documents
   - `rag.ingest` - Ingest text content into knowledge base
   - `rag.delete` - Delete individual or all documents for a tenant
   - `rag.list` - List all documents for a tenant with pagination
   - `web.search` - Google Programmable Search (Custom Search API) web search
   - `admin.getRules`, `admin.addRule`, `admin.deleteRule`, `admin.logViolation`
   
   **HTTP Endpoints** (for direct API access):
   - `GET /rag/list?tenant_id={id}&limit={n}&offset={n}` - List documents
   - `POST /rag/ingest` - Ingest content
   - `POST /rag/search` - Search documents (supports `threshold` parameter, default: 0.3)
   - `DELETE /rag/delete/{document_id}?tenant_id={id}` - Delete specific document
   - `DELETE /rag/delete-all?tenant_id={id}` - Delete all documents
   - `POST /web/search` - Web search
   - `POST /admin/*` - Admin operations

3. **Optional workers** (if running Celery-based ingestion/analytics jobs):
   ```bash
   celery -A backend.workers.ingestion_worker worker --loglevel=info
   celery -A backend.workers.analytics_worker worker --loglevel=info
   ```

The Gradio UI (`python app.py`) and the Next.js operator console (see `frontend/README.md`) both talk to the FastAPI layer at `http://localhost:8000`.

## Key Endpoints

All endpoints require the `x-tenant-id` header unless otherwise noted.

| Service | Path | Notes |
| --- | --- | --- |
| Agent | `POST /agent/message` | Autonomous orchestration (RAG/Web/Admin/LLM) |
| Agent Debug | `POST /agent/debug` | Full reasoning trace + tool plan |
| Agent Plan | `POST /agent/plan` | Dry-run planning without executing tools |
| RAG | `POST /rag/ingest-document` | Rich ingestion (text, URL, metadata) |
| RAG | `POST /rag/ingest-file` | File upload (PDF/DOCX/TXT/MD) |
| RAG | `GET /rag/list` | Paginated document listing per tenant (requires `x-tenant-id` header) |
| RAG | `DELETE /rag/delete/{document_id}` | Delete specific document (requires `x-tenant-id` header) |
| RAG | `DELETE /rag/delete-all` | Delete all documents for tenant (requires `x-tenant-id` header) |
| Admin | `POST /admin/rules` | Regex + severity rule ingestion |
| Analytics | `GET /analytics/overview` | Summary metrics (queries, tokens, red flags) |

Refer to the root `README.md` for the complete endpoint tables.
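
As a minimal sketch of the header requirement above, the snippet below builds (but does not send) a request to `POST /agent/message` using only the standard library. The `message` body field and the tenant value `acme` are assumptions for illustration; check the route's schema for the real payload shape.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # FastAPI layer started with uvicorn


def build_agent_request(tenant_id: str, text: str) -> urllib.request.Request:
    """Build a POST /agent/message request carrying the x-tenant-id header."""
    # ASSUMPTION: the body field name "message" is illustrative, not confirmed.
    body = json.dumps({"message": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/agent/message",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "x-tenant-id": tenant_id},
    )


if __name__ == "__main__":
    req = build_agent_request("acme", "Who is the admin?")
    print(req.get_method(), req.full_url)
    # To send it for real (requires the API to be running):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```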

## Diagnostics & Tenant Isolation

Use the helper scripts in the repo root when validating backend changes:

- `python verify_tenant_isolation.py` – Exercises analytics logging, admin rule CRUD, API reachability, and proves RAG tenant isolation by ingesting + querying as multiple tenants.
- `python check_rag_database.py` – Talks directly to the pgvector database to list tenant IDs, preview stored chunks, and run safeguarded searches via `search_vectors()`. Helpful when troubleshooting suspected cross-tenant leakage.
- `python verify_supabase_setup.py` – Verifies Supabase configuration and shows which backend (Supabase or SQLite) each store is using.
- `python check_supabase_rules.py` – Checks Supabase admin rules configuration and RLS policies.
- `python migrate_sqlite_to_supabase.py` – One-shot migration script to copy existing SQLite data to Supabase.
- `python test_manual.py` – Legacy manual smoke test harness (analytics store, admin rules, API surface).

> **Troubleshooting tip:** If the isolation script reports a failure, first run `check_rag_database.py` to confirm documents are tagged with the correct `tenant_id`, then restart the unified MCP server so it reloads the updated SQL filtering logic.

## Recent Improvements

### Tenant ID Normalization
- All database operations now normalize tenant IDs to handle whitespace and formatting differences
- Documents can be listed and deleted consistently even if stored with slightly different tenant_id formatting
- The system automatically matches tenant IDs after normalization, ensuring operations work across different input formats
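
The normalization described above might look roughly like the sketch below. The exact rules used by the backend are not documented here, so the function name `normalize_tenant_id`, the whitespace collapsing, and the case folding are all assumptions chosen for illustration.

```python
def normalize_tenant_id(raw: str) -> str:
    """Return a canonical tenant ID (hypothetical rules, see comments)."""
    # ASSUMPTION: trim outer whitespace and collapse inner whitespace runs.
    collapsed = " ".join(raw.split())
    # ASSUMPTION: treat IDs as case-insensitive; drop this if yours are not.
    return collapsed.casefold()


def same_tenant(a: str, b: str) -> bool:
    """Compare two tenant IDs after normalization."""
    return normalize_tenant_id(a) == normalize_tenant_id(b)


if __name__ == "__main__":
    print(same_tenant("  Acme\n", "acme"))
```

Applying the same function on both the write path (ingest) and the read path (list, search, delete) is what keeps the two sides consistent.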

### HTTP Endpoint Support
- Added GET support for `/rag/list` endpoint (previously POST-only)
- Added DELETE support for `/rag/delete/{document_id}` and `/rag/delete-all` endpoints
- The list and delete endpoints accept both MCP-style calls (POST with a JSON payload) and direct HTTP methods (GET/DELETE with query parameters)

### Response Format
- MCP server responses are wrapped in a standard format with `status`, `data`, and `metadata` fields
- RAG client automatically unwraps responses for seamless integration
- Error responses include detailed messages for better debugging
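
A client-side unwrapping step for the `status`/`data`/`metadata` envelope could be sketched as follows. The success values (`"ok"`, `"success"`), the `message` field on errors, and the `MCPError` class are assumptions; the actual RAG client may differ.

```python
from typing import Any

# ASSUMPTION: these status strings mark a successful response.
SUCCESS_STATUSES = {"ok", "success"}


class MCPError(RuntimeError):
    """Hypothetical error type raised for a failed MCP call."""


def unwrap_response(envelope: dict[str, Any]) -> Any:
    """Return the `data` field of a wrapped MCP response."""
    if "status" not in envelope:
        # Not wrapped (or already unwrapped): pass it through untouched.
        return envelope
    if envelope["status"] not in SUCCESS_STATUSES:
        # ASSUMPTION: error details live in a "message" field.
        raise MCPError(envelope.get("message", "MCP call failed"))
    return envelope.get("data")


if __name__ == "__main__":
    wrapped = {"status": "ok", "data": [{"id": 1}], "metadata": {"took_ms": 12}}
    print(unwrap_response(wrapped))
```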

### RAG Search Enhancements
- **Lowered default threshold** from 0.5 to 0.3 for improved recall of relevant documents
- **Fallback to the top result**: if no hit meets the threshold, the single best-scoring result is still returned, so a search against a non-empty knowledge base always yields at least one candidate
- **Configurable threshold** via `threshold` parameter in search requests (default: 0.3)
- **Enhanced tool selection** automatically triggers RAG for admin questions, fact lookups ("who is", "what is"), and internal knowledge queries
- **Response unwrapping** in MCP client ensures orchestrator receives properly formatted results for tool scoring and prompt building
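
The threshold-plus-fallback behavior can be illustrated with a small sketch. It assumes each hit is a dict with a numeric `score` key where higher means more similar; the function name `filter_hits` is hypothetical and this is not the server's actual code.

```python
DEFAULT_THRESHOLD = 0.3  # default documented above


def filter_hits(hits: list[dict], threshold: float = DEFAULT_THRESHOLD) -> list[dict]:
    """Keep hits at or above the threshold, falling back to the best one."""
    # ASSUMPTION: each hit carries a "score" where higher is more similar.
    ranked = sorted(hits, key=lambda hit: hit["score"], reverse=True)
    passing = [hit for hit in ranked if hit["score"] >= threshold]
    if passing:
        return passing
    # Fallback: nothing cleared the threshold, so return only the top hit
    # (or an empty list when there were no hits at all).
    return ranked[:1]


if __name__ == "__main__":
    hits = [{"id": "a", "score": 0.12}, {"id": "b", "score": 0.25}]
    print(filter_hits(hits))
```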

### Conversation Memory System
- **Short-Term Memory**: Automatic storage of tool outputs per session with configurable size limits (default: 10 outputs) and TTL (default: 900 seconds / 15 minutes)
- **Session-Based Isolation**: Memory is keyed by `session_id` (not `tenant_id`) for safety, ensuring no cross-tenant data mixing
- **Automatic Injection**: Recent memory is automatically injected into tool payloads as a `memory` field, enabling tools to make context-aware decisions in multi-step workflows
- **Auto-Expiration**: Memory entries automatically expire after TTL or can be explicitly cleared via `end_session`/`endSession` flag
- **Configuration**: Tune behavior via environment variables:
  - `MCP_MEMORY_MAX_ITEMS`: Maximum number of tool outputs to keep per session (default: 10)
  - `MCP_MEMORY_TTL_SECONDS`: Time-to-live for memory entries in seconds (default: 900)
- **Comprehensive Testing**: Full test suite in `backend/tests/test_conversation_memory.py` covering storage, retrieval, expiration, and multi-step workflows
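
To make the size limit, TTL, and injection behavior concrete, here is a minimal in-process sketch. The class `SessionMemory` and its method names are hypothetical; only the two environment variables, their defaults, and the `memory` payload field come from the description above.

```python
import os
import time
from collections import deque
from typing import Any, Callable

MAX_ITEMS = int(os.getenv("MCP_MEMORY_MAX_ITEMS", "10"))
TTL_SECONDS = float(os.getenv("MCP_MEMORY_TTL_SECONDS", "900"))


class SessionMemory:
    """Hypothetical short-term store of tool outputs, keyed by session_id."""

    def __init__(self, max_items: int = MAX_ITEMS, ttl: float = TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_items = max_items
        self._ttl = ttl
        self._clock = clock
        self._store: dict[str, deque] = {}

    def remember(self, session_id: str, output: Any) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full.
        entries = self._store.setdefault(session_id, deque(maxlen=self._max_items))
        entries.append((self._clock(), output))

    def recall(self, session_id: str) -> list[Any]:
        entries = self._store.get(session_id)
        if not entries:
            return []
        cutoff = self._clock() - self._ttl
        while entries and entries[0][0] < cutoff:
            entries.popleft()  # expire entries older than the TTL
        return [output for _, output in entries]

    def inject(self, session_id: str, payload: dict[str, Any]) -> dict[str, Any]:
        # Returns a copy of the payload with recent outputs under "memory".
        return {**payload, "memory": self.recall(session_id)}

    def end_session(self, session_id: str) -> None:
        self._store.pop(session_id, None)


if __name__ == "__main__":
    memory = SessionMemory()
    memory.remember("session-1", {"tool": "rag.search", "hits": 3})
    print(memory.inject("session-1", {"query": "follow-up question"}))
```

Passing the clock in as a parameter keeps expiry testable without sleeping.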

### UI Enhancements (app.py)
- **Knowledge Base Library Tab**: 
  - Statistics cards showing document counts by type
  - Interactive Plotly pie chart for document type distribution
  - Semantic search with relevance scoring
  - Type filtering (text, PDF, FAQ, link)
  - Document management with preview and deletion
  - Auto-refresh after operations

- **Admin Analytics Tab**:
  - Statistics cards for key metrics (queries, users, red flags, RAG searches)
  - Interactive Plotly bar charts for tool usage, latency, and RAG quality
  - Detailed tool usage table with performance metrics
  - Formatted summary with dark theme styling
  - Real-time data fetching and visualization
  - **Access**: All roles can view analytics (viewer, editor, admin, owner)

- **Debug & Reasoning Tab**:
  - Reasoning trace analyzer showing step-by-step agent decision-making
  - Tool invocation timeline with latency visualization
  - Formatted markdown output with detailed metrics
  - Uses `/agent/debug` endpoint for comprehensive insights

- **Modern UI/UX**:
  - Dark theme with white text for better readability
  - Custom CSS styling for cards and charts
  - Improved error handling and status messages
  - Responsive layout with proper component scaling

### Real-Time Visualization Components (Next.js Frontend)

The Next.js frontend includes three visualization components:

- **Reasoning Path Visualizer**: Step-by-step visualization of agent reasoning with animated progression, status indicators, and detailed metrics. Integrated into chat panel.
- **Tool Invocation Timeline**: Visual timeline showing tool execution order, latency, and result counts. Integrated into chat panel.
- **Tenant Activity Heatmap**: Query activity heatmap and per-tool usage trends. Integrated into analytics page.

All visualizations are accessible to all roles and automatically populate when agent responses include `reasoning_trace` and `tool_traces` data.

## Environment Variables (excerpt)

Defined in `env.example`:

- `RAG_MCP_URL` - Default: `http://localhost:8900/rag` (unified MCP server)
- `WEB_MCP_URL` - Default: `http://localhost:8900/web` (unified MCP server for Google web search)
- `ADMIN_MCP_URL` - Default: `http://localhost:8900/admin` (unified MCP server)
- `MCP_PORT` - Port for unified MCP server (default: 8900)
- `MCP_HOST` - Host for unified MCP server (default: 0.0.0.0)
- `POSTGRESQL_URL` - PostgreSQL connection string with pgvector extension
- `OLLAMA_URL`, `OLLAMA_MODEL` (or `GROQ_API_KEY` + `LLM_BACKEND=groq`)
- `SUPABASE_URL`, `SUPABASE_SERVICE_KEY` - **Required for Supabase backend** (admin rules + analytics)
  - If not set, the system automatically falls back to SQLite in `data/` directory
  - See `SUPABASE_SETUP.md` in the root directory for detailed setup instructions
- `GOOGLE_SEARCH_API_KEY`, `GOOGLE_SEARCH_CX_ID` - Credentials for Google Programmable Search used by `web.search`
- `MCP_MEMORY_MAX_ITEMS` - Maximum number of tool outputs to keep per session (default: 10)
- `MCP_MEMORY_TTL_SECONDS` - Time-to-live for memory entries in seconds (default: 900)
- `APP_ENV`, `LOG_LEVEL`, `API_PORT`

Update these before starting the servers to ensure the agent can reach every MCP endpoint and LLM runtime.

**Note**: The unified MCP server runs on a single port (default 8900) and handles all namespaced tools. The `start.bat` script automatically configures the correct URLs.
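
For illustration, the MCP-related variables and their documented defaults could be loaded as in the sketch below. The `MCPSettings` dataclass and `load_settings` function are hypothetical names; the backend may read its configuration differently.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class MCPSettings:
    """Hypothetical container for the MCP-related settings listed above."""
    rag_url: str
    web_url: str
    admin_url: str
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> MCPSettings:
    """Read settings from the environment, using the documented defaults."""
    return MCPSettings(
        rag_url=env.get("RAG_MCP_URL", "http://localhost:8900/rag"),
        web_url=env.get("WEB_MCP_URL", "http://localhost:8900/web"),
        admin_url=env.get("ADMIN_MCP_URL", "http://localhost:8900/admin"),
        host=env.get("MCP_HOST", "0.0.0.0"),
        port=int(env.get("MCP_PORT", "8900")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Note that changing `MCP_PORT` alone does not update the three URL defaults, so set the URLs explicitly when moving the server off port 8900.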

## Supabase Configuration

Both `RulesStore` and `AnalyticsStore` support two storage backends, Supabase and SQLite, and detect automatically which one to use.

### Setup Steps

1. **Create Supabase tables**:
   - Run `supabase_admin_rules_table.sql` in Supabase SQL Editor (from repo root)
   - Run `supabase_analytics_tables.sql` in Supabase SQL Editor (from repo root)

2. **Configure environment variables** in `.env`:
   ```env
   SUPABASE_URL=https://your-project-id.supabase.co
   SUPABASE_SERVICE_KEY=your_service_role_key_here
   ```

3. **Verify configuration**:
   ```bash
   python verify_supabase_setup.py
   ```

4. **Migrate existing data** (if you have SQLite data):
   ```bash
   python migrate_sqlite_to_supabase.py
   ```

### How It Works

- **Automatic Detection**: Both stores check for `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` at initialization
- **Supabase First**: If credentials are found, Supabase is used automatically
- **SQLite Fallback**: If Supabase is not configured, SQLite databases in `data/` are used
- **Startup Logging**: Check startup logs to see which backend each store is using:
  - `✅ RulesStore: Using Supabase backend`
  - `✅ AnalyticsStore: Using Supabase backend`
  - Or `⚠️  RulesStore: Using SQLite backend` if Supabase is not configured
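
The detection logic above amounts to a simple check at initialization. The sketch below is an illustration under assumptions, not the stores' real code: `detect_backend` and `startup_message` are hypothetical helpers, and treating whitespace-only values as missing is a guess.

```python
import os
from collections.abc import Mapping


def detect_backend(env: Mapping[str, str] = os.environ) -> str:
    """Return "supabase" when both credentials are present, else "sqlite"."""
    # ASSUMPTION: blank or whitespace-only values count as missing.
    url = env.get("SUPABASE_URL", "").strip()
    key = env.get("SUPABASE_SERVICE_KEY", "").strip()
    return "supabase" if url and key else "sqlite"


def startup_message(store_name: str, backend: str) -> str:
    """Format a startup log line in the style shown above."""
    if backend == "supabase":
        return f"✅ {store_name}: Using Supabase backend"
    return f"⚠️  {store_name}: Using SQLite backend"


if __name__ == "__main__":
    backend = detect_backend()
    for store in ("RulesStore", "AnalyticsStore"):
        print(startup_message(store, backend))
```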

### Tables Used

- **Admin Rules**: `admin_rules` table in Supabase
- **Analytics**: `tool_usage_events`, `redflag_violations`, `rag_search_events`, `agent_query_events`

See `SUPABASE_SETUP.md` and `SUPABASE_MIGRATION_COMPLETE.md` in the root directory for detailed instructions and troubleshooting.

## Unified MCP Tool Instructions

Agents that speak the Model Context Protocol should connect to the `integrachat` server id defined in `backend/mcp_server/server.py` and call the namespaced tools directly:

| Namespace | Tool | Purpose | HTTP Endpoint |
| --- | --- | --- | --- |
| `rag` | `search` | Retrieve tenant-scoped document chunks | `POST /rag/search` |
| `rag` | `ingest` | Chunk + store new knowledge | `POST /rag/ingest` |
| `rag` | `list` | List all documents for tenant | `GET /rag/list?tenant_id={id}` |
| `rag` | `delete` | Remove one/all stored documents | `DELETE /rag/delete/{id}?tenant_id={id}` or `DELETE /rag/delete-all?tenant_id={id}` |
| `web` | `search` | Google Programmable Search (Custom Search API) | `POST /web/search` |
| `admin` | `getRules` | Fetch tenant governance rules (list or detailed) | `POST /admin/getRules` |
| `admin` | `addRule` | Insert or update a rule | `POST /admin/addRule` |
| `admin` | `deleteRule` | Remove a rule by text | `POST /admin/deleteRule` |
| `admin` | `logViolation` | Persist a red-flag event into analytics | `POST /admin/logViolation` |

**Important Notes:**
- Always send `tenant_id` in the payload (or as query parameter for GET/DELETE requests) so the shared middleware can enforce isolation and log analytics
- The MCP server normalizes tenant IDs (including stray whitespace), so documents can be listed and deleted consistently across operations
- List and delete operations accept either a POST with a JSON payload or the direct HTTP method (GET for list, DELETE for delete)
- RAG search uses a default threshold of 0.3 for better recall; adjust via `threshold` parameter if needed
- **Conversation Memory**: Send `session_id` (or `sessionId`/`conversation_id`/`conversationId`) in tool payloads to enable short-term memory. Recent tool outputs are automatically stored and injected into subsequent tool calls as a `memory` field. Send `end_session: true` to clear memory for a session.

## Troubleshooting

### RAG Search Not Returning Results
- **Check similarity threshold**: The default threshold is 0.3. If results are still not found, try lowering it to 0.2 or 0.1
- **Verify documents are ingested**: Use `GET /rag/list?tenant_id={id}` to confirm documents exist for the tenant
- **Check tenant ID matching**: Ensure the tenant_id used for search matches the one used for ingestion (normalization handles whitespace automatically)
- **Review search logs**: Check MCP server logs for search metrics (hits_count, avg_score, top_score)
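
When deciding how far to lower the threshold, it can help to see how many candidates would pass at each level. The sketch below takes a list of raw similarity scores (for example, copied from the logs) and reports summary metrics alongside pass counts. The helper name and the reading of `hits_count` as the number of raw candidates are assumptions.

```python
def diagnose_scores(scores: list[float],
                    thresholds: tuple[float, ...] = (0.3, 0.2, 0.1)):
    """Summarize scores and count how many pass at each threshold."""
    metrics = {
        # ASSUMPTION: hits_count here means the number of raw candidates.
        "hits_count": len(scores),
        "avg_score": sum(scores) / len(scores) if scores else 0.0,
        "top_score": max(scores, default=0.0),
    }
    passing = {t: sum(1 for s in scores if s >= t) for t in thresholds}
    return metrics, passing


if __name__ == "__main__":
    metrics, passing = diagnose_scores([0.25, 0.125])
    print(metrics)
    print(passing)
```

If `top_score` sits just below the current threshold, lowering it one step is usually enough; if every score is near zero, the content is more likely missing than mis-ranked.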

### Agent Not Using RAG for Knowledge Base Questions
- **Verify RAG results are being found**: Check the agent debug endpoint (`POST /agent/debug`) to see if RAG results are being pre-fetched
- **Check tool scores**: The debug output shows `rag_fitness` score; if it's low (< 0.4), the agent may skip RAG
- **Ensure knowledge base content exists**: Questions like "who is the admin" require relevant content in the knowledge base
- **Pattern matching**: The tool selector automatically triggers RAG for patterns like "admin", "who is", "what is", but semantic similarity also plays a role
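
The interplay between pattern triggers and the `rag_fitness` score might be pictured as below. This is only a sketch: the regular expression, the way the two signals are combined (a simple OR), and the function name are assumptions, and the real tool selector is likely more nuanced.

```python
import re

# Patterns named in this section; the real selector may use more.
RAG_TRIGGER = re.compile(r"\b(admin|who is|what is)\b", re.IGNORECASE)
RAG_FITNESS_CUTOFF = 0.4  # below this, the agent may skip RAG


def should_use_rag(query: str, rag_fitness: float) -> bool:
    """Guess whether RAG would be selected for a query."""
    # ASSUMPTION: a pattern match OR a high enough fitness score is sufficient.
    pattern_hit = RAG_TRIGGER.search(query) is not None
    return pattern_hit or rag_fitness >= RAG_FITNESS_CUTOFF


if __name__ == "__main__":
    print(should_use_rag("Who is the admin?", rag_fitness=0.1))
    print(should_use_rag("Tell me a joke", rag_fitness=0.2))
```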

### Document Deletion Issues
- **404 Not Found**: Verify the document_id exists and belongs to the correct tenant
- **Tenant ID mismatch**: The system normalizes tenant IDs, but ensure you're using the same tenant_id format as when documents were ingested
- **Check logs**: Database deletion logs show detailed information about tenant ID matching and document existence

### Supabase Configuration Issues
- **Data still going to SQLite**: Check that `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` are set correctly in `.env` (no quotes, no spaces)
- **Service role key errors**: Make sure you're using the **service_role** key (not anon key) from Supabase Dashboard → Settings → API
- **Tables don't exist**: Run `supabase_admin_rules_table.sql` and `supabase_analytics_tables.sql` in Supabase SQL Editor
- **Permission errors**: Check RLS policies in Supabase allow service role access
- **Startup warnings**: Check FastAPI startup logs to see which backend each store is using (`✅` for Supabase, `⚠️` for SQLite fallback)