# Backend Documentation

This folder contains the production-ready FastAPI stack plus the companion MCP servers that power IntegraChat.

## Directory Overview

- `api/` – FastAPI application (routes, services, storage helpers, MCP clients)
- `mcp_server/` – Unified MCP server exposing rag/web/admin tools via namespaces
- `workers/` – Celery workers and schedulers for async ingestion + analytics maintenance

## Prerequisites

- Python 3.10+
- PostgreSQL (with the `vector` extension) for RAG data, or Supabase with pgvector enabled
- **Supabase (recommended)** for admin rules + analytics storage, with automatic SQLite fallback in `data/`
  - Both `RulesStore` and `AnalyticsStore` automatically detect and use Supabase when configured
  - Falls back to SQLite automatically if Supabase credentials are missing
  - See `SUPABASE_SETUP.md` in the root directory for setup instructions
- Optional: Ollama running locally (default) or Groq API credentials for remote LLMs

Create a virtual environment at the repo root, then:

```bash
pip install -r requirements.txt
cp env.example .env   # update MCP URLs + LLM settings
```

## Running the Services Locally

1. **FastAPI core**  
   ```bash
   uvicorn backend.api.main:app --port 8000 --reload
   ```

2. **Unified MCP server (rag/web/admin)**
   ```bash
   python backend/mcp_server/server.py
   ```
   Or use the provided startup script:
   ```bash
   start.bat  # Windows - launches MCP server on port 8900 and FastAPI on port 8000
   ```
   
   This single server (default port 8900) exposes the following namespaced tools:
   - `rag.search` - Semantic search across tenant documents
   - `rag.ingest` - Ingest text content into knowledge base
   - `rag.delete` - Delete individual or all documents for a tenant
   - `rag.list` - List all documents for a tenant with pagination
   - `web.search` - Google Programmable Search (Custom Search API) web search
   - `admin.getRules`, `admin.addRule`, `admin.deleteRule`, `admin.logViolation`
   
   **HTTP Endpoints** (for direct API access):
   - `GET /rag/list?tenant_id={id}&limit={n}&offset={n}` - List documents
   - `POST /rag/ingest` - Ingest content
   - `POST /rag/search` - Search documents (supports `threshold` parameter, default: 0.3)
   - `DELETE /rag/delete/{document_id}?tenant_id={id}` - Delete specific document
   - `DELETE /rag/delete-all?tenant_id={id}` - Delete all documents
   - `POST /web/search` - Web search
   - `POST /admin/*` - Admin operations

3. **Optional workers** (if running Celery-based ingestion/analytics jobs):
   ```bash
   celery -A backend.workers.ingestion_worker worker --loglevel=info
   celery -A backend.workers.analytics_worker worker --loglevel=info
   ```

The Gradio UI (`python app.py`) and the Next.js operator console (see `frontend/README.md`) both talk to the FastAPI layer at `http://localhost:8000`.
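
The HTTP endpoints listed under step 2 can be exercised with the standard library alone. The sketch below assumes the default unified MCP server port (8900) and the payload field names shown in the endpoint list; adjust `MCP_BASE` for your deployment:

```python
import json
import urllib.parse
import urllib.request

MCP_BASE = "http://localhost:8900"  # default unified MCP server port

def list_documents_url(tenant_id: str, limit: int = 20, offset: int = 0) -> str:
    """Build the GET /rag/list URL for a tenant."""
    query = urllib.parse.urlencode(
        {"tenant_id": tenant_id, "limit": limit, "offset": offset}
    )
    return f"{MCP_BASE}/rag/list?{query}"

def search_request(tenant_id: str, query: str, threshold: float = 0.3):
    """Build a POST /rag/search request (threshold defaults to 0.3)."""
    payload = json.dumps(
        {"tenant_id": tenant_id, "query": query, "threshold": threshold}
    ).encode()
    return urllib.request.Request(
        f"{MCP_BASE}/rag/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example (requires a running MCP server):
# with urllib.request.urlopen(search_request("acme", "refund policy")) as resp:
#     print(json.load(resp))
```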

## Key Endpoints

All endpoints require the `x-tenant-id` header unless otherwise noted.

| Service | Path | Notes |
| --- | --- | --- |
| Agent | `POST /agent/message` | Autonomous orchestration (RAG/Web/Admin/LLM) |
| Agent Debug | `POST /agent/debug` | Full reasoning trace + tool plan |
| Agent Plan | `POST /agent/plan` | Dry-run planning without executing tools |
| RAG | `POST /rag/ingest-document` | Rich ingestion (text, URL, metadata) |
| RAG | `POST /rag/ingest-file` | File upload (PDF/DOCX/TXT/MD) |
| RAG | `GET /rag/list` | Paginated document listing per tenant (requires `x-tenant-id` header) |
| RAG | `DELETE /rag/delete/{document_id}` | Delete specific document (requires `x-tenant-id` header) |
| RAG | `DELETE /rag/delete-all` | Delete all documents for tenant (requires `x-tenant-id` header) |
| Admin | `POST /admin/rules` | Regex + severity rule ingestion |
| Analytics | `GET /analytics/overview` | Summary metrics (queries, tokens, red flags) |

Refer to the root `README.md` for the complete endpoint tables.
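
A minimal client call showing the required `x-tenant-id` header, using only the standard library. The `{"message": ...}` payload shape is an assumption here; consult the API schema for the exact request body:

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # FastAPI core

def agent_message_request(tenant_id: str, message: str) -> urllib.request.Request:
    """Build a POST /agent/message request with the x-tenant-id header set."""
    return urllib.request.Request(
        f"{API_BASE}/agent/message",
        data=json.dumps({"message": message}).encode(),
        headers={"Content-Type": "application/json", "x-tenant-id": tenant_id},
        method="POST",
    )

# Example (requires the FastAPI server running):
# with urllib.request.urlopen(agent_message_request("acme", "hello")) as resp:
#     print(json.load(resp))
```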

## Diagnostics & Tenant Isolation

Use the helper scripts in the repo root when validating backend changes:

- `python verify_tenant_isolation.py` – Exercises analytics logging, admin rule CRUD, API reachability, and proves RAG tenant isolation by ingesting + querying as multiple tenants.
- `python check_rag_database.py` – Talks directly to the pgvector database to list tenant IDs, preview stored chunks, and run safeguarded searches via `search_vectors()`. Helpful when troubleshooting suspected cross-tenant leakage.
- `python verify_supabase_setup.py` – Verifies Supabase configuration and shows which backend (Supabase or SQLite) each store is using.
- `python check_supabase_rules.py` – Checks Supabase admin rules configuration and RLS policies.
- `python migrate_sqlite_to_supabase.py` – One-shot migration script to copy existing SQLite data to Supabase.
- `python test_manual.py` – Legacy manual smoke test harness (analytics store, admin rules, API surface).

> **Troubleshooting tip:** If the isolation script reports a failure, first run `check_rag_database.py` to confirm documents are tagged with the correct `tenant_id`, then restart the unified MCP server so it reloads the updated SQL filtering logic.

## Recent Improvements

### Tenant ID Normalization
- All database operations normalize tenant IDs, so whitespace and formatting differences no longer prevent matching
- Documents can be listed and deleted consistently even if they were ingested with slightly different `tenant_id` formatting
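
As a rough illustration of the idea (the actual normalization logic lives in the backend and may differ), matching after whitespace normalization could look like:

```python
def normalize_tenant_id(raw: str) -> str:
    """Illustrative normalization: trim surrounding whitespace and collapse
    internal runs of whitespace so IDs that differ only in spacing match."""
    return " ".join(raw.split())

# normalize_tenant_id("  acme ") and normalize_tenant_id("acme")
# now compare equal, so list/delete operations find the same documents.
```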

### HTTP Endpoint Support
- Added GET support for `/rag/list` endpoint (previously POST-only)
- Added DELETE support for `/rag/delete/{document_id}` and `/rag/delete-all` endpoints
- All endpoints support both MCP protocol (POST with JSON payload) and direct HTTP methods (GET/DELETE with query parameters)

### Response Format
- MCP server responses are wrapped in a standard format with `status`, `data`, and `metadata` fields
- RAG client automatically unwraps responses for seamless integration
- Error responses include detailed messages for better debugging
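
A client-side unwrap of that envelope might look like the following sketch. The `"success"` status value and the error-message shape are assumptions; check the actual server responses before relying on them:

```python
def unwrap_mcp_response(resp: dict) -> dict:
    """Unwrap the {status, data, metadata} envelope; raise on non-success."""
    if resp.get("status") != "success":
        raise RuntimeError(str(resp.get("data", "MCP call failed")))
    return resp.get("data", {})
```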

### RAG Search Enhancements
- **Lowered default threshold** from 0.5 to 0.3 for improved recall of relevant documents
- **Intelligent fallback mechanism** returns the top result even if similarity score is below threshold, ensuring knowledge base content is always accessible
- **Configurable threshold** via `threshold` parameter in search requests (default: 0.3)
- **Enhanced tool selection** automatically triggers RAG for admin questions, fact lookups ("who is", "what is"), and internal knowledge queries
- **Response unwrapping** in MCP client ensures orchestrator receives properly formatted results for tool scoring and prompt building
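
Assuming hits arrive sorted by score in descending order, the threshold-plus-fallback behavior described above can be sketched as (a simplified illustration, not the backend's actual code):

```python
def filter_with_fallback(hits: list, threshold: float = 0.3) -> list:
    """Keep hits at or above the threshold; if none qualify but hits exist,
    return the single best hit so KB content is never silently dropped."""
    kept = [h for h in hits if h["score"] >= threshold]
    if kept:
        return kept
    return hits[:1] if hits else []
```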

### Conversation Memory System
- **Short-Term Memory**: Automatic storage of tool outputs per session with configurable size limits (default: 10 outputs) and TTL (default: 900 seconds / 15 minutes)
- **Session-Based Isolation**: Memory is keyed by `session_id` (not `tenant_id`) for safety, ensuring no cross-tenant data mixing
- **Automatic Injection**: Recent memory is automatically injected into tool payloads as a `memory` field, enabling tools to make context-aware decisions in multi-step workflows
- **Auto-Expiration**: Memory entries automatically expire after TTL or can be explicitly cleared via `end_session`/`endSession` flag
- **Configuration**: Tune behavior via environment variables:
  - `MCP_MEMORY_MAX_ITEMS`: Maximum number of tool outputs to keep per session (default: 10)
  - `MCP_MEMORY_TTL_SECONDS`: Time-to-live for memory entries in seconds (default: 900)
- **Comprehensive Testing**: Full test suite in `backend/tests/test_conversation_memory.py` covering storage, retrieval, expiration, and multi-step workflows
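
The storage and expiration behavior described above can be sketched as a small TTL- and size-bounded store. This is a simplified illustration with hypothetical names, not the actual implementation in `backend/`:

```python
import time
from collections import defaultdict, deque

class SessionMemory:
    """TTL- and size-bounded per-session store for recent tool outputs."""

    def __init__(self, max_items: int = 10, ttl_seconds: float = 900.0):
        self.max_items = max_items
        self.ttl = ttl_seconds
        self._store = defaultdict(deque)  # session_id -> deque of (ts, output)

    def add(self, session_id: str, output) -> None:
        entries = self._store[session_id]
        entries.append((time.monotonic(), output))
        while len(entries) > self.max_items:  # enforce size bound
            entries.popleft()

    def recent(self, session_id: str) -> list:
        """Return unexpired outputs, oldest first."""
        now = time.monotonic()
        return [o for ts, o in self._store[session_id] if now - ts < self.ttl]

    def end_session(self, session_id: str) -> None:
        """Explicit clear, mirroring the end_session/endSession flag."""
        self._store.pop(session_id, None)
```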

### UI Enhancements (app.py)
- **Knowledge Base Library Tab**: 
  - Statistics cards showing document counts by type
  - Interactive Plotly pie chart for document type distribution
  - Semantic search with relevance scoring
  - Type filtering (text, PDF, FAQ, link)
  - Document management with preview and deletion
  - Auto-refresh after operations

- **Admin Analytics Tab**:
  - Statistics cards for key metrics (queries, users, red flags, RAG searches)
  - Interactive Plotly bar charts for tool usage, latency, and RAG quality
  - Detailed tool usage table with performance metrics
  - Formatted summary with dark theme styling
  - Real-time data fetching and visualization
  - **Access**: All roles can view analytics (viewer, editor, admin, owner)

- **Debug & Reasoning Tab**:
  - Reasoning trace analyzer showing step-by-step agent decision-making
  - Tool invocation timeline with latency visualization
  - Formatted markdown output with detailed metrics
  - Uses `/agent/debug` endpoint for comprehensive insights

- **Modern UI/UX**:
  - Dark theme with white text for better readability
  - Custom CSS styling for cards and charts
  - Improved error handling and status messages
  - Responsive layout with proper component scaling

### Real-Time Visualization Components (Next.js Frontend)

The Next.js frontend includes three powerful visualization components:

- **Reasoning Path Visualizer**: Step-by-step visualization of agent reasoning with animated progression, status indicators, and detailed metrics. Integrated into chat panel.
- **Tool Invocation Timeline**: Visual timeline showing tool execution order, latency, and result counts. Integrated into chat panel.
- **Tenant Activity Heatmap**: Query activity heatmap and per-tool usage trends. Integrated into analytics page.

All visualizations are accessible to all roles and automatically populate when agent responses include `reasoning_trace` and `tool_traces` data.

## Environment Variables (excerpt)

Defined in `env.example`:

- `RAG_MCP_URL` - Default: `http://localhost:8900/rag` (unified MCP server)
- `WEB_MCP_URL` - Default: `http://localhost:8900/web` (unified MCP server for Google web search)
- `ADMIN_MCP_URL` - Default: `http://localhost:8900/admin` (unified MCP server)
- `MCP_PORT` - Port for unified MCP server (default: 8900)
- `MCP_HOST` - Host for unified MCP server (default: 0.0.0.0)
- `POSTGRESQL_URL` - PostgreSQL connection string with pgvector extension
- `OLLAMA_URL`, `OLLAMA_MODEL` (or `GROQ_API_KEY` + `LLM_BACKEND=groq`)
- `SUPABASE_URL`, `SUPABASE_SERVICE_KEY` - **Required for Supabase backend** (admin rules + analytics)
  - If not set, the system automatically falls back to SQLite in `data/` directory
  - See `SUPABASE_SETUP.md` in the root directory for detailed setup instructions
- `GOOGLE_SEARCH_API_KEY`, `GOOGLE_SEARCH_CX_ID` - Credentials for Google Programmable Search used by `web.search`
- `MCP_MEMORY_MAX_ITEMS` - Maximum number of tool outputs to keep per session (default: 10)
- `MCP_MEMORY_TTL_SECONDS` - Time-to-live for memory entries in seconds (default: 900)
- `APP_ENV`, `LOG_LEVEL`, `API_PORT`

Update these before starting the servers to ensure the agent can reach every MCP endpoint and LLM runtime.

**Note**: The unified MCP server runs on a single port (default 8900) and handles all namespaced tools. The `start.bat` script automatically configures the correct URLs.

## Supabase Configuration

Both `RulesStore` and `AnalyticsStore` support dual-backend storage with automatic detection:

### Setup Steps

1. **Create Supabase tables**:
   - Run `supabase_admin_rules_table.sql` in Supabase SQL Editor (from repo root)
   - Run `supabase_analytics_tables.sql` in Supabase SQL Editor (from repo root)

2. **Configure environment variables** in `.env`:
   ```env
   SUPABASE_URL=https://your-project-id.supabase.co
   SUPABASE_SERVICE_KEY=your_service_role_key_here
   ```

3. **Verify configuration**:
   ```bash
   python verify_supabase_setup.py
   ```

4. **Migrate existing data** (if you have SQLite data):
   ```bash
   python migrate_sqlite_to_supabase.py
   ```

### How It Works

- **Automatic Detection**: Both stores check for `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` at initialization
- **Supabase First**: If credentials are found, Supabase is used automatically
- **SQLite Fallback**: If Supabase is not configured, SQLite databases in `data/` are used
- **Startup Logging**: Check startup logs to see which backend each store is using:
  - `βœ… RulesStore: Using Supabase backend`
  - `βœ… AnalyticsStore: Using Supabase backend`
  - Or `⚠️  RulesStore: Using SQLite backend` if Supabase is not configured
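
The detection step can be sketched as follows (a simplified illustration of the logic described above, not the stores' actual initialization code):

```python
import os

def choose_backend(env=os.environ) -> str:
    """Supabase when both credentials are present, else SQLite in data/."""
    if env.get("SUPABASE_URL") and env.get("SUPABASE_SERVICE_KEY"):
        return "supabase"
    return "sqlite"
```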

### Tables Used

- **Admin Rules**: `admin_rules` table in Supabase
- **Analytics**: `tool_usage_events`, `redflag_violations`, `rag_search_events`, `agent_query_events`

See `SUPABASE_SETUP.md` and `SUPABASE_MIGRATION_COMPLETE.md` in the root directory for detailed instructions and troubleshooting.

## Unified MCP tool instructions

Agents that speak the Model Context Protocol should connect to the `integrachat` server id defined in `backend/mcp_server/server.py` and call the namespaced tools directly:

| Namespace | Tool | Purpose | HTTP Endpoint |
| --- | --- | --- | --- |
| `rag` | `search` | Retrieve tenant-scoped document chunks | `POST /rag/search` |
| `rag` | `ingest` | Chunk + store new knowledge | `POST /rag/ingest` |
| `rag` | `list` | List all documents for tenant | `GET /rag/list?tenant_id={id}` |
| `rag` | `delete` | Remove one/all stored documents | `DELETE /rag/delete/{id}?tenant_id={id}` or `DELETE /rag/delete-all?tenant_id={id}` |
| `web` | `search` | Google Programmable Search (Custom Search API) | `POST /web/search` |
| `admin` | `getRules` | Fetch tenant governance rules (list or detailed) | `POST /admin/getRules` |
| `admin` | `addRule` | Insert or update a rule | `POST /admin/addRule` |
| `admin` | `deleteRule` | Remove a rule by text | `POST /admin/deleteRule` |
| `admin` | `logViolation` | Persist a red-flag event into analytics | `POST /admin/logViolation` |

**Important Notes:**
- Always send `tenant_id` in the payload (or as query parameter for GET/DELETE requests) so the shared middleware can enforce isolation and log analytics
- The MCP server normalizes tenant IDs (trimming whitespace) so matching stays consistent and documents can be listed and deleted reliably across operations
- All endpoints support both POST (with JSON payload) and direct HTTP methods (GET for list, DELETE for delete operations)
- RAG search uses a default threshold of 0.3 for better recall; adjust via `threshold` parameter if needed
- **Conversation Memory**: Send `session_id` (or `sessionId`/`conversation_id`/`conversationId`) in tool payloads to enable short-term memory. Recent tool outputs are automatically stored and injected into subsequent tool calls as a `memory` field. Send `end_session: true` to clear memory for a session.

## Troubleshooting

### RAG Search Not Returning Results
- **Check similarity threshold**: The default threshold is 0.3. If results are still not found, try lowering it to 0.2 or 0.1
- **Verify documents are ingested**: Use `GET /rag/list?tenant_id={id}` to confirm documents exist for the tenant
- **Check tenant ID matching**: Ensure the tenant_id used for search matches the one used for ingestion (normalization handles whitespace automatically)
- **Review search logs**: Check MCP server logs for search metrics (hits_count, avg_score, top_score)

### Agent Not Using RAG for Knowledge Base Questions
- **Verify RAG results are being found**: Check the agent debug endpoint (`POST /agent/debug`) to see if RAG results are being pre-fetched
- **Check tool scores**: The debug output shows `rag_fitness` score; if it's low (< 0.4), the agent may skip RAG
- **Ensure knowledge base content exists**: Questions like "who is the admin" require relevant content in the knowledge base
- **Pattern matching**: The tool selector automatically triggers RAG for patterns like "admin", "who is", "what is", but semantic similarity also plays a role

### Document Deletion Issues
- **404 Not Found**: Verify the document_id exists and belongs to the correct tenant
- **Tenant ID mismatch**: The system normalizes tenant IDs, but ensure you're using the same tenant_id format as when documents were ingested
- **Check logs**: Database deletion logs show detailed information about tenant ID matching and document existence

### Supabase Configuration Issues
- **Data still going to SQLite**: Check that `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` are set correctly in `.env` (no quotes, no spaces)
- **Service role key errors**: Make sure you're using the **service_role** key (not anon key) from Supabase Dashboard β†’ Settings β†’ API
- **Tables don't exist**: Run `supabase_admin_rules_table.sql` and `supabase_analytics_tables.sql` in Supabase SQL Editor
- **Permission errors**: Check RLS policies in Supabase allow service role access
- **Startup warnings**: Check FastAPI startup logs to see which backend each store is using (`βœ…` for Supabase, `⚠️` for SQLite fallback)