# Backend Documentation
This folder contains the production-ready FastAPI stack plus the companion MCP servers that power IntegraChat.
## Directory Overview
- `api/` - FastAPI application (routes, services, storage helpers, MCP clients)
- `mcp_server/` - Unified MCP server exposing rag/web/admin tools via namespaces
- `workers/` - Celery workers and schedulers for async ingestion + analytics maintenance
## Prerequisites
- Python 3.10+
- PostgreSQL (with the `vector` extension) for RAG data, or Supabase with pgvector enabled
- SQLite (auto-created in `data/`) for analytics and admin rules
- Optional: Ollama running locally (default) or Groq API credentials for remote LLMs
Create a virtual environment at the repo root, then:
```bash
pip install -r requirements.txt
cp env.example .env # update MCP URLs + LLM settings
```
## Running the Services Locally
1. **FastAPI core**
```bash
uvicorn backend.api.main:app --port 8000 --reload
```
2. **Unified MCP server (rag/web/admin)**
```bash
python backend/mcp_server/server.py
```
Or use the provided startup script:
```bash
start.bat # Windows - launches MCP server on port 8900 and FastAPI on port 8000
```
This single server (default port 8900) exposes the following namespaced tools:
- `rag.search` - Semantic search across tenant documents
- `rag.ingest` - Ingest text content into knowledge base
- `rag.delete` - Delete individual or all documents for a tenant
- `rag.list` - List all documents for a tenant with pagination
- `web.search` - DuckDuckGo-based web search
- `admin.getRules`, `admin.addRule`, `admin.deleteRule`, `admin.logViolation`
**HTTP Endpoints** (for direct API access):
- `GET /rag/list?tenant_id={id}&limit={n}&offset={n}` - List documents
- `POST /rag/ingest` - Ingest content
- `POST /rag/search` - Search documents (supports `threshold` parameter, default: 0.3)
- `DELETE /rag/delete/{document_id}?tenant_id={id}` - Delete specific document
- `DELETE /rag/delete-all?tenant_id={id}` - Delete all documents
- `POST /web/search` - Web search
- `POST /admin/*` - Admin operations
3. **Optional workers** (if running Celery-based ingestion/analytics jobs):
```bash
celery -A backend.workers.ingestion_worker worker --loglevel=info
celery -A backend.workers.analytics_worker worker --loglevel=info
```
The Gradio UI (`python app.py`) and the Next.js operator console (see `frontend/README.md`) both talk to the FastAPI layer at `http://localhost:8000`.
## Key Endpoints
All endpoints require the `x-tenant-id` header unless otherwise noted.
| Service | Path | Notes |
| --- | --- | --- |
| Agent | `POST /agent/message` | Autonomous orchestration (RAG/Web/Admin/LLM) |
| Agent Debug | `POST /agent/debug` | Full reasoning trace + tool plan |
| Agent Plan | `POST /agent/plan` | Dry-run planning without executing tools |
| RAG | `POST /rag/ingest-document` | Rich ingestion (text, URL, metadata) |
| RAG | `POST /rag/ingest-file` | File upload (PDF/DOCX/TXT/MD) |
| RAG | `GET /rag/list` | Paginated document listing per tenant (requires `x-tenant-id` header) |
| RAG | `DELETE /rag/delete/{document_id}` | Delete specific document (requires `x-tenant-id` header) |
| RAG | `DELETE /rag/delete-all` | Delete all documents for tenant (requires `x-tenant-id` header) |
| Admin | `POST /admin/rules` | Regex + severity rule ingestion |
| Analytics | `GET /analytics/overview` | Summary metrics (queries, tokens, red flags) |
Refer to the root `README.md` for the complete endpoint tables.
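As a rough illustration of the `x-tenant-id` requirement, the sketch below builds (but does not send) a tenant-scoped request against the FastAPI layer. The helper name `build_tenant_request` and the tenant value are hypothetical; only the base URL, the header name, and the `/analytics/overview` path come from this README.

```python
import urllib.request

API_BASE_URL = "http://localhost:8000"  # FastAPI port used throughout this README


def build_tenant_request(path: str, tenant_id: str, method: str = "GET") -> urllib.request.Request:
    """Build (without sending) a request that carries the x-tenant-id header.

    Hypothetical helper for illustration; it is not part of the backend.
    """
    return urllib.request.Request(
        url=f"{API_BASE_URL}{path}",
        headers={"x-tenant-id": tenant_id},
        method=method,
    )


# Example: a tenant-scoped analytics summary request.
overview_request = build_tenant_request("/analytics/overview", "tenant-a")
```

With the API running, passing `overview_request` to `urllib.request.urlopen` should return the summary metrics for that tenant.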
## Diagnostics & Tenant Isolation
Use the helper scripts in the repo root when validating backend changes:
- `python verify_tenant_isolation.py` - Exercises analytics logging, admin rule CRUD, API reachability, and proves RAG tenant isolation by ingesting + querying as multiple tenants.
- `python check_rag_database.py` - Talks directly to the pgvector database to list tenant IDs, preview stored chunks, and run safeguarded searches via `search_vectors()`. Helpful when troubleshooting suspected cross-tenant leakage.
- `python test_manual.py` - Legacy manual smoke test harness (analytics store, admin rules, API surface).
> **Troubleshooting tip:** If the isolation script reports a failure, first run `check_rag_database.py` to confirm documents are tagged with the correct `tenant_id`, then restart the unified MCP server so it reloads the updated SQL filtering logic.
## Recent Improvements
### Tenant ID Normalization
- All database operations now normalize tenant IDs to handle whitespace and formatting differences
- Documents can be listed and deleted consistently even if stored with slightly different `tenant_id` formatting
- The system automatically matches tenant IDs after normalization, ensuring operations work across different input formats
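A minimal sketch of what such normalization could look like is shown below. The function name is hypothetical, and the exact rules are assumptions: this README only states that whitespace and formatting differences are handled, so the lowercasing step in particular may not match the real implementation.

```python
def normalize_tenant_id(raw: str) -> str:
    """Hypothetical normalizer: trim surrounding whitespace and lowercase.

    Lowercasing is an assumption made for this sketch; the README only
    mentions whitespace and formatting differences.
    """
    return raw.strip().lower()


def same_tenant(stored: str, requested: str) -> bool:
    """Compare two tenant IDs after normalization."""
    return normalize_tenant_id(stored) == normalize_tenant_id(requested)
```

Under these assumptions, a document stored as `" Tenant-A\n"` would still match a request for `"tenant-a"`.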
### HTTP Endpoint Support
- Added GET support for `/rag/list` endpoint (previously POST-only)
- Added DELETE support for `/rag/delete/{document_id}` and `/rag/delete-all` endpoints
- Every endpoint still accepts MCP-style calls (POST with a JSON payload); the list and delete endpoints additionally accept direct HTTP methods (GET and DELETE with query parameters)
### Response Format
- MCP server responses are wrapped in a standard format with `status`, `data`, and `metadata` fields
- RAG client automatically unwraps responses for seamless integration
- Error responses include detailed messages for better debugging
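The unwrapping step might look roughly like the sketch below. The `status`, `data`, and `metadata` fields come from this README; the success value `"ok"`, the `message` error field, and the names `MCPError` and `unwrap_mcp_response` are assumptions for illustration and may differ from the actual RAG client.

```python
from typing import Any


class MCPError(RuntimeError):
    """Raised when a wrapped MCP response does not report success."""


def unwrap_mcp_response(response: dict[str, Any]) -> Any:
    """Return the `data` field of a wrapped response, or raise on error.

    Assumes `status == "ok"` means success and that failures carry a
    `message` field; both are guesses, not confirmed by the README.
    """
    status = response.get("status")
    if status != "ok":
        detail = response.get("message") or f"unexpected status {status!r}"
        raise MCPError(detail)
    return response.get("data")
```

Callers then work with the inner payload directly and only touch `metadata` when they need it.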
### RAG Search Enhancements
- **Lowered default threshold** from 0.5 to 0.3 for improved recall of relevant documents
- **Intelligent fallback mechanism** returns the top result even if its similarity score is below the threshold, so a search that finds any candidate never comes back empty
- **Configurable threshold** via `threshold` parameter in search requests (default: 0.3)
- **Enhanced tool selection** automatically triggers RAG for admin questions, fact lookups ("who is", "what is"), and internal knowledge queries
- **Response unwrapping** in MCP client ensures orchestrator receives properly formatted results for tool scoring and prompt building
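The threshold and fallback behaviour described above can be sketched as follows. This is an illustration, not the server's code: the function name `select_hits` and the `score` key on each hit are assumptions.

```python
from typing import Any


def select_hits(hits: list[dict[str, Any]], threshold: float = 0.3) -> list[dict[str, Any]]:
    """Keep hits scoring at or above `threshold`, best first.

    If nothing clears the threshold, fall back to the single best hit so
    that a non-empty candidate list never yields an empty result.
    Each hit is assumed to carry a numeric `score` key.
    """
    ranked = sorted(hits, key=lambda hit: hit["score"], reverse=True)
    above = [hit for hit in ranked if hit["score"] >= threshold]
    return above if above else ranked[:1]
```

Raising `threshold` trades recall for precision; the fallback only applies when every candidate scores below it.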
### UI Enhancements (app.py)
- **Knowledge Base Library Tab**:
- Statistics cards showing document counts by type
- Interactive Plotly pie chart for document type distribution
- Semantic search with relevance scoring
- Type filtering (text, PDF, FAQ, link)
- Document management with preview and deletion
- Auto-refresh after operations
- **Admin Analytics Tab**:
- Statistics cards for key metrics (queries, users, red flags, RAG searches)
- Interactive Plotly bar charts for tool usage, latency, and RAG quality
- Detailed tool usage table with performance metrics
- Formatted summary with dark theme styling
- Real-time data fetching and visualization
- **Modern UI/UX**:
- Dark theme with white text for better readability
- Custom CSS styling for cards and charts
- Improved error handling and status messages
- Responsive layout with proper component scaling
## Environment Variables (excerpt)
Defined in `env.example`:
- `RAG_MCP_URL` - Default: `http://localhost:8900/rag` (unified MCP server)
- `WEB_MCP_URL` - Default: `http://localhost:8900/web` (unified MCP server)
- `ADMIN_MCP_URL` - Default: `http://localhost:8900/admin` (unified MCP server)
- `MCP_PORT` - Port for unified MCP server (default: 8900)
- `MCP_HOST` - Host for unified MCP server (default: 0.0.0.0)
- `POSTGRESQL_URL` - PostgreSQL connection string with pgvector extension
- `OLLAMA_URL`, `OLLAMA_MODEL` (or `GROQ_API_KEY` + `LLM_BACKEND=groq`)
- `SUPABASE_URL`, `SUPABASE_SERVICE_KEY` (optional admin integrations)
- `APP_ENV`, `LOG_LEVEL`, `API_PORT`
Update these before starting the servers to ensure the agent can reach every MCP endpoint and LLM runtime.
**Note**: The unified MCP server runs on a single port (default 8900) and handles all namespaced tools. The `start.bat` script automatically configures the correct URLs.
## Unified MCP Tool Instructions
Agents that speak the Model Context Protocol should connect to the `integrachat` server id defined in `backend/mcp_server/server.py` and call the namespaced tools directly:
| Namespace | Tool | Purpose | HTTP Endpoint |
| --- | --- | --- | --- |
| `rag` | `search` | Retrieve tenant-scoped document chunks | `POST /rag/search` |
| `rag` | `ingest` | Chunk + store new knowledge | `POST /rag/ingest` |
| `rag` | `list` | List all documents for tenant | `GET /rag/list?tenant_id={id}` |
| `rag` | `delete` | Remove one/all stored documents | `DELETE /rag/delete/{id}?tenant_id={id}` or `DELETE /rag/delete-all?tenant_id={id}` |
| `web` | `search` | DuckDuckGo English-biased search | `POST /web/search` |
| `admin` | `getRules` | Fetch tenant governance rules (list or detailed) | `POST /admin/getRules` |
| `admin` | `addRule` | Insert or update a rule | `POST /admin/addRule` |
| `admin` | `deleteRule` | Remove a rule by text | `POST /admin/deleteRule` |
| `admin` | `logViolation` | Persist a red-flag event into analytics | `POST /admin/logViolation` |
**Important Notes:**
- Always send `tenant_id` in the payload (or as query parameter for GET/DELETE requests) so the shared middleware can enforce isolation and log analytics
- The MCP server automatically normalizes tenant IDs (including stray whitespace), so documents can be listed and deleted consistently across operations
- Every endpoint accepts POST with a JSON payload; `rag.list` also accepts GET, and the delete operations also accept DELETE
- RAG search uses a default threshold of 0.3 for better recall; adjust via `threshold` parameter if needed
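The table and notes above could translate into a small request builder along these lines. It is a sketch under assumptions: `build_tool_request` is a hypothetical helper, payload fields other than `tenant_id` are whatever the caller supplies, and choosing between the two delete endpoints based on whether a document ID is given is a guess at the intended behaviour.

```python
import json
import urllib.parse
import urllib.request

MCP_BASE_URL = "http://localhost:8900"  # default unified MCP server port

# (namespace, tool) -> (HTTP method, path), copied from the table above.
TOOL_ROUTES = {
    ("rag", "search"): ("POST", "/rag/search"),
    ("rag", "ingest"): ("POST", "/rag/ingest"),
    ("rag", "list"): ("GET", "/rag/list"),
    ("web", "search"): ("POST", "/web/search"),
    ("admin", "getRules"): ("POST", "/admin/getRules"),
    ("admin", "addRule"): ("POST", "/admin/addRule"),
    ("admin", "deleteRule"): ("POST", "/admin/deleteRule"),
    ("admin", "logViolation"): ("POST", "/admin/logViolation"),
}


def build_tool_request(namespace, tool, tenant_id, payload=None, document_id=None):
    """Build (without sending) the HTTP request for a namespaced tool."""
    params = {**(payload or {}), "tenant_id": tenant_id}
    if (namespace, tool) == ("rag", "delete"):
        method = "DELETE"
        if document_id is not None:
            path = "/rag/delete/" + urllib.parse.quote(str(document_id), safe="")
        else:
            path = "/rag/delete-all"
    else:
        method, path = TOOL_ROUTES[(namespace, tool)]
    if method == "POST":
        # tenant_id travels in the JSON payload.
        return urllib.request.Request(
            url=MCP_BASE_URL + path,
            data=json.dumps(params).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
    # GET/DELETE: tenant_id travels as a query parameter.
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(url=f"{MCP_BASE_URL}{path}?{query}", method=method)
```

For example, `build_tool_request("rag", "list", "tenant-a", {"limit": 10})` yields a GET request whose query string carries both `limit` and `tenant_id`.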
## Troubleshooting
### RAG Search Not Returning Results
- **Check similarity threshold**: The default threshold is 0.3. If results are still not found, try lowering it to 0.2 or 0.1
- **Verify documents are ingested**: Use `GET /rag/list?tenant_id={id}` to confirm documents exist for the tenant
- **Check tenant ID matching**: Ensure the `tenant_id` used for search matches the one used for ingestion (normalization handles whitespace automatically)
- **Review search logs**: Check MCP server logs for search metrics (hits_count, avg_score, top_score)
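When reading those log metrics, it may help to know how they relate to the raw similarity scores. The sketch below assumes `hits_count`, `avg_score`, and `top_score` are simply the count, mean, and maximum of the returned scores; that interpretation and the function name are assumptions, not taken from the server code.

```python
from statistics import mean


def summarize_scores(scores: list[float]) -> dict[str, float]:
    """Summarize similarity scores using the metric names seen in the logs.

    Assumes the metrics are count, mean, and max of the hit scores.
    """
    if not scores:
        return {"hits_count": 0, "avg_score": 0.0, "top_score": 0.0}
    return {
        "hits_count": len(scores),
        "avg_score": mean(scores),
        "top_score": max(scores),
    }
```

A `top_score` just under the threshold suggests lowering it slightly; a `hits_count` of zero points to missing documents or a tenant ID mismatch instead.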
### Agent Not Using RAG for Knowledge Base Questions
- **Verify RAG results are being found**: Check the agent debug endpoint (`POST /agent/debug`) to see if RAG results are being pre-fetched
- **Check tool scores**: The debug output shows `rag_fitness` score; if it's low (< 0.4), the agent may skip RAG
- **Ensure knowledge base content exists**: Questions like "who is the admin" require relevant content in the knowledge base
- **Pattern matching**: The tool selector automatically triggers RAG for patterns like "admin", "who is", "what is", but semantic similarity also plays a role
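To make the interaction between pattern matching and the `rag_fitness` score concrete, here is a rough sketch. The trigger patterns and the 0.4 cutoff come from the notes above; combining them with a simple "either condition" rule is an assumption, and the real tool selector likely weighs more signals.

```python
import re

# Patterns this README says trigger RAG automatically.
RAG_TRIGGER = re.compile(r"\b(admin|who is|what is)\b", re.IGNORECASE)
RAG_FITNESS_CUTOFF = 0.4  # below this, the agent may skip RAG


def should_use_rag(question: str, rag_fitness: float) -> bool:
    """Hypothetical decision rule: trigger on a pattern match or a high score."""
    return bool(RAG_TRIGGER.search(question)) or rag_fitness >= RAG_FITNESS_CUTOFF
```

If a question matches neither condition, rephrasing it or ingesting more relevant content is usually the first thing to try.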
### Document Deletion Issues
- **404 Not Found**: Verify the `document_id` exists and belongs to the correct tenant
- **Tenant ID mismatch**: The system normalizes tenant IDs, but ensure you're using the same `tenant_id` format as when documents were ingested
- **Check logs**: Database deletion logs show detailed information about tenant ID matching and document existence