| # Backend Documentation | |
| This folder contains the production-ready FastAPI stack plus the companion MCP servers that power IntegraChat. | |
| ## Directory Overview | |
| - `api/` - FastAPI application (routes, services, storage helpers, MCP clients) | |
| - `mcp_server/` - Unified MCP server exposing rag/web/admin tools via namespaces | |
| - `workers/` - Celery workers and schedulers for async ingestion + analytics maintenance | |
| ## Prerequisites | |
| - Python 3.10+ | |
| - PostgreSQL (with the `vector` extension) for RAG data, or Supabase with pgvector enabled | |
| - SQLite (auto-created in `data/`) for analytics and admin rules | |
| - Optional: Ollama running locally (default) or Groq API credentials for remote LLMs | |
| Create a virtual environment at the repo root, then: | |
| ```bash | |
| pip install -r requirements.txt | |
| cp env.example .env # update MCP URLs + LLM settings | |
| ``` | |
| ## Running the Services Locally | |
| 1. **FastAPI core** | |
| ```bash | |
| uvicorn backend.api.main:app --port 8000 --reload | |
| ``` | |
| 2. **Unified MCP server (rag/web/admin)** | |
| ```bash | |
| python backend/mcp_server/server.py | |
| ``` | |
| Or use the provided startup script: | |
| ```bash | |
| start.bat # Windows - launches MCP server on port 8900 and FastAPI on port 8000 | |
| ``` | |
| This single server (default port 8900) exposes the following namespaced tools: | |
| - `rag.search` - Semantic search across tenant documents | |
| - `rag.ingest` - Ingest text content into knowledge base | |
| - `rag.delete` - Delete individual or all documents for a tenant | |
| - `rag.list` - List all documents for a tenant with pagination | |
| - `web.search` - DuckDuckGo-based web search | |
| - `admin.getRules`, `admin.addRule`, `admin.deleteRule`, `admin.logViolation` | |
| **HTTP Endpoints** (for direct API access): | |
| - `GET /rag/list?tenant_id={id}&limit={n}&offset={n}` - List documents | |
| - `POST /rag/ingest` - Ingest content | |
| - `POST /rag/search` - Search documents (supports `threshold` parameter, default: 0.3) | |
| - `DELETE /rag/delete/{document_id}?tenant_id={id}` - Delete specific document | |
| - `DELETE /rag/delete-all?tenant_id={id}` - Delete all documents | |
| - `POST /web/search` - Web search | |
| - `POST /admin/*` - Admin operations | |
| 3. **Optional workers** (if running Celery-based ingestion/analytics jobs): | |
| ```bash | |
| celery -A backend.workers.ingestion_worker worker --loglevel=info | |
| celery -A backend.workers.analytics_worker worker --loglevel=info | |
| ``` | |
| The Gradio UI (`python app.py`) and the Next.js operator console (see `frontend/README.md`) both talk to the FastAPI layer at `http://localhost:8000`. | |
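The direct HTTP endpoints of the unified MCP server take `tenant_id` (and, for listing, `limit` and `offset`) as query parameters. The sketch below only builds the URLs and sends nothing. The helper names `list_url` and `delete_url` and the default `limit` of 20 are assumptions for illustration and are not part of the backend.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Default unified MCP server address taken from this README.
MCP_BASE = "http://localhost:8900"


def list_url(tenant_id: str, limit: int = 20, offset: int = 0) -> str:
    """Build the GET /rag/list URL with pagination query parameters.

    The default limit of 20 is an assumption, not a documented server default.
    """
    query = urlencode({"tenant_id": tenant_id, "limit": limit, "offset": offset})
    return f"{MCP_BASE}/rag/list?{query}"


def delete_url(tenant_id: str, document_id: Optional[str] = None) -> str:
    """Build the DELETE URL for one document, or for all documents when no id is given."""
    query = urlencode({"tenant_id": tenant_id})
    if document_id is None:
        return f"{MCP_BASE}/rag/delete-all?{query}"
    # Percent-encode the id so characters such as "/" cannot change the path.
    return f"{MCP_BASE}/rag/delete/{quote(str(document_id), safe='')}?{query}"


if __name__ == "__main__":
    print(list_url("tenant-a", limit=10))
    print(delete_url("tenant-a", document_id="42"))
```

Encoding the query string with `urlencode` keeps tenant IDs that contain spaces or symbols intact on the wire.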
| ## Key Endpoints | |
| All endpoints require the `x-tenant-id` header unless otherwise noted. | |
| | Service | Path | Notes | | |
| | --- | --- | --- | | |
| | Agent | `POST /agent/message` | Autonomous orchestration (RAG/Web/Admin/LLM) | | |
| | Agent Debug | `POST /agent/debug` | Full reasoning trace + tool plan | | |
| | Agent Plan | `POST /agent/plan` | Dry-run planning without executing tools | | |
| | RAG | `POST /rag/ingest-document` | Rich ingestion (text, URL, metadata) | | |
| | RAG | `POST /rag/ingest-file` | File upload (PDF/DOCX/TXT/MD) | | |
| | RAG | `GET /rag/list` | Paginated document listing per tenant (requires `x-tenant-id` header) | | |
| | RAG | `DELETE /rag/delete/{document_id}` | Delete specific document (requires `x-tenant-id` header) | | |
| | RAG | `DELETE /rag/delete-all` | Delete all documents for tenant (requires `x-tenant-id` header) | | |
| | Admin | `POST /admin/rules` | Regex + severity rule ingestion | | |
| | Analytics | `GET /analytics/overview` | Summary metrics (queries, tokens, red flags) | | |
| Refer to the root `README.md` for the complete endpoint tables. | |
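As a rough illustration of the `x-tenant-id` requirement, the sketch below prepares a request for `POST /agent/message` using only the standard library. The JSON body shape (`{"message": ...}`) and the helper name `build_agent_request` are assumptions; check the route definitions in `api/` for the actual schema.

```python
import json
import urllib.request

# FastAPI base URL from this README.
API_BASE = "http://localhost:8000"


def build_agent_request(tenant_id: str, message: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST /agent/message request.

    The payload field name "message" is an assumption for illustration.
    """
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/agent/message",
        data=body,
        method="POST",
        headers={
            "x-tenant-id": tenant_id,  # required by the FastAPI layer
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_agent_request("tenant-a", "Who is the admin?")
    print(req.get_method(), req.full_url)
    # With the API running, send it with: urllib.request.urlopen(req)
```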
| ## Diagnostics & Tenant Isolation | |
| Use the helper scripts in the repo root when validating backend changes: | |
| - `python verify_tenant_isolation.py` - Exercises analytics logging, admin rule CRUD, API reachability, and proves RAG tenant isolation by ingesting + querying as multiple tenants. | |
| - `python check_rag_database.py` - Talks directly to the pgvector database to list tenant IDs, preview stored chunks, and run safeguarded searches via `search_vectors()`. Helpful when troubleshooting suspected cross-tenant leakage. | |
| - `python test_manual.py` - Legacy manual smoke test harness (analytics store, admin rules, API surface). | |
| > **Troubleshooting tip:** If the isolation script reports a failure, first run `check_rag_database.py` to confirm documents are tagged with the correct `tenant_id`, then restart the unified MCP server so it reloads the updated SQL filtering logic. | |
| ## Recent Improvements | |
| ### Tenant ID Normalization | |
| - All database operations now normalize tenant IDs before matching, so differences in whitespace and formatting no longer cause mismatches | |
| - Documents can be listed and deleted consistently even if they were stored with slightly different `tenant_id` formatting | |
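To make the idea concrete, here is a minimal sketch of whitespace normalization. It assumes normalization means trimming surrounding whitespace and collapsing internal runs of whitespace. The backend's actual rules may differ or cover more cases, and `normalize_tenant_id` and `same_tenant` are hypothetical names.

```python
def normalize_tenant_id(raw: str) -> str:
    """Trim the ends and collapse internal whitespace runs to a single space.

    This is an assumed rule set, not the backend's confirmed behaviour.
    """
    return " ".join(raw.split())


def same_tenant(a: str, b: str) -> bool:
    """Compare two tenant IDs after normalization."""
    return normalize_tenant_id(a) == normalize_tenant_id(b)


if __name__ == "__main__":
    print(repr(normalize_tenant_id("  tenant-a \n")))
    print(same_tenant("acme  corp", " acme corp"))
```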
| ### HTTP Endpoint Support | |
| - Added GET support for `/rag/list` endpoint (previously POST-only) | |
| - Added DELETE support for `/rag/delete/{document_id}` and `/rag/delete-all` endpoints | |
| - All endpoints support the MCP protocol (POST with a JSON payload); the list and delete endpoints also accept direct HTTP methods (GET/DELETE with query parameters) | |
| ### Response Format | |
| - MCP server responses are wrapped in a standard format with `status`, `data`, and `metadata` fields | |
| - RAG client automatically unwraps responses for seamless integration | |
| - Error responses include detailed messages for better debugging | |
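The unwrapping step can be pictured roughly as follows. Only the `status`, `data`, and `metadata` field names come from this README. The `"error"` status value, the `message` key, and the names `MCPError` and `unwrap` are assumptions for illustration.

```python
from typing import Any, Dict


class MCPError(RuntimeError):
    """Raised when a wrapped MCP response reports a failure (hypothetical type)."""


def unwrap(response: Dict[str, Any]) -> Any:
    """Return the `data` payload of a wrapped response.

    Responses without a `status` field are treated as already unwrapped.
    """
    if "status" not in response:
        return response
    if response["status"] == "error":  # assumed error marker
        detail = response.get("message") or "unknown MCP error"
        raise MCPError(str(detail))
    return response.get("data")


if __name__ == "__main__":
    wrapped = {"status": "ok", "data": {"results": []}, "metadata": {}}
    print(unwrap(wrapped))
```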
| ### RAG Search Enhancements | |
| - **Lowered default threshold** from 0.5 to 0.3 for improved recall of relevant documents | |
| - **Top-result fallback** returns the highest-scoring result even when its similarity score is below the threshold, so knowledge base content stays reachable | |
| - **Configurable threshold** via `threshold` parameter in search requests (default: 0.3) | |
| - **Enhanced tool selection** automatically triggers RAG for admin questions, fact lookups ("who is", "what is"), and internal knowledge queries | |
| - **Response unwrapping** in MCP client ensures orchestrator receives properly formatted results for tool scoring and prompt building | |
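The threshold-plus-fallback behaviour described above can be sketched in a few lines. This is an illustration under assumptions, not the server's code: the hit shape (a dict with a `score` key) and the function name `filter_hits` are hypothetical.

```python
from typing import Any, Dict, List

DEFAULT_THRESHOLD = 0.3  # default from this README


def filter_hits(
    hits: List[Dict[str, Any]], threshold: float = DEFAULT_THRESHOLD
) -> List[Dict[str, Any]]:
    """Keep hits at or above the threshold; fall back to the single best hit."""
    ranked = sorted(hits, key=lambda hit: hit["score"], reverse=True)
    kept = [hit for hit in ranked if hit["score"] >= threshold]
    if not kept and ranked:
        # Fallback: nothing cleared the threshold, so return the top result only.
        kept = ranked[:1]
    return kept


if __name__ == "__main__":
    sample = [{"id": "a", "score": 0.12}, {"id": "b", "score": 0.25}]
    print(filter_hits(sample))
```

With a fallback like this, a low-scoring top hit still reaches the caller, which is why scores should be inspected before trusting a single returned result.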
| ### UI Enhancements (app.py) | |
| - **Knowledge Base Library Tab**: | |
| - Statistics cards showing document counts by type | |
| - Interactive Plotly pie chart for document type distribution | |
| - Semantic search with relevance scoring | |
| - Type filtering (text, PDF, FAQ, link) | |
| - Document management with preview and deletion | |
| - Auto-refresh after operations | |
| - **Admin Analytics Tab**: | |
| - Statistics cards for key metrics (queries, users, red flags, RAG searches) | |
| - Interactive Plotly bar charts for tool usage, latency, and RAG quality | |
| - Detailed tool usage table with performance metrics | |
| - Formatted summary with dark theme styling | |
| - Real-time data fetching and visualization | |
| - **Modern UI/UX**: | |
| - Dark theme with white text for better readability | |
| - Custom CSS styling for cards and charts | |
| - Improved error handling and status messages | |
| - Responsive layout with proper component scaling | |
| ## Environment Variables (excerpt) | |
| Defined in `env.example`: | |
| - `RAG_MCP_URL` - Default: `http://localhost:8900/rag` (unified MCP server) | |
| - `WEB_MCP_URL` - Default: `http://localhost:8900/web` (unified MCP server) | |
| - `ADMIN_MCP_URL` - Default: `http://localhost:8900/admin` (unified MCP server) | |
| - `MCP_PORT` - Port for unified MCP server (default: 8900) | |
| - `MCP_HOST` - Host for unified MCP server (default: 0.0.0.0) | |
| - `POSTGRESQL_URL` - PostgreSQL connection string with pgvector extension | |
| - `OLLAMA_URL`, `OLLAMA_MODEL` (or `GROQ_API_KEY` + `LLM_BACKEND=groq`) | |
| - `SUPABASE_URL`, `SUPABASE_SERVICE_KEY` (optional admin integrations) | |
| - `APP_ENV`, `LOG_LEVEL`, `API_PORT` | |
| Update these before starting the servers to ensure the agent can reach every MCP endpoint and LLM runtime. | |
| **Note**: The unified MCP server runs on a single port (default 8900) and handles all namespaced tools. The `start.bat` script automatically configures the correct URLs. | |
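A small loader such as the one below shows how the MCP URLs resolve to their documented defaults. Deriving the fallback URLs from `MCP_PORT` is an assumption made for this sketch, as are the names `MCPSettings` and `load_mcp_settings`; the backend may read these variables differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MCPSettings:
    rag_url: str
    web_url: str
    admin_url: str


def load_mcp_settings(env: Optional[Mapping[str, str]] = None) -> MCPSettings:
    """Read the MCP URLs from the environment, falling back to README defaults."""
    env = os.environ if env is None else env
    port = env.get("MCP_PORT", "8900")
    base = f"http://localhost:{port}"  # assumption: defaults follow MCP_PORT
    return MCPSettings(
        rag_url=env.get("RAG_MCP_URL", f"{base}/rag"),
        web_url=env.get("WEB_MCP_URL", f"{base}/web"),
        admin_url=env.get("ADMIN_MCP_URL", f"{base}/admin"),
    )


if __name__ == "__main__":
    print(load_mcp_settings())
```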
| ## Unified MCP Tool Instructions | |
| Agents that speak the Model Context Protocol should connect to the `integrachat` server id defined in `backend/mcp_server/server.py` and call the namespaced tools directly: | |
| | Namespace | Tool | Purpose | HTTP Endpoint | | |
| | --- | --- | --- | --- | | |
| | `rag` | `search` | Retrieve tenant-scoped document chunks | `POST /rag/search` | | |
| | `rag` | `ingest` | Chunk + store new knowledge | `POST /rag/ingest` | | |
| | `rag` | `list` | List all documents for tenant | `GET /rag/list?tenant_id={id}` | | |
| | `rag` | `delete` | Remove one/all stored documents | `DELETE /rag/delete/{id}?tenant_id={id}` or `DELETE /rag/delete-all?tenant_id={id}` | | |
| | `web` | `search` | DuckDuckGo English-biased search | `POST /web/search` | | |
| | `admin` | `getRules` | Fetch tenant governance rules (list or detailed) | `POST /admin/getRules` | | |
| | `admin` | `addRule` | Insert or update a rule | `POST /admin/addRule` | | |
| | `admin` | `deleteRule` | Remove a rule by text | `POST /admin/deleteRule` | | |
| | `admin` | `logViolation` | Persist a red-flag event into analytics | `POST /admin/logViolation` | | |
| **Important Notes:** | |
| - Always send `tenant_id` in the payload (or as query parameter for GET/DELETE requests) so the shared middleware can enforce isolation and log analytics | |
| - The MCP server automatically normalizes tenant IDs (including whitespace differences), so documents can be listed and deleted consistently across operations | |
| - All endpoints accept POST with a JSON payload; list and delete also accept direct HTTP methods (GET for list, DELETE for delete operations) | |
| - RAG search uses a default threshold of 0.3 for better recall; adjust via `threshold` parameter if needed | |
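For clients that call the HTTP endpoints rather than speaking MCP, the table above can be expressed as a lookup from namespaced tool name to method and path. The mapping mirrors the table; the `TOOL_ROUTES` and `resolve` names are hypothetical helpers, not part of `backend/mcp_server/server.py`.

```python
from typing import Dict, Optional, Tuple

# Namespaced tool -> (HTTP method, path), copied from the table above.
TOOL_ROUTES: Dict[str, Tuple[str, str]] = {
    "rag.search": ("POST", "/rag/search"),
    "rag.ingest": ("POST", "/rag/ingest"),
    "rag.list": ("GET", "/rag/list"),
    "web.search": ("POST", "/web/search"),
    "admin.getRules": ("POST", "/admin/getRules"),
    "admin.addRule": ("POST", "/admin/addRule"),
    "admin.deleteRule": ("POST", "/admin/deleteRule"),
    "admin.logViolation": ("POST", "/admin/logViolation"),
}


def resolve(tool: str, document_id: Optional[str] = None) -> Tuple[str, str]:
    """Return the HTTP method and path for a namespaced tool."""
    if tool == "rag.delete":
        # One tool, two endpoints: a single document or everything for the tenant.
        if document_id is None:
            return ("DELETE", "/rag/delete-all")
        return ("DELETE", f"/rag/delete/{document_id}")
    if tool not in TOOL_ROUTES:
        raise ValueError(f"unknown tool: {tool}")
    return TOOL_ROUTES[tool]


if __name__ == "__main__":
    print(resolve("rag.list"))
    print(resolve("rag.delete", document_id="42"))
```

Remember to pass `tenant_id` as a query parameter for the GET and DELETE routes and in the JSON payload for the POST routes.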
| ## Troubleshooting | |
| ### RAG Search Not Returning Results | |
| - **Check similarity threshold**: The default threshold is 0.3. If results are still not found, try lowering it to 0.2 or 0.1 | |
| - **Verify documents are ingested**: Use `GET /rag/list?tenant_id={id}` to confirm documents exist for the tenant | |
| - **Check tenant ID matching**: Ensure the tenant_id used for search matches the one used for ingestion (normalization handles whitespace automatically) | |
| - **Review search logs**: Check MCP server logs for search metrics (hits_count, avg_score, top_score) | |
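When deciding how far to lower the threshold, it can help to summarize a list of raw similarity scores offline. The sketch below computes figures with the same names as the logged metrics and counts hits at 0.3, 0.2, and 0.1. How the server itself computes or formats these metrics is not shown in this README, so treat the functions as illustrative only.

```python
from typing import Dict, Sequence


def search_metrics(scores: Sequence[float]) -> Dict[str, float]:
    """Summarize similarity scores as hits_count, avg_score, and top_score."""
    if not scores:
        return {"hits_count": 0, "avg_score": 0.0, "top_score": 0.0}
    return {
        "hits_count": len(scores),
        "avg_score": sum(scores) / len(scores),
        "top_score": max(scores),
    }


def hits_per_threshold(
    scores: Sequence[float], thresholds: Sequence[float] = (0.3, 0.2, 0.1)
) -> Dict[float, int]:
    """Count how many scores would pass each candidate threshold."""
    return {t: sum(1 for s in scores if s >= t) for t in thresholds}


if __name__ == "__main__":
    observed = [0.25, 0.15, 0.05]
    print(search_metrics(observed))
    print(hits_per_threshold(observed))
```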
| ### Agent Not Using RAG for Knowledge Base Questions | |
| - **Verify RAG results are being found**: Check the agent debug endpoint (`POST /agent/debug`) to see if RAG results are being pre-fetched | |
| - **Check tool scores**: The debug output shows `rag_fitness` score; if it's low (< 0.4), the agent may skip RAG | |
| - **Ensure knowledge base content exists**: Questions like "who is the admin" require relevant content in the knowledge base | |
| - **Pattern matching**: The tool selector automatically triggers RAG for patterns like "admin", "who is", "what is", but semantic similarity also plays a role | |
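The interplay between trigger patterns and the `rag_fitness` score might look roughly like the following. This is a simplified assumption about how the two signals combine: the real tool selector is not documented here, and `should_use_rag` is a hypothetical name. Only the example patterns and the 0.4 cutoff come from the notes above.

```python
import re

# Trigger phrases mentioned in this README; the real selector may use more.
RAG_PATTERNS = re.compile(r"\b(admin|who is|what is)\b", re.IGNORECASE)
RAG_FITNESS_CUTOFF = 0.4  # below this the agent may skip RAG


def should_use_rag(query: str, rag_fitness: float) -> bool:
    """Use RAG when a trigger phrase matches or the fitness score is high enough.

    Combining the signals with a simple OR is an assumption for illustration.
    """
    pattern_hit = RAG_PATTERNS.search(query) is not None
    return pattern_hit or rag_fitness >= RAG_FITNESS_CUTOFF


if __name__ == "__main__":
    print(should_use_rag("Who is the admin?", rag_fitness=0.1))
    print(should_use_rag("Tell me a joke", rag_fitness=0.2))
```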
| ### Document Deletion Issues | |
| - **404 Not Found**: Verify the document_id exists and belongs to the correct tenant | |
| - **Tenant ID mismatch**: The system normalizes tenant IDs, but ensure you're using the same tenant_id format as when documents were ingested | |
| - **Check logs**: Database deletion logs show detailed information about tenant ID matching and document existence | |