nothingworry committed on Commit 9155d63 · 1 Parent(s): 611e2c1

update the readme files

Files changed (2):
  1. README.md +56 -6
  2. backend/README.md +60 -2
README.md CHANGED

@@ -220,11 +220,52 @@ All calls are proxied through the FastAPI backend running at `http://localhost:8

 ### Data Storage

-- **SQLite Databases** (for demo/development):
-  - `data/admin_rules.db` - Admin rules with regex patterns and severity
-  - `data/analytics.db` - Analytics events, tool usage, violations, RAG metrics
-
-- **Production Ready**: Can easily swap SQLite for PostgreSQL/Supabase for production deployments.
+IntegraChat supports **dual-backend storage** with automatic fallback:
+
+- **Supabase (Production/Preferred)**:
+  - `admin_rules` table - Admin rules with regex patterns and severity
+  - `tool_usage_events`, `redflag_violations`, `rag_search_events`, `agent_query_events` - Analytics tables
+  - Automatically used when `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` are configured
+  - Supports Row Level Security (RLS) for multi-tenant isolation
+  - Scalable and production-ready, with automatic backups
+
+- **SQLite (Development Fallback)**:
+  - `data/admin_rules.db` - Admin rules (local)
+  - `data/analytics.db` - Analytics events (local)
+  - Used automatically when Supabase credentials are not available
+  - Well suited to local development and testing
+
+**Migration**: Use `python migrate_sqlite_to_supabase.py` to copy existing SQLite data to Supabase. See `SUPABASE_SETUP.md` for detailed setup instructions.
+
+---
+
+## Supabase Setup & Migration
+
+IntegraChat supports Supabase for production-ready storage of admin rules and analytics. Both `RulesStore` and `AnalyticsStore` automatically detect and use Supabase when credentials are available, falling back to SQLite for local development.
+
+### Quick Setup
+
+1. **Create Supabase tables**:
+   - Run `supabase_admin_rules_table.sql` in the Supabase SQL Editor
+   - Run `supabase_analytics_tables.sql` in the Supabase SQL Editor
+
+2. **Configure environment variables** in `.env`:
+   ```env
+   SUPABASE_URL=https://your-project-id.supabase.co
+   SUPABASE_SERVICE_KEY=your_service_role_key_here
+   ```
+
+3. **Migrate existing data** (optional):
+   ```bash
+   python migrate_sqlite_to_supabase.py
+   ```
+
+4. **Verify setup**:
+   ```bash
+   python verify_supabase_setup.py
+   ```
+
+See `SUPABASE_SETUP.md` and `SUPABASE_MIGRATION_COMPLETE.md` for detailed instructions and troubleshooting.

 ---

@@ -244,6 +285,15 @@ IntegraChat ships with several helper scripts to validate the full stack end-to-

 - `python check_rag_database.py`
   Provides a low-level inspection of the RAG datastore. It connects straight to the pgvector/Postgres instance, lists all tenant IDs, prints sample chunks, and runs `search_vectors()` directly to ensure the SQL `WHERE tenant_id = …` filter is behaving as expected. Use this script when diagnosing suspected cross-tenant leakage or when seeding demo data.

+- `python verify_supabase_setup.py`
+  Verifies the Supabase configuration and shows which backend (Supabase or SQLite) each store is using. Displays any missing configuration and provides a summary of where data will be saved.
+
+- `python check_supabase_rules.py`
+  Checks the Supabase admin rules configuration and RLS policies. Validates that rules can be read and written correctly.
+
+- `python migrate_sqlite_to_supabase.py`
+  One-shot migration script that copies existing SQLite data (admin rules + analytics) to Supabase. Supports both direct PostgreSQL connections and the Supabase REST API.
+
 - `python test_manual.py`
   The existing manual test runner remains useful for smoke-testing analytics logging, admin rule CRUD, and API response codes. Run it whenever you adjust schemas or update MCP endpoints.

@@ -286,8 +336,8 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
 - **UI Libraries**: Plotly for interactive charts, Gradio for web interface
 - **LLM Integration**: Ollama (local) or Groq (cloud) via configurable backend with streaming support
 - **Vector Store**: pgvector (via Supabase) or SQLite embeddings
-- **Analytics**: SQLite with indexed queries for fast analytics
-- **Rules Storage**: Supabase (production) or SQLite (development) with automatic table creation
+- **Analytics**: Supabase (production) or SQLite (development) with indexed queries for fast analytics
+- **Rules Storage**: Supabase (production) or SQLite (development) with automatic detection and fallback
 - **MCP Server**: Unified MCP server (port 8900) exposing all tools via namespaces
 - **Database**: PostgreSQL with pgvector extension for RAG embeddings, SQLite for analytics
 - **File Processing**: Support for TXT, PDF, DOC, DOCX with server-side text extraction (PyPDF2, python-docx)
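The automatic Supabase/SQLite detection this commit documents can be sketched roughly as follows. The class name `RulesStore` comes from the README, but the attributes and structure here are illustrative assumptions, not the actual IntegraChat backend code:

```python
import os
import sqlite3


class RulesStore:
    """Sketch: use Supabase when credentials exist, else fall back to SQLite."""

    def __init__(self, sqlite_path: str = "data/admin_rules.db"):
        url = os.getenv("SUPABASE_URL")
        key = os.getenv("SUPABASE_SERVICE_KEY")
        if url and key:
            # The real backend would build a Supabase client here, e.g.
            # self.client = supabase.create_client(url, key)
            self.backend = "supabase"
        else:
            # Development fallback: a local SQLite database.
            self.backend = "sqlite"
            self.conn = sqlite3.connect(sqlite_path)


store = RulesStore(sqlite_path=":memory:")
print(f"RulesStore: Using {store.backend} backend")
```

The same check-at-initialization pattern would apply to `AnalyticsStore`; the startup log lines quoted in the README report exactly this decision.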
backend/README.md CHANGED

@@ -12,7 +12,10 @@ This folder contains the production-ready FastAPI stack plus the companion MCP s

 - Python 3.10+
 - PostgreSQL (with the `vector` extension) for RAG data, or Supabase with pgvector enabled
-- Supabase (preferred) for admin rules + analytics, with automatic SQLite fallback in `data/`
+- **Supabase (recommended)** for admin rules + analytics storage, with automatic SQLite fallback in `data/`
+  - Both `RulesStore` and `AnalyticsStore` automatically detect and use Supabase when configured
+  - Falls back to SQLite automatically if Supabase credentials are missing
+  - See `SUPABASE_SETUP.md` in the root directory for setup instructions
 - Optional: Ollama running locally (default) or Groq API credentials for remote LLMs

 Create a virtual environment at the repo root, then:

@@ -88,6 +91,9 @@ Use the helper scripts in the repo root when validating backend changes:

 - `python verify_tenant_isolation.py` – Exercises analytics logging, admin rule CRUD, API reachability, and proves RAG tenant isolation by ingesting + querying as multiple tenants.
 - `python check_rag_database.py` – Talks directly to the pgvector database to list tenant IDs, preview stored chunks, and run safeguarded searches via `search_vectors()`. Helpful when troubleshooting suspected cross-tenant leakage.
+- `python verify_supabase_setup.py` – Verifies the Supabase configuration and shows which backend (Supabase or SQLite) each store is using.
+- `python check_supabase_rules.py` – Checks the Supabase admin rules configuration and RLS policies.
+- `python migrate_sqlite_to_supabase.py` – One-shot migration script to copy existing SQLite data to Supabase.
 - `python test_manual.py` – Legacy manual smoke test harness (analytics store, admin rules, API surface).

 > **Troubleshooting tip:** If the isolation script reports a failure, first run `check_rag_database.py` to confirm documents are tagged with the correct `tenant_id`, then restart the unified MCP server so it reloads the updated SQL filtering logic.

@@ -149,13 +155,58 @@ Defined in `env.example`:

 - `MCP_HOST` - Host for unified MCP server (default: 0.0.0.0)
 - `POSTGRESQL_URL` - PostgreSQL connection string with pgvector extension
 - `OLLAMA_URL`, `OLLAMA_MODEL` (or `GROQ_API_KEY` + `LLM_BACKEND=groq`)
-- `SUPABASE_URL`, `SUPABASE_SERVICE_KEY` (optional admin integrations)
+- `SUPABASE_URL`, `SUPABASE_SERVICE_KEY` - **Required for the Supabase backend** (admin rules + analytics)
+  - If not set, the system automatically falls back to SQLite in the `data/` directory
+  - See `SUPABASE_SETUP.md` in the root directory for detailed setup instructions
 - `APP_ENV`, `LOG_LEVEL`, `API_PORT`

 Update these before starting the servers to ensure the agent can reach every MCP endpoint and LLM runtime.

 **Note**: The unified MCP server runs on a single port (default 8900) and handles all namespaced tools. The `start.bat` script automatically configures the correct URLs.

+## Supabase Configuration
+
+Both `RulesStore` and `AnalyticsStore` support dual-backend storage with automatic detection:
+
+### Setup Steps
+
+1. **Create Supabase tables**:
+   - Run `supabase_admin_rules_table.sql` in the Supabase SQL Editor (from the repo root)
+   - Run `supabase_analytics_tables.sql` in the Supabase SQL Editor (from the repo root)
+
+2. **Configure environment variables** in `.env`:
+   ```env
+   SUPABASE_URL=https://your-project-id.supabase.co
+   SUPABASE_SERVICE_KEY=your_service_role_key_here
+   ```
+
+3. **Verify the configuration**:
+   ```bash
+   python verify_supabase_setup.py
+   ```
+
+4. **Migrate existing data** (if you have SQLite data):
+   ```bash
+   python migrate_sqlite_to_supabase.py
+   ```
+
+### How It Works
+
+- **Automatic Detection**: Both stores check for `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` at initialization
+- **Supabase First**: If credentials are found, Supabase is used automatically
+- **SQLite Fallback**: If Supabase is not configured, the SQLite databases in `data/` are used
+- **Startup Logging**: Check the startup logs to see which backend each store is using:
+  - `✅ RulesStore: Using Supabase backend`
+  - `✅ AnalyticsStore: Using Supabase backend`
+  - Or `⚠️ RulesStore: Using SQLite backend` if Supabase is not configured
+
+### Tables Used
+
+- **Admin Rules**: `admin_rules` table in Supabase
+- **Analytics**: `tool_usage_events`, `redflag_violations`, `rag_search_events`, `agent_query_events`
+
+See `SUPABASE_SETUP.md` and `SUPABASE_MIGRATION_COMPLETE.md` in the root directory for detailed instructions and troubleshooting.
+
 ## Unified MCP tool instructions

 Agents that speak the Model Context Protocol should connect to the `integrachat` server id defined in `backend/mcp_server/server.py` and call the namespaced tools directly:

@@ -198,3 +249,10 @@ Agents that speak the Model Context Protocol should connect to the `integrachat`

 - **Tenant ID mismatch**: The system normalizes tenant IDs, but ensure you're using the same tenant_id format as when documents were ingested
 - **Check logs**: Database deletion logs show detailed information about tenant ID matching and document existence

+### Supabase Configuration Issues
+- **Data still going to SQLite**: Check that `SUPABASE_URL` and `SUPABASE_SERVICE_KEY` are set correctly in `.env` (no quotes, no spaces)
+- **Service role key errors**: Make sure you're using the **service_role** key (not the anon key) from Supabase Dashboard → Settings → API
+- **Tables don't exist**: Run `supabase_admin_rules_table.sql` and `supabase_analytics_tables.sql` in the Supabase SQL Editor
+- **Permission errors**: Check that the RLS policies in Supabase allow service-role access
+- **Startup warnings**: Check the FastAPI startup logs to see which backend each store is using (`✅` for Supabase, `⚠️` for SQLite fallback)