nothingworry committed on
Commit
d2ac565
·
1 Parent(s): c1a153a

update the readme file

  1. backend/README.md +28 -1
backend/README.md CHANGED
@@ -49,7 +49,7 @@ cp env.example .env # update MCP URLs + LLM settings
49
  **HTTP Endpoints** (for direct API access):
50
  - `GET /rag/list?tenant_id={id}&limit={n}&offset={n}` - List documents
51
  - `POST /rag/ingest` - Ingest content
52
- - `POST /rag/search` - Search documents
53
  - `DELETE /rag/delete/{document_id}?tenant_id={id}` - Delete specific document
54
  - `DELETE /rag/delete-all?tenant_id={id}` - Delete all documents
55
  - `POST /web/search` - Web search
@@ -109,6 +109,13 @@ Use the helper scripts in the repo root when validating backend changes:
109
  - RAG client automatically unwraps responses for seamless integration
110
  - Error responses include detailed messages for better debugging
111
112
  ## Environment Variables (excerpt)
113
 
114
  Defined in `env.example`:
@@ -148,4 +155,24 @@ Agents that speak the Model Context Protocol should connect to the `integrachat`
148
  - The MCP server automatically normalizes tenant IDs to ensure consistent matching across operations
149
  - All endpoints support both POST (with JSON payload) and direct HTTP methods (GET for list, DELETE for delete operations)
150
  - Tenant ID normalization handles whitespace and ensures documents can be listed and deleted consistently
151
 
 
49
  **HTTP Endpoints** (for direct API access):
50
  - `GET /rag/list?tenant_id={id}&limit={n}&offset={n}` - List documents
51
  - `POST /rag/ingest` - Ingest content
52
+ - `POST /rag/search` - Search documents (supports `threshold` parameter, default: 0.3)
53
  - `DELETE /rag/delete/{document_id}?tenant_id={id}` - Delete specific document
54
  - `DELETE /rag/delete-all?tenant_id={id}` - Delete all documents
55
  - `POST /web/search` - Web search
 
109
  - RAG client automatically unwraps responses for seamless integration
110
  - Error responses include detailed messages for better debugging
111
 
112
+ ### RAG Search Enhancements
113
+ - **Lowered default threshold** from 0.5 to 0.3 for improved recall of relevant documents
114
+ - **Intelligent fallback mechanism** returns the top result even if its similarity score is below the threshold, ensuring knowledge base content is always accessible
115
+ - **Configurable threshold** via `threshold` parameter in search requests (default: 0.3)
116
+ - **Enhanced tool selection** automatically triggers RAG for admin questions, fact lookups ("who is", "what is"), and internal knowledge queries
117
+ - **Response unwrapping** in MCP client ensures orchestrator receives properly formatted results for tool scoring and prompt building
118
+
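The threshold and fallback behaviour described in this hunk can be sketched roughly as follows. This is an illustration only, not the repository's code: the `Hit` type, the `filter_hits` name, and the choice of `>=` for the comparison are assumptions.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical search hit; the real result shape may differ."""
    document_id: str
    score: float


def filter_hits(hits: List[Hit], threshold: float = 0.3) -> List[Hit]:
    """Keep hits at or above the threshold, else fall back to the best hit."""
    # Rank best-first so the fallback can simply take the first element.
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    passing = [h for h in ranked if h.score >= threshold]
    if passing:
        return passing
    # Fallback: nothing cleared the threshold, so return the top result only.
    # An empty input still yields an empty list.
    return ranked[:1]
```

With the default of 0.3, two hits scoring 0.10 and 0.25 would both miss the threshold, and only the 0.25 hit would come back through the fallback.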
119
  ## Environment Variables (excerpt)
120
 
121
  Defined in `env.example`:
 
155
  - The MCP server automatically normalizes tenant IDs to ensure consistent matching across operations
156
  - All endpoints support both POST (with JSON payload) and direct HTTP methods (GET for list, DELETE for delete operations)
157
  - Tenant ID normalization handles whitespace and ensures documents can be listed and deleted consistently
158
+ - RAG search uses a default threshold of 0.3 for better recall; adjust it via the `threshold` parameter if needed
159
+
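As a rough picture of what the tenant ID normalization in these notes might look like, the sketch below trims surrounding whitespace before comparing IDs. The function names are hypothetical, and the README only says that whitespace is handled, so trimming leading and trailing whitespace (and rejecting an empty ID) is an assumption about the actual behaviour.

```python
def normalize_tenant_id(raw: str) -> str:
    """Trim surrounding whitespace (assumed behaviour, see lead-in)."""
    normalized = raw.strip()
    if not normalized:
        # Rejecting empty IDs is an assumption, not documented behaviour.
        raise ValueError("tenant_id must not be empty")
    return normalized


def same_tenant(a: str, b: str) -> bool:
    """Compare two tenant IDs after normalization."""
    return normalize_tenant_id(a) == normalize_tenant_id(b)
```

Under this sketch, `" acme"` used at ingestion and `"acme "` used at search time would resolve to the same tenant.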
160
+ ## Troubleshooting
161
+
162
+ ### RAG Search Not Returning Results
163
+ - **Check similarity threshold**: The default threshold is 0.3. If results are still not found, try lowering it to 0.2 or 0.1
164
+ - **Verify documents are ingested**: Use `GET /rag/list?tenant_id={id}` to confirm documents exist for the tenant
165
+ - **Check tenant ID matching**: Ensure the tenant_id used for search matches the one used for ingestion (normalization handles whitespace automatically)
166
+ - **Review search logs**: Check MCP server logs for search metrics (hits_count, avg_score, top_score)
167
+
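To try a lower threshold as suggested in this troubleshooting list, a search request could be built along these lines. Only the `/rag/search` path and the `threshold` parameter come from the README; the base URL and the `query` and `tenant_id` field names in the JSON body are assumptions and should be checked against the actual API.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: adjust to your deployment


def build_search_request(query: str, tenant_id: str, threshold: float = 0.3) -> Request:
    """Build (but do not send) a POST /rag/search request."""
    # "query" and "tenant_id" are assumed field names; "threshold" is documented.
    payload = {"query": query, "tenant_id": tenant_id, "threshold": threshold}
    return Request(
        f"{BASE_URL}/rag/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; retrying with `threshold=0.2` and then `0.1` mirrors the advice above.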
168
+ ### Agent Not Using RAG for Knowledge Base Questions
169
+ - **Verify RAG results are being found**: Check the agent debug endpoint (`POST /agent/debug`) to see if RAG results are being pre-fetched
170
+ - **Check tool scores**: The debug output shows `rag_fitness` score; if it's low (< 0.4), the agent may skip RAG
171
+ - **Ensure knowledge base content exists**: Questions like "who is the admin" require relevant content in the knowledge base
172
+ - **Pattern matching**: The tool selector automatically triggers RAG for patterns like "admin", "who is", "what is", but semantic similarity also plays a role
173
+
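One way to picture how pattern matching and the `rag_fitness` score could interact is sketched below. The three patterns and the 0.4 cutoff come from the notes above; how the real tool selector combines them is not documented here, so treating a pattern match as an automatic trigger and otherwise comparing the score to the cutoff is purely an assumption, as are the names used.

```python
import re

# Patterns named in the README; the exact regular expression is an assumption.
RAG_TRIGGER = re.compile(r"\b(admin|who is|what is)\b", re.IGNORECASE)
RAG_FITNESS_CUTOFF = 0.4  # scores below this may cause RAG to be skipped


def should_use_rag(question: str, rag_fitness: float) -> bool:
    """Decide whether to call RAG for a question (illustrative logic only)."""
    if RAG_TRIGGER.search(question):
        # A trigger phrase forces RAG regardless of the score.
        return True
    # Otherwise rely on the semantic fitness score.
    return rag_fitness >= RAG_FITNESS_CUTOFF
```

If the debug output shows a `rag_fitness` well below 0.4 for a question that contains none of the trigger phrases, this kind of logic would explain why RAG was skipped.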
174
+ ### Document Deletion Issues
175
+ - **404 Not Found**: Verify the document_id exists and belongs to the correct tenant
176
+ - **Tenant ID mismatch**: The system normalizes tenant IDs, but ensure you're using the same tenant_id format as when documents were ingested
177
+ - **Check logs**: Database deletion logs show detailed information about tenant ID matching and document existence
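When chasing a 404 on deletion, it can help to build the delete URL explicitly and confirm that the document ID and tenant ID are what you expect. The sketch below follows the documented `DELETE /rag/delete/{document_id}?tenant_id={id}` route; the helper name is hypothetical, and trimming the tenant ID on the client side is an assumption that mirrors the server-side normalization.

```python
from urllib.parse import quote, urlencode


def build_delete_url(base_url: str, document_id, tenant_id: str) -> str:
    """Build the URL for DELETE /rag/delete/{document_id}?tenant_id={id}."""
    # Percent-encode the ID so unusual characters cannot alter the path.
    path = f"/rag/delete/{quote(str(document_id), safe='')}"
    # Trim whitespace client-side too (assumption; the server also normalizes).
    query = urlencode({"tenant_id": tenant_id.strip()})
    return f"{base_url.rstrip('/')}{path}?{query}"
```

Comparing the resulting URL's `tenant_id` with the one shown by `GET /rag/list` is a quick way to spot a mismatch before digging into the logs.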
178