Commit d2ac565 · 1 Parent(s): c1a153a
update the readme file
backend/README.md CHANGED (+28 -1)
|
@@ -49,7 +49,7 @@ cp env.example .env # update MCP URLs + LLM settings
|
|
| 49 |
**HTTP Endpoints** (for direct API access):
|
| 50 |
- `GET /rag/list?tenant_id={id}&limit={n}&offset={n}` - List documents
|
| 51 |
- `POST /rag/ingest` - Ingest content
|
| 52 |
-
- `POST /rag/search` - Search documents
|
| 53 |
- `DELETE /rag/delete/{document_id}?tenant_id={id}` - Delete specific document
|
| 54 |
- `DELETE /rag/delete-all?tenant_id={id}` - Delete all documents
|
| 55 |
- `POST /web/search` - Web search
|
|
@@ -109,6 +109,13 @@ Use the helper scripts in the repo root when validating backend changes:
|
|
| 109 |
- RAG client automatically unwraps responses for seamless integration
|
| 110 |
- Error responses include detailed messages for better debugging
|
| 111 |
|
| 112 |
## Environment Variables (excerpt)
|
| 113 |
|
| 114 |
Defined in `env.example`:
|
|
@@ -148,4 +155,24 @@ Agents that speak the Model Context Protocol should connect to the `integrachat`
|
|
| 148 |
- The MCP server automatically normalizes tenant IDs to ensure consistent matching across operations
|
| 149 |
- All endpoints support both POST (with JSON payload) and direct HTTP methods (GET for list, DELETE for delete operations)
|
| 150 |
- Tenant ID normalization handles whitespace and ensures documents can be listed and deleted consistently
|
| 151 |
|
|
|
|
| 49 |
**HTTP Endpoints** (for direct API access):
|
| 50 |
- `GET /rag/list?tenant_id={id}&limit={n}&offset={n}` - List documents
|
| 51 |
- `POST /rag/ingest` - Ingest content
|
| 52 |
+
- `POST /rag/search` - Search documents (supports `threshold` parameter, default: 0.3)
|
| 53 |
- `DELETE /rag/delete/{document_id}?tenant_id={id}` - Delete specific document
|
| 54 |
- `DELETE /rag/delete-all?tenant_id={id}` - Delete all documents
|
| 55 |
- `POST /web/search` - Web search
|
|
|
|
| 109 |
- RAG client automatically unwraps responses for seamless integration
|
| 110 |
- Error responses include detailed messages for better debugging
|
| 111 |
|
| 112 |
+
### RAG Search Enhancements
|
| 113 |
+
- **Lowered default threshold** from 0.5 to 0.3 for improved recall of relevant documents
|
| 114 |
+
- **Intelligent fallback mechanism** returns the top result even if similarity score is below threshold, ensuring knowledge base content is always accessible
|
| 115 |
+
- **Configurable threshold** via `threshold` parameter in search requests (default: 0.3)
|
| 116 |
+
- **Enhanced tool selection** automatically triggers RAG for admin questions, fact lookups ("who is", "what is"), and internal knowledge queries
|
| 117 |
+
- **Response unwrapping** in MCP client ensures orchestrator receives properly formatted results for tool scoring and prompt building
|
| 118 |
+
|
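The threshold and fallback behaviour listed under "RAG Search Enhancements" can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `filter_hits`, the `(document_id, score)` tuple shape, and the assumptions that a higher score means a closer match and that the comparison is "at or above" are all hypothetical.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # (document id, similarity score); hypothetical shape


def filter_hits(hits: List[Hit], threshold: float = 0.3) -> List[Hit]:
    """Keep hits at or above the threshold; otherwise fall back to the best hit."""
    # Rank from most to least similar (assumes higher score = more similar).
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    kept = [h for h in ranked if h[1] >= threshold]
    if kept:
        return kept
    # Fallback: nothing cleared the threshold, so return the top result, if any.
    return ranked[:1]
```

With the default of 0.3, a result set whose best score is 0.2 would still yield that single best hit, which matches the fallback described in the list.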
| 119 |
## Environment Variables (excerpt)
|
| 120 |
|
| 121 |
Defined in `env.example`:
|
|
|
|
| 155 |
- The MCP server automatically normalizes tenant IDs to ensure consistent matching across operations
|
| 156 |
- All endpoints support both POST (with JSON payload) and direct HTTP methods (GET for list, DELETE for delete operations)
|
| 157 |
- Tenant ID normalization handles whitespace and ensures documents can be listed and deleted consistently
|
| 158 |
+
- RAG search uses a default threshold of 0.3 for better recall; adjust via `threshold` parameter if needed
|
| 159 |
+
|
| 160 |
+
## Troubleshooting
|
| 161 |
+
|
| 162 |
+
### RAG Search Not Returning Results
|
| 163 |
+
- **Check similarity threshold**: The default threshold is 0.3. If results are still not found, try lowering it to 0.2 or 0.1
|
| 164 |
+
- **Verify documents are ingested**: Use `GET /rag/list?tenant_id={id}` to confirm documents exist for the tenant
|
| 165 |
+
- **Check tenant ID matching**: Ensure the tenant_id used for search matches the one used for ingestion (normalization handles whitespace automatically)
|
| 166 |
+
- **Review search logs**: Check MCP server logs for search metrics (hits_count, avg_score, top_score)
|
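As a sketch of the first troubleshooting step, the snippet below builds a `POST /rag/search` request with a lowered threshold. Only the endpoint path and the `threshold` parameter come from the README; the base URL and the `tenant_id` and `query` field names are assumptions and may differ from the real request schema.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # hypothetical; use your backend's address


def build_search_request(tenant_id: str, query: str, threshold: float = 0.3) -> Request:
    """Build (but do not send) a POST /rag/search request."""
    # "tenant_id" and "query" are assumed field names; "threshold" is documented.
    body = json.dumps({"tenant_id": tenant_id, "query": query, "threshold": threshold})
    return Request(
        f"{BASE_URL}/rag/search",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(build_search_request("acme", "who is the admin", threshold=0.2))` once the assumed names are checked against the actual API.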
| 167 |
+
|
| 168 |
+
### Agent Not Using RAG for Knowledge Base Questions
|
| 169 |
+
- **Verify RAG results are being found**: Check the agent debug endpoint (`POST /agent/debug`) to see if RAG results are being pre-fetched
|
| 170 |
+
- **Check tool scores**: The debug output shows `rag_fitness` score; if it's low (< 0.4), the agent may skip RAG
|
| 171 |
+
- **Ensure knowledge base content exists**: Questions like "who is the admin" require relevant content in the knowledge base
|
| 172 |
+
- **Pattern matching**: The tool selector automatically triggers RAG for patterns like "admin", "who is", "what is", but semantic similarity also plays a role
|
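One way to picture how pattern matching and the `rag_fitness` score might combine is sketched below. The trigger phrases and the 0.4 cutoff come from the bullets above; the function name, the regular expression, and the rule that a pattern match alone is enough to trigger RAG are assumptions, since the actual tool selector also weighs semantic similarity.

```python
import re

# Hypothetical trigger list built from the patterns named in the README.
RAG_TRIGGERS = re.compile(r"\b(admin|who is|what is)\b", re.IGNORECASE)


def should_use_rag(question: str, rag_fitness: float, min_fitness: float = 0.4) -> bool:
    """Return True if a trigger phrase matches or the fitness score is high enough."""
    if RAG_TRIGGERS.search(question):
        return True
    # No trigger phrase matched: fall back to the score-based decision.
    return rag_fitness >= min_fitness
```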
| 173 |
+
|
| 174 |
+
### Document Deletion Issues
|
| 175 |
+
- **404 Not Found**: Verify the document_id exists and belongs to the correct tenant
|
| 176 |
+
- **Tenant ID mismatch**: The system normalizes tenant IDs, but ensure you're using the same tenant_id format as when documents were ingested
|
| 177 |
+
- **Check logs**: Database deletion logs show detailed information about tenant ID matching and document existence
|
| 178 |
|
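The tenant ID normalization mentioned in the notes and troubleshooting items could look something like the sketch below. The README only states that normalization "handles whitespace"; the function name, the empty-ID check, and the choice to leave letter case untouched are assumptions, not the MCP server's confirmed behaviour.

```python
def normalize_tenant_id(raw: str) -> str:
    """Strip surrounding whitespace and reject IDs that end up empty."""
    tenant_id = raw.strip()
    if not tenant_id:
        # Hypothetical guard: an all-whitespace ID cannot match any tenant.
        raise ValueError("tenant_id must not be empty")
    return tenant_id
```

Applying the same function at ingest, list, search, and delete time is what would keep `" acme "` and `"acme"` pointing at the same documents.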