# Implementation Plan: Intelligent Graph-Based SQL Federation Middleware (Revised)

This document outlines the revised strategy to implement the target features by integrating valuable assets from the `semantic-query-router` codebase into our existing architecture.

### Overall Strategy

The core task is to evolve the current single-step agent into a multi-step, GraphRAG-powered orchestrator using LangChain. We will enhance the MCP server with advanced core logic, replace PostgreSQL with a rich life sciences SQLite dataset, and transform the Streamlit monitor into a fully conversational chat UI. The `frontend/` Next.js application will be deprecated.

---
### Phase 1: Integrate New Dataset & Core Logic (Due by Friday, Oct 3rd)

**Goal**: Replace the existing data foundation with the life sciences dataset and upgrade the MCP server with advanced, reusable logic from the `semantic-query-router` project.

- **Task 1.1: Adopt Life Sciences Dataset**
  - Integrate the `generate_sample_databases.py` script into our `ops/scripts/` directory.
  - Create a new `make seed-db` command in the `Makefile` to generate the `clinical_trials.db`, `laboratory.db`, and `drug_discovery.db` SQLite files.
  - Update `docker-compose.yml` to remove the PostgreSQL service and mount the new `data/` directory for the SQLite databases.
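The compose change for Task 1.1 might look like the following sketch. The service names, image tag, and paths are assumptions, not taken from the real `docker-compose.yml`:

```yaml
# docker-compose.yml (illustrative excerpt) -- the postgres service is
# removed entirely; the seeded SQLite files are mounted read-only.
services:
  mcp:
    build: ./mcp
    volumes:
      - ./data:/app/data:ro   # clinical_trials.db, laboratory.db, drug_discovery.db
  neo4j:
    image: neo4j:5
    environment:
      - NEO4J_AUTH=neo4j/password   # placeholder credentials
```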
- **Task 1.2: Enhance MCP Server with Core Logic**
  - Create a new `mcp/core/` directory.
  - Migrate the advanced logic from `semantic-query-router/src/core/` (`discovery.py`, `graph.py`, `intelligence.py`) into our `mcp/core/` directory.
  - Refactor these modules to fit our project structure and coding standards.
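To illustrate the kind of discovery logic `mcp/core/` is expected to host, here is a minimal sketch of a SQLite schema walk. The function name and return shape are assumptions; the migrated `discovery.py` will differ:

```python
import sqlite3


def discover_schema(db_path: str) -> dict[str, list[str]]:
    """Return a mapping of table name -> column names for one SQLite file.

    Hypothetical helper sketching the discovery step; the real
    mcp/core/discovery.py module may expose a richer interface.
    """
    with sqlite3.connect(db_path) as conn:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        ]
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt, pk).
        return {
            table: [col[1] for col in conn.execute(f"PRAGMA table_info({table})")]
            for table in tables
        }
```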
- **Task 1.3: Create a Dedicated Ingestion Process**
  - Create a new script, `ops/scripts/ingest.py`, that uses the new core logic to perform a one-time ingestion of the SQLite database schemas into Neo4j.
  - Create a `make ingest` command in the `Makefile` to run this script. This separates schema ingestion from the agent's runtime duties, making the system more modular.
  - Remove the schema discovery logic from `agent/main.py`.
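A sketch of what `ops/scripts/ingest.py` could do. The node labels (`Database`, `Table`, `Column`) and relationship names are assumptions, not the graph model `graph.py` actually defines:

```python
def table_to_cypher(db_name: str, table: str, columns: list[str]) -> str:
    """Build a Cypher MERGE statement registering one table and its columns.

    Illustrative only -- labels and relationship types are placeholders.
    """
    col_clauses = "\n".join(
        f"MERGE (t)-[:HAS_COLUMN]->(:Column {{name: '{c}'}})" for c in columns
    )
    return (
        f"MERGE (d:Database {{name: '{db_name}'}})\n"
        f"MERGE (d)-[:HAS_TABLE]->(t:Table {{name: '{table}'}})\n"
        f"{col_clauses}"
    )


def ingest(uri: str, auth: tuple[str, str], statements: list[str]) -> None:
    """Push the generated statements to Neo4j (requires the neo4j driver)."""
    # Deferred import so the Cypher builder above stays testable offline.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=auth) as driver:
        with driver.session() as session:
            for stmt in statements:
                session.run(stmt)
```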
---
### Phase 2: Rebuild Agent with LangChain (Due by Tuesday, Oct 7th)

**Goal**: Re-architect the agent from a simple script into a robust LangChain-powered orchestrator that leverages the enhanced MCP server.

- **Task 2.1: Refactor Agent to Use LangChain**
  - Overhaul `agent/main.py` to implement the `AgentExecutor` pattern from `langchain_integration.py`.
  - Define a formal agent prompt that instructs the LLM on how to use the available tools to answer questions.
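One possible shape for the agent prompt. The tool names mirror Task 2.2, and the `{input}` / `{agent_scratchpad}` placeholders follow LangChain's ReAct-style prompt convention; the exact wording is a starting point, not the final prompt:

```python
# Draft agent prompt -- refine against real questions during Phase 2.
AGENT_PROMPT = """You are a SQL federation assistant for life sciences data.
You have three tools: schema_search, join_path_finder, and query_executor.

For every question:
1. Call schema_search to find the relevant tables and columns.
2. If the answer spans databases, call join_path_finder to get a join path.
3. Call query_executor with the final SQL, then summarise the result.

Never guess table or column names; always search the schema first.

Question: {input}
{agent_scratchpad}"""
```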
- **Task 2.2: Implement Custom LangChain Tools**
  - Create a new `agent/tools.py` file.
  - Implement custom LangChain tools that make authenticated REST API calls to our enhanced MCP server.
  - The tools will include `SchemaSearchTool`, `JoinPathFinderTool`, and `QueryExecutorTool`. These act as thin clients to the core logic integrated into the MCP server in Phase 1.
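A plain-Python stand-in for one of these tools, sketched with only the standard library. The base URL, endpoint path, and payload shape are assumptions about the MCP REST API; in `agent/tools.py` this would be wrapped as a proper LangChain tool (e.g. via `StructuredTool.from_function`):

```python
import json
import urllib.request

MCP_BASE_URL = "http://localhost:8000"  # assumed MCP server address


def _mcp_post(path: str, payload: dict, token: str) -> dict:
    """POST a JSON payload to the MCP REST API with a bearer token."""
    req = urllib.request.Request(
        f"{MCP_BASE_URL}{path}",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


class SchemaSearchTool:
    """Plain-Python stand-in for the LangChain tool of the same name."""

    name = "schema_search"
    description = "Search the schema graph for tables and columns matching a phrase."

    def __init__(self, token: str):
        self.token = token

    def run(self, query: str) -> dict:
        # The endpoint path is an assumption; align with the MCP routes.
        return _mcp_post("/schema/search", {"query": query}, self.token)
```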
- **Task 2.3: Update Agent's Main Loop**
  - Modify the agent's main loop to delegate tasks to the LangChain `AgentExecutor` instead of handling instructions directly. The agent's primary role will now be to orchestrate the LangChain agent and log the results.
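The delegation pattern for the main loop, sketched against a minimal interface. It assumes `executor` exposes LangChain's `AgentExecutor.invoke({"input": ...})` API and that `next_instruction` is a callable yielding the next question or `None` when the queue is empty:

```python
import logging

logger = logging.getLogger("agent")


def run_loop(executor, next_instruction) -> list[str]:
    """Delegate each instruction to the executor and log its output.

    Returns the collected outputs so callers (and tests) can inspect them.
    """
    outputs = []
    while (instruction := next_instruction()) is not None:
        result = executor.invoke({"input": instruction})
        logger.info("instruction=%r output=%r", instruction, result.get("output"))
        outputs.append(result.get("output", ""))
    return outputs
```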
---
### Phase 3: Build the Chat UI & Finalize (Due by Thursday, Oct 9th)

**Goal**: Replace the basic Streamlit monitor with a full-featured conversational chat interface and complete the final integration for the demo.

- **Task 3.1: Implement Conversational Chat UI**
  - Replace the entire contents of `streamlit/app.py` with the conversational UI logic from `semantic-query-router/src/chat_app.py`.
  - Adapt the UI to work with our project's MCP REST API (instead of WebSocket) for submitting questions and fetching results.
- **Task 3.2: Integrate Demo-Specific Features**
  - Ensure the new Streamlit UI includes the required demo features:
    - Display of execution phases (e.g., "Searching Schema," "Finding Join Path," "Executing Query").
    - A final results view that shows both the natural language summary from the agent and a clean data table (a Pandas DataFrame) of the raw results.
    - A "Download CSV" button for the results table.
    - A sidebar that displays the connection status of the Neo4j and SQLite databases.
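Two of these features reduce to small pure helpers that the Streamlit layer just renders. The field names in the status payload (`phase`, `status`) are assumptions about the MCP REST API:

```python
import csv
import io

# Assumed phase identifiers in the MCP status payload.
PHASE_LABELS = {
    "schema_search": "Searching Schema",
    "join_path": "Finding Join Path",
    "execute": "Executing Query",
}


def parse_status(response: dict) -> tuple[str, bool]:
    """Map an MCP status payload to (display_phase, is_done)."""
    phase = PHASE_LABELS.get(response.get("phase", ""), "Working...")
    return phase, response.get("status") == "complete"


def rows_to_csv(columns: list[str], rows: list[tuple]) -> str:
    """Render query results as CSV text for the download button,
    e.g. st.download_button("Download CSV", rows_to_csv(cols, rows), "results.csv").
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()
```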
- **Task 3.3: Final Integration and Testing**
  - Perform end-to-end testing of the full workflow, from asking a question in the Streamlit app through the agent's orchestration to the final result display.
  - Clean up any unused files and finalize the `README.md` with updated instructions.