# Implementation Plan: Intelligent Graph-Based SQL Federation Middleware (Revised)
This document outlines the revised strategy to implement the target features by integrating valuable assets from the `semantic-query-router` codebase into our existing architecture.
### Overall Strategy
The core task is to evolve the current single-step agent into a multi-step, GraphRAG-powered orchestrator using LangChain. We will enhance the MCP server with advanced core logic, replace PostgreSQL with a rich life sciences SQLite dataset, and transform the Streamlit monitor into a fully conversational chat UI. The `frontend/` Next.js application will be deprecated.
---
### Phase 1: Integrate New Dataset & Core Logic (Due by Friday, Oct 3rd)
**Goal**: Replace the existing data foundation with the life sciences dataset and upgrade the MCP server with advanced, reusable logic from the `semantic-query-router` project.
- **Task 1.1: Adopt Life Sciences Dataset**
- Integrate the `generate_sample_databases.py` script into our `ops/scripts/` directory.
- Create a new `make seed-db` command in the `Makefile` to generate the `clinical_trials.db`, `laboratory.db`, and `drug_discovery.db` SQLite files.
- Update `docker-compose.yml` to remove the PostgreSQL service and mount the new `data/` directory for the SQLite databases.
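The seeding step above can be sketched as follows. This is a minimal illustration, not the actual `generate_sample_databases.py`: the table definition and sample rows here are hypothetical stand-ins for the richer life sciences schemas that script produces.

```python
import sqlite3
from pathlib import Path

# Hypothetical minimal schema for one of the three databases; the real
# generate_sample_databases.py defines richer life-sciences tables.
CLINICAL_TRIALS_DDL = """
CREATE TABLE IF NOT EXISTS trials (
    trial_id   TEXT PRIMARY KEY,
    title      TEXT NOT NULL,
    phase      TEXT,
    status     TEXT
);
"""

SAMPLE_ROWS = [
    ("NCT001", "Aspirin vs Placebo", "Phase III", "Completed"),
    ("NCT002", "Novel Kinase Inhibitor", "Phase I", "Recruiting"),
]

def seed_database(db_path: str) -> int:
    """Create the SQLite file, apply the DDL, insert sample rows, return row count."""
    Path(db_path).parent.mkdir(parents=True, exist_ok=True)
    with sqlite3.connect(db_path) as conn:
        conn.executescript(CLINICAL_TRIALS_DDL)
        conn.executemany(
            "INSERT OR REPLACE INTO trials VALUES (?, ?, ?, ?)", SAMPLE_ROWS
        )
        conn.commit()
        return conn.execute("SELECT COUNT(*) FROM trials").fetchone()[0]
```

The `make seed-db` target would simply invoke this script once per database file, writing into the `data/` directory that `docker-compose.yml` mounts.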
- **Task 1.2: Enhance MCP Server with Core Logic**
- Create a new `mcp/core/` directory.
- Migrate the advanced logic from `semantic-query-router/src/core/` (`discovery.py`, `graph.py`, `intelligence.py`) into our `mcp/core/` directory.
- Refactor these modules to fit our project structure and standards.
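As one illustration of what the migrated `discovery.py` logic involves: SQLite exposes its catalog through `sqlite_master` and `PRAGMA table_info`, so schema discovery reduces to a small introspection routine. This is a hedged sketch of the idea, not the actual `semantic-query-router` implementation.

```python
import sqlite3

def discover_schema(conn: sqlite3.Connection) -> dict:
    """Return {table: [(column, type), ...]} using SQLite's own catalog."""
    tables = [
        row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' "
            "AND name NOT LIKE 'sqlite_%'"
        )
    ]
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
    return {
        t: [(col[1], col[2]) for col in conn.execute(f"PRAGMA table_info({t})")]
        for t in tables
    }
```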
- **Task 1.3: Create a Dedicated Ingestion Process**
- Create a new script, `ops/scripts/ingest.py`, that uses the new core logic to perform a one-time ingestion of the SQLite database schemas into Neo4j.
- Create a `make ingest` command in the `Makefile` to run this script. This separates the schema ingestion process from the agent's runtime duties, making the system more modular.
- Remove the schema discovery logic from `agent/main.py`.
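The core of `ops/scripts/ingest.py` is translating a discovered schema into graph writes. A minimal sketch, assuming a `Table`/`Column` node model with `HAS_COLUMN` relationships (the actual model lives in the migrated `graph.py`); the generated Cypher would be executed against Neo4j via the official driver's `session.run`.

```python
def schema_to_cypher(schema: dict) -> list[str]:
    """Translate {table: [(column, type), ...]} into idempotent Cypher MERGEs.

    The node labels and relationship type here are assumptions about the
    graph model; adjust to match the migrated mcp/core/graph.py.
    """
    stmts = []
    for table, columns in schema.items():
        stmts.append(f"MERGE (t:Table {{name: '{table}'}})")
        for name, ctype in columns:
            stmts.append(
                f"MATCH (t:Table {{name: '{table}'}}) "
                f"MERGE (c:Column {{name: '{name}', table: '{table}', type: '{ctype}'}}) "
                f"MERGE (t)-[:HAS_COLUMN]->(c)"
            )
    return stmts
```

Because every statement is a `MERGE`, rerunning `make ingest` after a schema change is safe: existing nodes are matched rather than duplicated.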
---
### Phase 2: Rebuild Agent with LangChain (Due by Tuesday, Oct 7th)
**Goal**: Re-architect the agent from a simple script into a robust LangChain-powered orchestrator that leverages the enhanced MCP server.
- **Task 2.1: Refactor Agent to use LangChain**
- Overhaul `agent/main.py` to implement the `AgentExecutor` pattern from `langchain_integration.py`.
- Define a formal agent prompt that instructs the LLM on how to use the available tools to answer questions.
- **Task 2.2: Implement Custom LangChain Tools**
- Create a new `agent/tools.py` file.
- Implement custom LangChain tools that make authenticated REST API calls to our enhanced MCP server.
  - The tools will include `SchemaSearchTool`, `JoinPathFinderTool`, and `QueryExecutorTool`, which act as thin clients to the core logic we integrated into the MCP server in Phase 1.
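A sketch of what these client tools share: each one maps to an MCP REST endpoint and builds an authenticated request. The base URL and endpoint paths below are assumptions for illustration; they must be matched to the routes the enhanced MCP server actually exposes, and the LangChain `_run` method would perform the POST and return the JSON body.

```python
from dataclasses import dataclass

MCP_BASE_URL = "http://localhost:8000"  # assumed; align with the MCP server config

@dataclass
class MCPTool:
    """Base for tools that call the MCP server's REST API.

    The endpoint paths are hypothetical; adjust to the actual mcp/core routes.
    """
    name: str
    endpoint: str

    def build_request(self, query: str) -> tuple[str, dict]:
        # (url, json_payload) pair; the LangChain tool wrapper would POST
        # this with an Authorization header attached.
        return f"{MCP_BASE_URL}{self.endpoint}", {"query": query}

schema_search = MCPTool("schema_search", "/schema/search")
join_path_finder = MCPTool("join_path_finder", "/schema/join-path")
query_executor = MCPTool("query_executor", "/query/execute")
```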
- **Task 2.3: Update Agent's Main Loop**
- Modify the agent's main loop to delegate tasks to the LangChain `AgentExecutor` instead of handling instructions directly. The agent's primary role will now be to orchestrate the LangChain agent and log the results.
---
### Phase 3: Build the Chat UI & Finalize (Due by Thursday, Oct 9th)
**Goal**: Replace the basic Streamlit monitor with a full-featured conversational chat interface and complete the final integration for the demo.
- **Task 3.1: Implement Conversational Chat UI**
- Replace the entire contents of `streamlit/app.py` with the conversational UI logic from `semantic-query-router/src/chat_app.py`.
- Adapt the UI to work with our project's MCP REST API (instead of WebSocket) for submitting questions and fetching results.
- **Task 3.2: Integrate Demo-Specific Features**
- Ensure the new Streamlit UI includes the required demo features:
- Display of execution phases (e.g., "Searching Schema," "Finding Join Path," "Executing Query").
- A final results view that shows both the natural language summary from the agent and a clean data table (Pandas DataFrame) of the raw results.
- A "Download CSV" button for the results table.
- A sidebar that displays the connection status of the Neo4j and SQLite databases.
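The "Download CSV" feature reduces to serializing the result rows and handing the bytes to Streamlit's `st.download_button`. A small stdlib-only helper, sketched under the assumption that the agent returns results as a list of dicts:

```python
import csv
import io

def rows_to_csv_bytes(rows: list[dict]) -> bytes:
    """Serialize result rows to CSV bytes for st.download_button's `data` arg."""
    if not rows:
        return b""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")
```

In `streamlit/app.py` this would be wired up as `st.download_button("Download CSV", data=rows_to_csv_bytes(rows), file_name="results.csv", mime="text/csv")`.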
- **Task 3.3: Final Integration and Testing**
- Perform end-to-end testing of the full workflow: from asking a question in the Streamlit app to the agent's orchestration and the final result display.
- Clean up any unused files and finalize the `README.md` with updated instructions.