Implementation Plan: Intelligent Graph-Based SQL Federation Middleware (Revised)
This document outlines the revised strategy for implementing the target features by integrating valuable assets from the `semantic-query-router` codebase into our existing architecture.
Overall Strategy
The core task is to evolve the current single-step agent into a multi-step, GraphRAG-powered orchestrator using LangChain. We will enhance the MCP server with advanced core logic, replace PostgreSQL with a rich life sciences SQLite dataset, and transform the Streamlit monitor into a fully conversational chat UI. The `frontend/` Next.js application will be deprecated.
Phase 1: Integrate New Dataset & Core Logic (Due by Friday, Oct 3rd)
Goal: Replace the existing data foundation with the life sciences dataset and upgrade the MCP server with advanced, reusable logic from the semantic-query-router project.
Task 1.1: Adopt Life Sciences Dataset
- Integrate the `generate_sample_databases.py` script into our `ops/scripts/` directory.
- Create a new `make seed-db` command in the `Makefile` to generate the `clinical_trials.db`, `laboratory.db`, and `drug_discovery.db` SQLite files.
- Update `docker-compose.yml` to remove the PostgreSQL service and mount the new `data/` directory for the SQLite databases.
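As a rough illustration of what the seeding step produces, here is a minimal sketch of creating one of the SQLite files with the standard library. The table layout shown is a placeholder assumption; `generate_sample_databases.py` defines the authoritative schemas.

```python
import sqlite3
from pathlib import Path

def seed_database(db_path: str, ddl: str) -> None:
    """Create an SQLite file (and its parent directory) and apply the DDL."""
    Path(db_path).parent.mkdir(parents=True, exist_ok=True)
    with sqlite3.connect(db_path) as conn:
        conn.executescript(ddl)

# Hypothetical schema fragment for clinical_trials.db; the real script
# generates three databases with richer life sciences tables.
TRIALS_DDL = """
CREATE TABLE IF NOT EXISTS trials (
    trial_id   INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    phase      TEXT,
    start_date TEXT
);
"""

seed_database("data/clinical_trials.db", TRIALS_DDL)
```

The `make seed-db` target would simply invoke the real script so the same files are reproducible on every machine.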
Task 1.2: Enhance MCP Server with Core Logic
- Create a new `mcp/core/` directory.
- Migrate the advanced logic from `semantic-query-router/src/core/` (`discovery.py`, `graph.py`, `intelligence.py`) into our `mcp/core/` directory.
- Refactor these modules to fit our project structure and standards.
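To make concrete the kind of logic `graph.py` contributes, here is a minimal sketch of join-path finding as a breadth-first search over foreign-key edges. The adjacency map below is a made-up example; the migrated module derives it from the ingested schema graph in Neo4j.

```python
from collections import deque

# Hypothetical adjacency map of foreign-key relationships between tables;
# the migrated graph.py builds the real one from the Neo4j schema graph.
SCHEMA_EDGES = {
    "trials":   ["sites", "subjects"],
    "subjects": ["trials", "samples"],
    "samples":  ["subjects", "assays"],
    "assays":   ["samples"],
    "sites":    ["trials"],
}

def find_join_path(start: str, goal: str) -> list[str]:
    """Breadth-first search for the shortest join path between two tables."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in SCHEMA_EDGES.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return []

print(find_join_path("trials", "assays"))
```

BFS guarantees the path with the fewest joins, which keeps the generated SQL as simple as possible.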
Task 1.3: Create a Dedicated Ingestion Process
- Create a new script, `ops/scripts/ingest.py`, that uses the new core logic to perform a one-time ingestion of the SQLite database schemas into Neo4j.
- Create a `make ingest` command in the `Makefile` to run this script. This separates schema ingestion from the agent's runtime duties, making the system more modular.
- Remove the schema discovery logic from `agent/main.py`.
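A minimal sketch of the two halves of `ingest.py`: reading table and column names out of an SQLite file, and writing them into Neo4j. The node labels and Cypher below are assumptions about the target graph model; the `session` argument is assumed to be an open `neo4j` driver session.

```python
import sqlite3

def extract_schema(db_path: str) -> dict[str, list[str]]:
    """Read table and column names from an SQLite file via PRAGMA."""
    with sqlite3.connect(db_path) as conn:
        tables = [r[0] for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        return {
            t: [row[1] for row in conn.execute(f"PRAGMA table_info({t})")]
            for t in tables
        }

def ingest_into_neo4j(session, schema: dict[str, list[str]]) -> None:
    """Write one :Table node per table and one :Column node per column.

    Assumes `session` is an open neo4j-driver session; labels and
    relationship names are illustrative, not the project's final model.
    """
    for table, columns in schema.items():
        session.run("MERGE (t:Table {name: $name})", name=table)
        for col in columns:
            session.run(
                "MATCH (t:Table {name: $t}) "
                "MERGE (c:Column {name: $c, table: $t}) "
                "MERGE (t)-[:HAS_COLUMN]->(c)",
                t=table, c=col,
            )
```

Because this runs once at ingest time, the agent never has to introspect the SQLite files at query time; it only consults the graph.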
Phase 2: Rebuild Agent with LangChain (Due by Tuesday, Oct 7th)
Goal: Re-architect the agent from a simple script into a robust LangChain-powered orchestrator that leverages the enhanced MCP server.
Task 2.1: Refactor Agent to use LangChain
- Overhaul `agent/main.py` to implement the `AgentExecutor` pattern from `langchain_integration.py`.
- Define a formal agent prompt that instructs the LLM on how to use the available tools to answer questions.
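The essence of the `AgentExecutor` pattern is a loop: the LLM either picks a tool or emits a final answer, the executor runs the tool, and the observation is fed back to the model. The dependency-free sketch below illustrates that loop with a stubbed model; the real implementation uses LangChain's executor and a live LLM.

```python
# Minimal sketch of the AgentExecutor loop without the LangChain
# dependency. fake_llm stands in for the model: a real agent would
# send the history to an LLM and parse its tool-call or final answer.

def fake_llm(history):
    if not any(step[0] == "schema_search" for step in history):
        return ("tool", "schema_search", "trials by phase")
    return ("final", "Found 3 matching tables.", None)

TOOLS = {"schema_search": lambda q: f"tables matching '{q}'"}

def run_agent(question: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        kind, payload, arg = fake_llm(history)
        if kind == "final":
            return payload
        observation = TOOLS[payload](arg)      # run the chosen tool
        history.append((payload, arg, observation))  # feed result back
    return "step limit reached"

print(run_agent("Which trials report lab results?"))
```

The `max_steps` cap mirrors LangChain's `max_iterations` safeguard, so a confused model cannot loop forever.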
Task 2.2: Implement Custom LangChain Tools
- Create a new `agent/tools.py` file.
- Implement custom LangChain tools that make authenticated REST API calls to our enhanced MCP server.
- The tools will include `SchemaSearchTool`, `JoinPathFinderTool`, and `QueryExecutorTool`; these act as clients to the logic we integrated into the MCP server in Phase 1.
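Each tool is essentially a thin authenticated HTTP client. The sketch below shows one of the three using only the standard library; the base URL, endpoint path, payload shape, and bearer-token auth are all assumptions to be replaced with the MCP server's actual API contract.

```python
import json
import urllib.request

MCP_BASE_URL = "http://localhost:8000"  # assumed MCP server address

class SchemaSearchTool:
    """Sketch of one custom tool: a thin authenticated REST client for
    the MCP server's schema-search endpoint (path and payload assumed)."""

    name = "schema_search"
    description = "Search the ingested schema graph for relevant tables."

    def __init__(self, token: str):
        self.token = token

    def build_request(self, query: str) -> urllib.request.Request:
        body = json.dumps({"query": query}).encode()
        return urllib.request.Request(
            f"{MCP_BASE_URL}/tools/schema_search",
            data=body,
            headers={
                "Authorization": f"Bearer {self.token}",
                "Content-Type": "application/json",
            },
            method="POST",
        )

    def run(self, query: str) -> str:
        with urllib.request.urlopen(self.build_request(query)) as resp:
            return resp.read().decode()
```

`JoinPathFinderTool` and `QueryExecutorTool` would follow the same shape against their own endpoints, so the LangChain agent sees three uniform callables.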
Task 2.3: Update Agent's Main Loop
- Modify the agent's main loop to delegate tasks to the LangChain `AgentExecutor` instead of handling instructions directly. The agent's primary role will now be to orchestrate the LangChain agent and log the results.
Phase 3: Build the Chat UI & Finalize (Due by Thursday, Oct 9th)
Goal: Replace the basic Streamlit monitor with a full-featured conversational chat interface and complete the final integration for the demo.
Task 3.1: Implement Conversational Chat UI
- Replace the entire contents of `streamlit/app.py` with the conversational UI logic from `semantic-query-router/src/chat_app.py`.
- Adapt the UI to use our project's MCP REST API (instead of WebSocket) for submitting questions and fetching results.
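The core of the adaptation is the chat-state handling. The sketch below shows that logic in plain Python: in the real app the `history` list would live in `st.session_state` and each turn would be rendered with `st.chat_message`, while `ask_mcp` (stubbed here) would POST to the MCP REST API.

```python
# Plain-Python sketch of the chat transcript logic the Streamlit UI
# needs; st.session_state storage and st.chat_message rendering are
# omitted, and the MCP REST call is stubbed via the ask_mcp callable.

def append_turn(history: list[dict], role: str, content: str) -> list[dict]:
    history.append({"role": role, "content": content})
    return history

def handle_question(history: list[dict], question: str, ask_mcp) -> list[dict]:
    """Record the user's question, call the MCP API, record the answer."""
    append_turn(history, "user", question)
    answer = ask_mcp(question)  # real UI: POST to the MCP REST endpoint
    return append_turn(history, "assistant", answer)

history = handle_question([], "How many trials are in phase 3?",
                          ask_mcp=lambda q: "stub answer")
```

Keeping this logic free of Streamlit calls also makes it unit-testable without spinning up the UI.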
Task 3.2: Integrate Demo-Specific Features
- Ensure the new Streamlit UI includes the required demo features:
  - Display of execution phases (e.g., "Searching Schema," "Finding Join Path," "Executing Query").
  - A final results view showing both the agent's natural language summary and a clean data table (Pandas DataFrame) of the raw results.
  - A "Download CSV" button for the results table.
  - A sidebar displaying the connection status of the Neo4j and SQLite databases.
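For the "Download CSV" button, the results need to be serialized to CSV text. A minimal standard-library helper, sketched under the assumption that results arrive as a list of dicts; in the UI its output would be passed to `st.download_button(..., mime="text/csv")` (or produced via `DataFrame.to_csv` if we already hold a DataFrame).

```python
import csv
import io

def rows_to_csv(rows: list[dict]) -> str:
    """Serialize query-result rows (list of dicts) to CSV text."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```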
Task 3.3: Final Integration and Testing
- Perform end-to-end testing of the full workflow: from asking a question in the Streamlit app to the agent's orchestration and the final result display.
- Clean up any unused files and finalize the `README.md` with updated instructions.