Spaces:

ohmygaugh
/

agent-mcp-sql

No application file

File size: 10,463 Bytes

f831e98
 
 
 
 
da2713e

#### Quick start prompt:  
Scan the repo: list the directory tree, key config files, and required env vars. Summarize how to start the system using existing scripts. Do not modify files; just report and wait for confirmation before any changes. 
Proceed with a fresh start by leveraging: powershell -ExecutionPolicy Bypass -File ops/scripts/fresh_start.ps1


# Graph-Driven Agentic System with Human-in-the-Loop Controls

## What This System Is

This is a **production-ready agentic workflow orchestration system** that demonstrates how to build AI agents with human oversight and complete audit trails. The system combines:

- **🤖 Autonomous AI Agent**: Processes natural language queries and generates SQL
- **📊 Graph Database**: Neo4j stores all workflow metadata and audit trails
- **⏸️ Human-in-the-Loop**: Configurable pause points for human review and intervention
- **🎯 Single API Gateway**: All operations routed through MCP (Model Context Protocol) server
- **🌐 Real-time Interface**: React frontend with live workflow visualization
- **🔍 Complete Observability**: Every action logged with timestamps and relationships

## What It Does

### Core Workflow
1. **User asks a question** in natural language via the web interface
2. **System creates a workflow** with multiple instruction steps in Neo4j
3. **Agent discovers the question** and begins processing
4. **Pause for human review** (5 minutes by default, configurable)
5. **Human can edit instructions** during pause via Neo4j Browser
6. **Agent generates SQL** from natural language using LLM
7. **Agent executes SQL** against PostgreSQL database
8. **Results displayed** in formatted table with complete audit trail

### Architecture Components
```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Frontend  │────│ MCP Server  │────│    Neo4j    │
│  (Next.js)  │    │ (FastAPI)   │    │  (Graph)    │
└─────────────┘    └─────────────┘    └─────────────┘
                           │
                    ┌─────────────┐    ┌─────────────┐
                    │    Agent    │────│ PostgreSQL  │
                    │  (Python)   │    │   (Data)    │
                    └─────────────┘    └─────────────┘
```

- **Neo4j Graph Database**: Stores workflows, instructions, executions, and logs
- **MCP Server**: FastAPI gateway for all Neo4j operations with parameter fixing
- **Python Agent**: Polls for instructions, pauses for human input, executes tasks
- **PostgreSQL**: Sample data source for SQL generation and execution
- **Next.js Frontend**: Chat interface with Cytoscape.js graph visualization

## Why It's Valuable

### 🎯 **Demonstrates Production Patterns**
- **Human Oversight**: Shows how to build AI systems with meaningful human control
- **Audit Trails**: Complete graph-based logging of all operations and decisions
- **Error Recovery**: System continues gracefully after interruptions or edits
- **Scalable Architecture**: Clean separation of concerns, containerized deployment

### 🔄 **Agentic Workflow Orchestration**
- **Graph-Driven**: Workflows stored as connected nodes, not brittle state machines
- **Dynamic Editing**: Instructions can be modified during execution
- **Sequence Management**: Proper instruction chaining and dependency handling
- **Status Tracking**: Real-time visibility into workflow progress

### 🛡️ **Human-in-the-Loop Controls**
- **Configurable Pauses**: Built-in review periods before critical operations
- **Live Editing**: Modify AI behavior during execution via graph database
- **Stop Controls**: Terminate workflows at any point
- **Parameter Updates**: Change questions, settings, or instructions mid-flight

### 📊 **Complete Observability**
- **Graph Visualization**: Real-time workflow progress with color-coded status
- **Audit Logging**: Every MCP operation logged with timestamps
- **Execution Tracking**: Full history of what was generated and executed
- **Result Storage**: All outputs preserved in queryable graph format

### 🚀 **Production Ready**
- **Containerized**: Full Docker Compose setup with health checks
- **Environment Configuration**: Flexible .env-based configuration
- **Error Handling**: Graceful failures and recovery mechanisms
- **Documentation**: Comprehensive setup, usage, and troubleshooting guides

## How to Make It Run

### Quick Start (5 minutes)

```bash
# 1. Clone and navigate to the repo
git clone <repository-url>
cd <repository-name>

# 2. Copy environment template
cp .env.example .env

# 3. Add your LLM API key to .env
# Edit .env and set: LLM_API_KEY=your-openai-or-anthropic-key-here

# 4. Start everything
docker-compose up -d

# 5. Seed Neo4j with demo data (IMPORTANT!)
docker-compose exec mcp python /app/ops/scripts/seed.py

# 6. Open the interface
# Frontend: http://localhost:3000
# Neo4j Browser: http://localhost:7474 (neo4j/password)
```

### Database Seeding Options

**Basic Seeding** (Quick demo):
```bash
docker-compose exec mcp python /app/ops/scripts/seed.py
```
Creates:
- **Demo Workflow**: A 3-step process (discover schema → generate SQL → review results)  
- **Query Examples**: 3 basic SQL templates for testing
- **Graph Structure**: Proper relationships between components

**Comprehensive Seeding** (Full system):
```bash
docker-compose exec mcp python /app/ops/scripts/seed_comprehensive.py
```
Creates:
- **Workflow Templates**: Multiple workflow patterns (basic query, analysis, reporting)
- **Instruction Type Library**: 6 different instruction types with schemas
- **Query Library**: 6+ categorized SQL examples (basic, analytics, detailed)
- **Demo Workflows**: Ready-to-run and template workflows
- **System Configuration**: Default settings and supported features

**⚠️ Fresh Installation**: On a brand-new machine, Neo4j starts completely empty. You MUST run a seed script to have any workflows or instructions to interact with.

**💡 Recommendation**: Use comprehensive seeding for full system exploration, basic seeding for quick demos.

### PowerShell Fresh Start (Windows)
```powershell
# Fresh deployment with API key
powershell -ExecutionPolicy Bypass -File ops/scripts/fresh_start.ps1 -ApiKey "your-api-key-here"

# Or run the demo (assumes system is already running)
powershell -ExecutionPolicy Bypass -File ops/scripts/demo.ps1
```

### Manual Health Check
```bash
# Check all services
docker-compose ps

# Validate system
docker-compose exec mcp python /app/ops/scripts/validate.py

# Monitor logs
docker-compose logs -f agent
```

### Test the System

1. **Open http://localhost:3000**
2. **Ask a question**: "How many customers do we have?"
3. **Watch the workflow**:
   - Graph visualization shows progress
   - Agent pauses for 5 minutes
   - You can edit instructions in Neo4j Browser
   - Results appear in formatted table

### Clean Reset
```bash
# Stop and clean everything
docker-compose down
docker-compose up -d
docker-compose exec mcp python /app/ops/scripts/seed.py
```

## Key Features for Developers

### Graph Database Schema
- **Workflow** nodes: High-level process containers
- **Instruction** nodes: Individual tasks with parameters and status
- **Execution** nodes: Results of instruction processing
- **Log** nodes: Audit trail of all MCP operations
- **Relationships**: `HAS_INSTRUCTION`, `EXECUTED_AS`, `NEXT_INSTRUCTION`

### Configuration Options
- **Pause Duration**: `PAUSE_DURATION` in .env (default: 300 seconds)
- **Polling Interval**: `AGENT_POLL_INTERVAL` in .env (default: 30 seconds)
- **LLM Model**: `LLM_MODEL` in .env (gpt-4, claude-3-sonnet, etc.)

### Extension Points
- **New Instruction Types**: Add handlers in `agent/main.py`
- **Custom Data Sources**: Extend MCP server with new connectors
- **Frontend Customization**: Modify React components in `frontend/app/`
- **Workflow Templates**: Create reusable instruction sequences

### Human Intervention Examples
```cypher
// Find pending instructions
MATCH (i:Instruction {status: 'pending'}) RETURN i

// Change a question
MATCH (i:Instruction {type: 'generate_sql', status: 'pending'})
SET i.parameters = '{"question": "Show me top 10 customers by revenue"}'

// Stop a workflow
MATCH (w:Workflow {status: 'active'})
SET w.status = 'stopped'
```

## Development Setup

### Prerequisites
- Docker & Docker Compose
- OpenAI or Anthropic API key
- Modern web browser

### Project Structure
```
├── agent/          # Python agent that executes instructions
├── frontend/       # Next.js chat interface
├── mcp/           # FastAPI server for Neo4j operations
├── neo4j/         # Neo4j configuration
├── postgres/      # PostgreSQL setup with sample data
├── ops/scripts/   # Operational scripts (seed, validate, demo)
├── docker-compose.yml
├── Makefile       # Convenience commands
└── README.md      # Detailed documentation
```

### Available Commands
```bash
# If you have make installed
make up          # Start all services
make seed        # Create demo data
make health      # Check service health
make logs        # View all logs
make clean       # Reset everything

# Using docker-compose directly
docker-compose up -d
docker-compose exec mcp python /app/ops/scripts/seed.py
docker-compose ps
docker-compose logs -f
docker-compose down
```

## Use Cases

### 🏢 **Enterprise AI Governance**
- Audit trails for compliance
- Human oversight for critical decisions
- Risk management in AI operations

### 🔬 **Research & Development**
- Experiment with agentic workflows
- Study human-AI collaboration patterns
- Prototype autonomous systems with safety controls

### 📚 **Educational Examples**
- Demonstrate production AI architecture
- Teach graph database concepts
- Show containerized deployment patterns

### 🛠️ **Template for New Projects**
- Fork as starting point for agentic systems
- Adapt components for specific domains
- Scale architecture for production workloads

---

**This system demonstrates that AI agents can be both autonomous and controllable, providing the benefits of automation while maintaining human oversight and complete transparency.**