Data_analysis_agent / ARCHITECTURE.md
rohitdeshmukh318's picture
docs: Add detailed ARCHITECTURE.md outlining system logic and tools
bcaacea

System Architecture & Design Document

Overview

The Autonomous Multi-Agent Data Analytics System is a production-grade, state-graph-based AI agent designed to act as an expert data analyst. It converts natural language queries into safe, executable database operations, retrieves exact datasets, visualizes them, and provides human-readable insights.

This document details the architectural decisions, the system workflow, the tools chosen, and the logic underpinning all operations.


๐Ÿ› ๏ธ Technology Stack & "Why" We Chose It

Technology Role Why We Chose It
LangGraph Agent Orchestrator Unlike linear pipelines (like standard LangChain chains), LangGraph allows for cyclic execution. This is critical for our Self-Correction Loop where the agent writes code, hits a syntax error, routes backward, and rewrites the code.
FastAPI Backend API Framework Provides highly concurrent, asynchronous endpoint handling which is essential for our parallel LLM node execution. Automatic OpenAPI documentation (Swagger) is a plus.
React + Vite + Tailwind Frontend Client Vite offers extremely fast HMR (Hot Module Replacement) during development. Tailwind CSS allows for rapid, premium UI component styling, and React's state management perfectly handles the chat streams.
Groq (LLaMA-3.1-70B & 8B) LLM Provider Groq's LPU architecture delivers staggering inference speeds (often >300 tokens/sec). The 70B model handles complex SQL/Pandas generation, while the 8B model is used for extremely fast sub-tasks like intent routing and insight synthesis.
Neon PostgreSQL Metadata & Schema Vector DB Neon offers serverless Postgres with the pgvector extension natively installed. We use it to perform cosine similarity searches over compressed schema tables.
Upstash Redis High-speed Caching Used to cache expensive LLM execution results. If an exact generated SQL query was executed recently, Redis serves the result in <50ms.

๐Ÿ”„ System Workflow & Node Logic

The core logic is orchestrated as a State Machine where a global AgentState is passed from node to node.

1. Intent Router (intent_router.py)

  • Logic: The first node of the graph. It intercepts the natural language query and categorizes it using the fast 8B model into one of three intents: sql, pandas, or insight_only.
  • Implementation Detail: Contains a hardcoded safety override. If a user asks for a "chart" or "visualization", but the LLM tries to classify it as insight_only (which would bypass data extraction), the router forces it back into sql mode.

2. Query Planner & Schema RAG (query_planner.py & graph.py)

  • Logic: Instead of dumping a massive database schema into the LLM prompt (which causes context bloat and hallucination), we perform Retrieval-Augmented Generation (RAG) on the schema.
  • Implementation: The user's query is converted to a vector embedding. We query the schema_embeddings table in our Neon DB to fetch the top 15 most relevant tables. This strict filtering guarantees the AI focuses only on the necessary schema structures.
  • Memory Retrieval: Simultaneously, we query past successful conversational queries and their results. If a user says "filter that by last month", the memory retrieval injects the past SQL query into the context.

3. Code Generation (sql_generator.py / pandas_generator.py)

  • Logic: Uses the heavyweight LLaMA-3.1-70B-versatile model to generate raw syntax.
  • Implementation: It is strictly prompted to output only executable code without markdown fences or explanations to prevent parsing errors down the line.

4. AST Safety Validator (safety_validator.py)

  • Logic: Security is paramount when an AI writes database queries. This pure-Python node verifies the generated code without making any network calls.
  • Implementation:
    • For SQL, it uses regex/parsing to block destructive commands (DROP, DELETE, UPDATE, ALTER, etc.).
    • For Pandas, it parses the Python Abstract Syntax Tree (AST) using Python's built-in ast module to ensure the code does not attempt to import os, sys, or access the local filesystem.

5. Execution Engine (executor.py)

  • Logic: Executes the validated code against the active data connector (Postgres, SQLite, or CSV).
  • Implementation:
    • Caching: Before executing, it hashes the generated SQL string and checks Upstash Redis. If a cache hit occurs, it returns the result in milliseconds.
    • Execution Guardrails: Hard-caps the returned rows to a maximum of 500 to prevent massive memory payloads from crashing the FastAPI server.

6. Error Classification & Self-Correction (error_classifier.py & self_corrector.py)

  • Logic: If execution fails (e.g., column not found, type mismatch), the graph does not just fail. It routes to the Error Classifier.
  • Implementation: The classifier reads the Postgres/Python traceback and maps it to a predefined error class (e.g., nonexistent_table, syntax). The Self-Corrector node takes this specific hint alongside the original failed code, queries the LLM to fix it, and re-routes the state back to the Validation and Execution nodes (up to 3 maximum attempts).

7. Concurrent Output Pipeline (graph.py: _output_pipeline)

  • Logic: Once data is successfully retrieved, three independent tasks must be performed: Anomaly Detection, Chart Visualization, and Natural Language Insight Synthesis.
  • Implementation: To minimize latency, these three nodes are submitted to a ThreadPoolExecutor and run concurrently. The system's final latency for this block is bounded entirely by the slowest node (the LLM Insight Synthesizer), reducing sequential execution time drastically.

๐Ÿงฉ Key Architecture Implementations

Dynamic AI Chat Suggestions

Instead of providing users with static "placeholder" questions (e.g., "What are my top products?"), the system uses a dedicated /api/query/suggest endpoint. When a user connects a database, the backend fetches the top 10 tables of their schema and asks the LLM to generate 3-4 custom, highly relevant analytical questions. These are then streamed to the React frontend, allowing the user to begin data analysis in one click without knowing the database layout.

Abstracted Connector System

To support various data sources cleanly, we implemented a BaseConnector abstract class (connectors/base.py). PostgresConnector, CSVConnector, and SQLiteConnector all inherit from this base class and must implement get_schema(), execute_sql(), and load_dataframe(). This factory pattern allows us to trivially add new data sources (like BigQuery or Snowflake) in the future without changing the core LangGraph agent logic.

Statelessness for Deployment

The application is designed to be completely stateless to support horizontal scaling on platforms like Hugging Face Spaces or Railway.

  • All session history is managed via Session IDs.
  • Caching is offloaded to Upstash Redis.
  • Long-term memory is persisted in the Neon PostgreSQL metadata database. If the FastAPI pod restarts, no conversational context or data analysis progress is permanently lost.