GodSpeed / Docs /ARCHITECTURE.md
Ananth Shyam

Complete System Architecture

Document purpose: System-wide architecture covering backend, frontend, database, deployment, and all component interactions. Read this to understand how all pieces fit together.


Table of Contents

  1. High-Level System Diagram
  2. Backend Architecture (src/)
  3. Frontend Architecture (frontend/)
  4. API Contract
  5. Data Flow
  6. Deployment Architecture

High-Level System Diagram

┌────────────────────────────────────────────────────────────────────────────────┐
│                           EXTERNAL DATA SOURCES                                │
│  ┌──────────┐ ┌────────────┐ ┌────────┐ ┌───────┐ ┌───────┐ ┌───────────────┐  │
│  │ Notion   │ │ Confluence │ │ GitHub │ │ Slack │ │ Jira  │ │URLs + Firecrawl│  │
│  └──────────┘ └────────────┘ └────────┘ └───────┘ └───────┘ └───────────────┘  │
└────────────────────┬───────────────────────────────────────────────────────────┘
                     │ (Webhooks + Polling)
                     ▼
┌────────────────────────────────────────────────────────────────────────────────┐
│                            BACKEND (Python/FastAPI)                            │
│  ┌─────────────────┐  ┌──────────────────┐  ┌─────────────────────────────┐    │
│  │  Data Ingestion │  │ RAG + Retrieval  │  │  Analytics & Intelligence   │    │
│  │  ├─ Adapters    │  │ ├─ Hybrid search │  │  ├─ Query events            │    │
│  │  ├─ Docling     │  │ ├─ BGE-M3        │  │  ├─ Knowledge graph         │    │
│  │  ├─ GLiNER PII  │  │ ├─ Qdrant        │  │  └─ Anomaly detection       │    │
│  │  └─ Chunking    │  │ └─ LLM agents    │  └─────────────────────────────┘    │
│  └─────────────────┘  └──────────────────┘                                     │
│         ▲                      ▲                          ▲                    │
│         │                      │                          │                    │
│  ┌──────┴──────────────────────┴──────────────────────────┴───────────┐        │
│  │              FastAPI Backend (Uvicorn)                             │        │
│  │              ├─ /api/query/* (search + follow-up)                  │        │
│  │              ├─ /api/analytics/* (dashboards)                      │        │
│  │              ├─ /api/admin/* (data source management)              │        │
│  │              └─ /ws (WebSocket for real-time alerts)               │        │
│  └──────┬─────────────────────────────────────────────────────────────┘        │
│         │                                                                      │
│  ┌──────▼─────────────────────────────────────────────────────────────┐        │
│  │     Data Layer (PostgreSQL, Qdrant, Neo4j, Redis, S3)              │        │
│  │  ├─ PostgreSQL: Metadata, RBAC, audit trails, queries              │        │
│  │  ├─ Qdrant: Vector embeddings (dense + sparse)                     │        │
│  │  ├─ Neo4j: Knowledge graph (Service/Library/Incident/Team)         │        │
│  │  ├─ Redis: Cache, session state, pub/sub, task queues              │        │
│  │  └─ S3: PDFs, user uploads, exports                                │        │
│  └──────┬─────────────────────────────────────────────────────────────┘        │
└─────────┼──────────────────────────────────────────────────────────────────────┘
          │
          │ (REST API + WebSocket)
          ▼
┌────────────────────────────────────────────────────────────────────────────────┐
│                        FRONTEND (React/TypeScript)                             │
│  ┌───────────────────────┐  ┌──────────────────┐  ┌─────────────────────────┐  │
│  │  Query Interface      │  │  Dashboards      │  │  Admin UI               │  │
│  │  ├─ Search box        │  │  ├─ Query trends │  │  ├─ Data source mgmt    │  │
│  │  ├─ Results display   │  │  ├─ Knowledge    │  │  ├─ User management     │  │
│  │  ├─ Citations         │  │  │   health      │  │  ├─ RBAC editor         │  │
│  │  ├─ Follow-ups        │  │  ├─ Dependencies │  │  ├─ API keys            │  │
│  │  └─ Knowledge graph   │  │  └─ Alerts       │  │  └─ System health       │  │
│  └───────────────────────┘  └──────────────────┘  └─────────────────────────┘  │
│  ┌──────────────────────────────────────────────────────────────────────────┐  │
│  │  Component Layer (shadcn/ui + Tailwind)                                  │  │
│  │  ├─ Query & Search components                                            │  │
│  │  ├─ Chart & data table components (Recharts, TanStack Table)             │  │
│  │  ├─ Knowledge graph visualizer (Force-Graph)                             │  │
│  │  ├─ Authentication flow (JWT)                                            │  │
│  │  └─ Real-time notifications (WebSocket)                                  │  │
│  └──────────────────────────────────────────────────────────────────────────┘  │
│  ┌──────────────────────────────────────────────────────────────────────────┐  │
│  │  State Management (TanStack Query + Zustand)                             │  │
│  │  ├─ Server state: Queries, analytics, user data (TanStack Query)         │  │
│  │  └─ Client state: UI state, theme, filters (Zustand)                     │  │
│  └──────────────────────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────────────────┘

Backend Architecture (src/): Agent-Based Design

Core Principle: Per-Source Agents

Rather than generic adapters flowing through a single pipeline, each data source is an independent agent with:

  • Source-specific authentication & adapters
  • Source-optimized chunking (preserves context like Confluence breadcrumbs, Jira comment threading)
  • Independent Celery tasks (different polling cadences, priorities)
  • Independent FastAPI routers (explicit webhooks like /webhooks/jira)
  • Self-contained testing (test_run.py per agent)

This design lets each source scale independently, keeps operational boundaries clear (one router, one set of Celery tasks, and one test script per source), and confines most changes to a single agent.
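As a rough illustration of source-optimized chunking, the sketch below mirrors the Jira layout described in the directory listing (chunk 0 is the issue body, chunks 1..N are the comments in thread order). The `Chunk` dataclass, its field names, and the `chunk_issue` signature are assumptions made for this example; they are not the actual `chunker.py` API.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical chunk record (the real models live in shared/models.py)."""
    source_id: str
    index: int
    kind: str  # "body" or "comment"
    text: str


def chunk_issue(issue_key: str, body: str, comments: list[str]) -> list[Chunk]:
    """Return chunk 0 for the issue body and one chunk per comment, in thread order."""
    chunks = [Chunk(issue_key, 0, "body", body.strip())]
    for i, comment in enumerate(comments, start=1):
        # Keeping each comment as its own chunk preserves the thread structure.
        chunks.append(Chunk(issue_key, i, "comment", comment.strip()))
    return chunks
```

Keeping the index aligned with comment order is what would let a downstream relation extractor reason about who replied to what.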

Directory Structure

Note: The actual repo layout diverges from early plans. The implemented structure is below. src/query_engine/ and src/retrieval/ referenced in earlier design docs do not exist; that logic lives in agent/. Graph endpoints live in graph_store/, not src/api/graph.py.

agent/                          # LangGraph multi-agent query engine (IMPLEMENTED)
├── api.py                      # POST /agent/query: SSE streaming endpoint
├── graph.py                    # LangGraph build: planner → [doc_search|ticket_lookup|live_docs|sql_query] → join → synthesiser → guardrail
├── models.py                   # KnowledgeGraphState, QueryInput, ExecutionPlan, AgentResult, RetrievedChunk
├── config.py                   # LLM + agent config
├── prompts.py                  # Prompt templates
├── agents/
│   ├── planner.py              # Breaks query into AgentTask list
│   ├── synthesiser.py          # Streams answer tokens from top chunks
│   ├── guardrail.py            # Validates answer against sources; sets escalate flag
│   └── _gemini.py              # Gemini client helper (used in planner/synthesiser)
└── tools/
    ├── doc_search.py           # Qdrant hybrid dense+sparse search
    ├── ticket_lookup.py        # Jira-specific retrieval
    ├── live_docs.py            # Firecrawl real-time doc fetching
    ├── sql_query.py            # NL-to-SQL: translates query → validated SELECT → asyncpg execution
    └── summariser.py           # Context compression before synthesis

graph_store/                    # Neo4j knowledge graph (IMPLEMENTED)
├── api.py                      # GET /graph/nodes, POST /graph/ingest, GET /graph/traverse
├── stream.py                   # WS /graph/stream: streams nodes+edges with 50ms delay
├── extractor.py                # Gemini 2.5 Pro entity+relationship extraction (4 types, whitelist rels)
├── writer.py                   # Async Neo4j MERGE upserts, index creation
├── reader.py                   # Cypher traversal: incident→service→library→chunks
├── models.py                   # ExtractedEntity, ExtractedRelationship, ExtractionResult
└── config.py                   # Neo4j connection settings

src/
├── agents_app.py               # Combined FastAPI app: all agent routers + Qdrant/Redis init
│
├── jira_agent/                 # JIRA ingestion agent (IMPLEMENTED)
│   ├── __init__.py
│   ├── config.py               # JiraAgentConfig: JIRA_BASE_URL, JIRA_EMAIL, JIRA_API_TOKEN,
│   │                           #   JIRA_PROJECT_KEYS (csv), JIRA_WEBHOOK_SECRET, TEAM_ID
│   ├── adapter.py              # JiraAdapter: fetch_issue, fetch_all (JQL), fetch_incremental
│   │                           #   Basic auth (base64 email:api_token), ADF text extraction
│   ├── chunker.py              # chunk_jira_issue → chunk 0: issue body, chunks 1..N: comments
│   │                           #   Preserves thread structure for relation extraction
│   ├── pipeline.py             # ingest_issue / ingest_project → chunk → PII mask → embed → Qdrant
│   │                           #   Returns entity graph nodes for real-time streaming
│   ├── tasks.py                # Celery: jira_process_issue (queue=critical),
│   │                           #   jira_sync_project (queue=polling)
│   ├── router.py               # FastAPI: POST /webhooks/jira, POST /jira/sync/{project_key}
│   └── test_run.py             # Mock + real runthrough; works without credentials
│
├── confluence_agent/           # Confluence ingestion agent (IMPLEMENTED)
│   ├── __init__.py
│   ├── config.py               # ConfluenceAgentConfig: BASE_URL, TOKEN, EMAIL,
│   │                           #   CONFLUENCE_SPACES (csv), CONFLUENCE_WEBHOOK_SECRET, TEAM_ID
│   ├── adapter.py              # ConfluenceAdapter: fetch_page, fetch_space, fetch_incremental (CQL)
│   │                           #   REST v2 API with pagination
│   ├── chunker.py              # chunk_confluence_page: BeautifulSoup heading-split + breadcrumbs
│   │                           #   [Space > Ancestor > Page] prefix on every chunk; tables = 1 chunk each
│   │                           #   Preserves hierarchy for entity linking
│   ├── pipeline.py             # ingest_page / ingest_space → chunk → PII mask → embed → Qdrant
│   │                           #   Returns entity graph nodes
│   ├── tasks.py                # Celery: confluence_process_page (queue=critical),
│   │                           #   confluence_sync_space (queue=polling),
│   │                           #   confluence_periodic_sync (beat, 60 min incremental sync)
│   ├── router.py               # FastAPI: POST /webhooks/confluence, POST /confluence/sync/{space_key}
│   │                           #   POST /confluence/search (for admin dashboard)
│   └── test_run.py             # Mock + real runthrough; works without credentials
│
├── file_agent/                 # File ingestion agent (IMPLEMENTED)
│   ├── __init__.py
│   ├── config.py               # FileAgentConfig: UPLOAD_DIR, MAX_FILE_SIZE, ALLOWED_TYPES
│   ├── adapter.py              # FileAdapter: handles PDFs, DOCX, PPTX, TXT
│   │                           #   Uses docling for multi-format parsing
│   ├── chunker.py              # chunk_file_document: respects document structure (sections, pages)
│   ├── pipeline.py             # ingest_file → chunk → PII mask → embed → Qdrant
│   ├── tasks.py                # Celery: file_process_upload (queue=critical)
│   ├── router.py               # FastAPI: POST /files/upload, GET /files/{file_id}
│   └── test_run.py
│
├── shared/                     # Shared utilities (used by all agents)
│   ├── __init__.py
│   ├── pii_masker.py           # GLiNER-based PII detection (local, zero egress)
│   ├── embedder.py             # BGE-M3 embeddings (local inference)
│   ├── qdrant_client.py        # Qdrant connection + upsert helpers
│   ├── entity_extractor.py     # Extract entities/relationships from chunks (used per-agent)
│   ├── models.py               # Pydantic models (RawDocument, ChunkedDocument, Entity, Graph)
│   └── config.py               # Shared config (QDRANT_URL, REDIS_URL, etc.)
│
├── retrieval/                  # T1, T2, T3 retrieval layers (from earlier design docs; not in the repo, logic lives in agent/)
│   ├── __init__.py
│   ├── hybrid_search.py        # T1: Dense + Sparse (RRF fusion), queries Qdrant
│   ├── reranker.py             # BGE-reranker-v2-m3 integration
│   ├── context_compressor.py   # Compress top-5 into LLM context
│   ├── cag_agent.py            # T2: Cache-Augmented Generation (recent syncs)
│   ├── live_doc_agent.py       # T3: Real-time doc fetching (Firecrawl)
│   └── models.py               # Pydantic models for retrieval
│
├── query_engine/               # Query execution, LangGraph-based (from earlier design docs; not in the repo, logic lives in agent/)
│   ├── __init__.py
│   ├── generator_agent.py      # Generator LLM agent (creates answer from context)
│   ├── critic_agent.py         # Critic LLM agent (validates against sources)
│   ├── orchestrator.py         # LangGraph: routes query through retrieval → generation → validation
│   ├── streaming.py            # Stream answer chunks + citations + graph to frontend
│   └── models.py               # Pydantic models for query responses
│
├── redis/                      # Redis utilities (shared)
│   ├── __init__.py
│   ├── cache.py                # Caching layer (with TTL)
│   ├── queues.py               # Task queues (per-agent ingestion, webhook events)
│   ├── session_state.py        # Query session state
│   ├── locks.py                # Distributed locks (prevent concurrent agent syncs)
│   └── pubsub.py               # Pub/sub for real-time graph updates to frontend (query_id → node)
│
├── api/                        # FastAPI main app + shared endpoints
│   ├── __init__.py
│   ├── auth.py                 # POST /auth/login, /auth/logout, /auth/refresh
│   ├── query.py                # POST /api/query (streaming), /api/query/{id}/follow-up
│   ├── workspace.py            # GET/POST /api/workspace/queries, /saved
│   ├── admin.py                # GET /api/admin/agents (show all agent statuses)
│   └── graph.py                # GET /api/graph/entities, /api/graph/query/{query_id} (planned; graph endpoints currently live in graph_store/)
│
├── db/                         # Database models & utilities
│   ├── __init__.py
│   ├── models.py               # SQLAlchemy models (User, Query, Document, Entity, Graph)
│   ├── session.py              # Database session management
│   └── init_db.py              # Schema initialization
│
├── auth/                       # Authentication & authorization
│   ├── __init__.py
│   ├── jwt_handler.py          # JWT encode/decode, token refresh
│   ├── oauth.py                # OAuth2 + SSO integration (phase 2)
│   ├── rbac.py                 # Role-based access control decorator
│   ├── permissions.py          # Permission checks
│   └── models.py               # User, Role, Permission models
│
├── utils/                      # Shared utilities
│   ├── __init__.py
│   ├── logger.py               # Structured logging (JSON)
│   ├── metrics.py              # Prometheus metrics
│   ├── telemetry.py            # OpenTelemetry (phase 2)
│   └── exceptions.py           # Custom exceptions
│
└── tests/                      # Comprehensive test suite
    ├── __init__.py
    ├── agents/                 # Per-agent tests (JIRA, Confluence, File)
    ├── retrieval/              # Retrieval pipeline tests
    ├── query_engine/           # Query generation + validation tests
    ├── fixtures/               # Pytest fixtures (mock data)
    └── integration/            # End-to-end scenarios
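The Confluence chunker comment in the listing describes a `[Space > Ancestor > Page]` prefix on every chunk. A minimal sketch of that idea follows; the function names and the choice to join prefix and text with a single space are assumptions for illustration, not the behaviour of the real `chunk_confluence_page`.

```python
def breadcrumb_prefix(space: str, ancestors: list[str], page: str) -> str:
    """Build a '[Space > Ancestor > Page]' breadcrumb from the page hierarchy."""
    return "[" + " > ".join([space, *ancestors, page]) + "]"


def prefix_sections(space: str, ancestors: list[str], page: str,
                    sections: list[str]) -> list[str]:
    """Prepend the breadcrumb to every heading-level section so each chunk keeps its context."""
    prefix = breadcrumb_prefix(space, ancestors, page)
    return [f"{prefix} {section.strip()}" for section in sections]
```

For example, `breadcrumb_prefix("ENG", ["Runbooks"], "Deploy")` yields `[ENG > Runbooks > Deploy]`, so a chunk retrieved in isolation still says where it came from.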

Key Backend Design Decisions

  1. Per-Source Agents: Each source (Jira, Confluence, File) is an independent module with its own adapter, chunker, pipeline, and Celery tasks. This enables source-specific optimization and independent scaling.

  2. Source-Optimized Chunking:

    • Confluence: Preserves [Space > Ancestor > Page] hierarchy for entity linking
    • Jira: Preserves comment threading for relation extraction
    • File: Respects document structure (sections, pages)
    • Each source extracts its own entity relationships
  3. Independent Celery Scheduling:

    • jira_sync_project → configurable interval (often 1 hour)
    • confluence_periodic_sync → beat scheduler (60 min incremental)
    • file_process_upload → immediate (queue=critical)
    • Each agent controls its own cadence
  4. PII Masking First: GLiNER runs in shared/pii_masker.py (local, zero-egress) before any chunk is indexed in Qdrant.

  5. Entity Extraction Per-Agent: Each pipeline returns a graph of entities + relationships (e.g., Jira: issue→linked_issue, Confluence: page→linked_page). Frontend streams these nodes as they're extracted.

  6. Real-Time Graph Streaming: Via Redis pub/sub (query_id → {nodes, edges}), so the frontend doesn't wait for full completion.

  7. Redis Everywhere: Cache, queues, session state, distributed locks, and pub/sub all via Redis.

  8. Hybrid Retrieval (T1): Dense (BGE-M3) + Sparse (BM25), fused via RRF; queries Qdrant.
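To make the RRF step in decision 8 concrete, here is a small sketch of Reciprocal Rank Fusion over two ranked lists of document IDs. It is a standalone illustration, not the project's retrieval code: the function name, the tie-breaking rule, and the constant `k = 60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_hits = ["a", "b", "c"]   # e.g. ranked by dense similarity
sparse_hits = ["b", "d", "a"]  # e.g. ranked by sparse/BM25 score
fused = rrf_fuse([dense_hits, sparse_hits])
```

Because RRF only uses ranks, the dense and sparse scores never need to be normalised against each other, which is the usual reason for choosing it in hybrid search.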


Frontend Architecture (frontend/)

Directory Structure

frontend/
├── index.html                   # Entry HTML (Vite serves this)
├── vite.config.ts              # Vite build config
├── tsconfig.json               # TypeScript config
├── tailwind.config.ts          # Tailwind design tokens + dark mode
├── postcss.config.js           # PostCSS + Tailwind plugins
├── package.json                # Dependencies + scripts
├── .env.example                # Required environment variables
│
├── src/
│   ├── main.tsx                # React app entry point
│   ├── App.tsx                 # Root component + routing
│   │
│   ├── components/
│   │   ├── common/             # Reusable components
│   │   │   ├── Header.tsx      # Top nav bar
│   │   │   ├── Sidebar.tsx     # Left navigation
│   │   │   ├── Footer.tsx      # Footer
│   │   │   ├── Button.tsx      # Button variants (from shadcn)
│   │   │   ├── Input.tsx       # Text input (from shadcn)
│   │   │   ├── Card.tsx        # Card container
│   │   │   ├── Modal.tsx       # Modal/dialog
│   │   │   ├── Badge.tsx       # Status badges
│   │   │   ├── Tooltip.tsx     # Tooltips
│   │   │   ├── Toast.tsx       # Toast notifications
│   │   │   └── Loading.tsx     # Loading skeleton
│   │   │
│   │   ├── query/              # Query interface (Engineer primary)
│   │   │   ├── SearchBox.tsx   # Main search input (Cmd+K support)
│   │   │   ├── QueryModal.tsx  # Modal for new query
│   │   │   ├── QueryHistory.tsx # Query history panel
│   │   │   ├── SuggestedTopics.tsx # Related queries
│   │   │   └── QueryFeedback.tsx # Thumbs up/down
│   │   │
│   │   ├── results/            # Results display + knowledge graph
│   │   │   ├── ResultsPage.tsx # Main results container
│   │   │   ├── Answer.tsx      # Generated answer with citations
│   │   │   ├── Citations.tsx   # Cited source chunks
│   │   │   ├── FollowUp.tsx    # Follow-up prompt
│   │   │   ├── KnowledgeGraph.tsx # Knowledge graph visualization
│   │   │   ├── GraphNode.tsx   # Individual node component
│   │   │   ├── RelatedDocs.tsx # Related document snippets
│   │   │   └── ShareResults.tsx # Share/export options
│   │   │
│   │   ├── analytics/          # Dashboards (Manager primary)
│   │   │   ├── AnalyticsDashboard.tsx # Main analytics page
│   │   │   ├── QueryTrendChart.tsx # Line chart for query volume
│   │   │   ├── TopicsChart.tsx # Bar chart for topics
│   │   │   ├── SuccessRateGauge.tsx # Gauge chart
│   │   │   ├── KnowledgeHealthDashboard.tsx # Health metrics
│   │   │   ├── DependencyTracker.tsx # Breaking changes table
│   │   │   ├── EscalationTable.tsx # Unresolved queries
│   │   │   ├── TeamSettings.tsx # Team configuration
│   │   │   └── AnalyticsExport.tsx # Export reports
│   │   │
│   │   ├── admin/              # Admin UI (Admin primary)
│   │   │   ├── AdminDashboard.tsx # Main admin page
│   │   │   ├── SystemHealth.tsx # Health status cards
│   │   │   ├── DataSourceManager.tsx # Add/edit sources
│   │   │   ├── DataSourceForm.tsx # Source configuration wizard
│   │   │   ├── UserManager.tsx # User list + invite
│   │   │   ├── RBACEditor.tsx  # RBAC policy editor
│   │   │   ├── APIKeyManager.tsx # Generate/revoke keys
│   │   │   └── SystemLogs.tsx  # View logs + alerts
│   │   │
│   │   └── auth/               # Authentication UI
│   │       ├── LoginPage.tsx   # Login form (SSO + fallback)
│   │       ├── SSORedirect.tsx # OAuth callback handler
│   │       └── ProtectedRoute.tsx # Route guard
│   │
│   ├── pages/                  # Route pages (using TanStack Router)
│   │   ├── Home.tsx            # Dashboard home
│   │   ├── QueryPage.tsx       # Query results page
│   │   ├── AnalyticsPage.tsx   # Analytics dashboards
│   │   ├── AdminPage.tsx       # Admin dashboards
│   │   ├── WorkspacePage.tsx   # Personal/team workspace
│   │   ├── NotFoundPage.tsx    # 404 page
│   │   └── ErrorPage.tsx       # Error boundary
│   │
│   ├── hooks/                  # Custom React hooks
│   │   ├── useSSEStream.ts     # SSE consumer for POST /agent/query; manages fetch + ReadableStream parsing
│   │   ├── useGraphStream.ts   # WebSocket consumer for WS /graph/stream; feeds Force-Graph 2D progressively
│   │   ├── useNotifications.ts # WebSocket consumer for WS /ws system notifications (future)
│   │   ├── useAnalytics.ts     # Fetch analytics data
│   │   ├── useAuth.ts          # Authentication state
│   │   ├── useTheme.ts         # Dark mode toggle
│   │   ├── useLocalStorage.ts  # Persist state to localStorage
│   │   ├── usePagination.ts    # Pagination logic
│   │   └── useDebounce.ts      # Debounce search input
│   │
│   ├── stores/                 # Zustand state management
│   │   ├── authStore.ts        # User + auth state
│   │   ├── uiStore.ts          # UI state (theme, sidebar open, etc.)
│   │   ├── filterStore.ts      # Dashboard filters
│   │   └── workspaceStore.ts   # Workspace selections
│   │
│   ├── lib/
│   │   ├── api.ts              # TanStack Query setup + HTTP client
│   │   ├── http.ts             # HTTP client wrapper (JWT refresh)
│   │   ├── auth.ts             # JWT helpers, localStorage auth
│   │   ├── websocket.ts        # WebSocket manager for alerts
│   │   ├── utils.ts            # General utilities (debounce, etc.)
│   │   ├── validators.ts       # Input validation (Zod)
│   │   ├── constants.ts        # App-wide constants
│   │   ├── error-handler.ts    # Centralized error handling
│   │   └── date.ts             # Date formatting helpers
│   │
│   ├── types/
│   │   ├── index.ts            # Re-export all types
│   │   ├── api.ts              # API response types
│   │   ├── user.ts             # User + auth types
│   │   ├── query.ts            # Query + results types
│   │   ├── analytics.ts        # Analytics types
│   │   ├── components.ts       # Component prop types
│   │   └── errors.ts           # Error types
│   │
│   ├── styles/
│   │   ├── globals.css         # Global styles + Tailwind imports
│   │   ├── design-tokens.css   # Design tokens (terracotta, white, dark mode)
│   │   ├── animations.css      # Custom animations (optional)
│   │   └── responsive.css      # Responsive utility classes
│   │
│   └── config/
│       ├── routes.ts           # TanStack Router configuration
│       ├── env.ts              # Environment variables + validation
│       └── queryClient.ts      # TanStack Query client config
│
├── public/                     # Static assets
│   ├── logo.svg                # Logo
│   ├── favicon.ico             # Favicon
│   └── assets/                 # Images, icons
│
├── tests/
│   ├── __mocks__/              # Mock data + API responses
│   ├── components/             # Component tests (Vitest + RTL)
│   ├── hooks/                  # Hook tests
│   ├── utils/                  # Utility tests
│   └── setup.ts                # Vitest + RTL setup
│
├── .eslintrc.json              # ESLint config
├── .prettierrc                 # Prettier config
└── README.md                   # Frontend development guide

Frontend Design Decisions

  1. Vite + React 18: Fast dev, instant HMR, minimal config. No SSR needed for SPA.
  2. TanStack Router: Fully typed routing; better DX than React Router v6.
  3. TanStack Query: Server state management with automatic caching/refetching.
  4. Zustand: Lightweight client state (theme, UI, filters); no Redux boilerplate.
  5. shadcn/ui + Tailwind: Copy-paste components, full control, design tokens system.
  6. Responsive Design: Mobile (320px), Tablet (768px), Desktop (1024px+).
  7. WebSocket: Native API for real-time alerts; no Socket.io overhead.
  8. JWT + httpOnly Cookies: Secure auth; backend validates on every request.

API Contract

Authentication

POST /api/auth/login
├─ Request: { email, password } or { sso_provider, sso_token }
├─ Response: { access_token, refresh_token, user: { id, email, role, team_id } }
├─ Sets httpOnly cookie: __auth_token
└─ Bearer token in Authorization header for all subsequent requests

POST /api/auth/refresh
├─ Request: { refresh_token }
├─ Response: { access_token }
└─ Auto-called by frontend before token expires

POST /api/auth/logout
├─ Clears httpOnly cookie
└─ Backend invalidates refresh token in Redis

Agent Webhook Endpoints (Per-Source)

POST /webhooks/jira
├─ Validates Jira webhook signature (X-Atlassian-Webhook-Signature)
├─ Extracts issue_created, issue_updated, comment_created events
├─ Routes to jira_process_issue Celery task (queue=critical)
└─ Returns immediately (202 Accepted)

POST /webhooks/confluence
├─ Validates Confluence webhook signature
├─ Extracts page_created, page_updated, page_trashed events
├─ Routes to confluence_process_page Celery task (queue=critical)
└─ Returns immediately (202 Accepted)

POST /files/upload
├─ Accepts multipart/form-data with file + team_id
├─ Routes to file_process_upload Celery task (queue=critical)
├─ Returns file_id immediately; processing async
└─ Frontend polls /files/{file_id} for status

POST /jira/sync/{project_key}
├─ Manual trigger; requires admin role
├─ Routes to jira_sync_project Celery task (queue=polling)
└─ Returns job_id for polling

POST /confluence/sync/{space_key}
├─ Manual trigger; requires admin role
├─ Routes to confluence_sync_space Celery task (queue=polling)
└─ Returns job_id for polling
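
The signature validation step for the webhook endpoints can be sketched as below. This assumes the header value has the form `sha256=<hex HMAC-SHA256 of the raw body>`, which matches the Jira test script in this document; the function names are illustrative, and the real handler may differ:

```python
import hashlib
import hmac


def sign_body(secret: str, body: bytes) -> str:
    """Compute the expected header value for a raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Constant-time comparison of the received header against the expected one."""
    expected = sign_body(secret, body)
    # Compare as bytes so a non-ASCII header cannot raise TypeError.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

The check has to run against the raw request bytes, before any JSON parsing, because re-serialising the payload can change whitespace and invalidate the digest.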

Query API (Streaming SSE)

POST /agent/query
├─ Request: { query: string, team_id: string, session_id: string }
├─ Response: Content-Type: text/event-stream
│  ├─ event: plan_ready        → { tasks: [AgentTask], reasoning: string }
│  ├─ event: agent_started     → { agent: "doc_search"|"ticket_lookup"|"live_docs"|"sql_query"|"summariser" }
│  ├─ event: agent_done        → { agent: string, chunks: [RetrievedChunk], confidence: "high"|"medium"|"low" }
│  ├─ event: synthesis_started → {}
│  ├─ event: answer_chunk      → { chunk: string }   (repeats, one per token)
│  ├─ event: guardrail_result  → { score: float, escalate: bool }
│  ├─ event: done              → {}
│  └─ event: error             → { message: string }
└─ Headers: Cache-Control: no-cache, X-Accel-Buffering: no

POST /api/query/{query_id}/feedback
├─ Request: { sentiment: "helpful"|"not_helpful"|"hallucinated", text?: string }
└─ Response: { success: true }
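
A client consuming the /agent/query stream has to split the response into events and act on each event name. The sketch below parses raw SSE lines and concatenates `answer_chunk` payloads; it assumes every `data:` field is a JSON object as listed above, and the function names are illustrative:

```python
import json
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from raw SSE lines."""
    event, data_parts = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event, json.loads("\n".join(data_parts))
            event, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].lstrip(" "))


def collect_answer(lines: Iterable[str]) -> str:
    """Concatenate answer_chunk payloads until `done`; raise on `error`."""
    parts: List[str] = []
    for event, payload in parse_sse(lines):
        if event == "answer_chunk":
            parts.append(payload["chunk"])
        elif event == "error":
            raise RuntimeError(payload["message"])
        elif event == "done":
            break
    return "".join(parts)
```

In practice `lines` would be the line iterator of a streaming HTTP response; a list of strings works the same way for local testing.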

Knowledge Graph API

GET /graph/nodes?limit=50
└─ Response: { count: int, nodes: [{ label: string, name: string }] }
   (excludes Chunk and Document nodes; returns Service/Library/Incident/Team only)

POST /graph/ingest
├─ Request: { chunk_ids: [string], team_id: string }
└─ Response: { ingested: int }
   (fetches chunks from Supabase, runs Gemini extraction, upserts to Neo4j)

GET /graph/traverse?type=incident|service|library&name=string&team_id=string
└─ Response: { type, name, team_id, chunks: [string] }
   (multi-hop Cypher traversal; returns text chunks for context augmentation)

WS /graph/stream
└─ Streams: node events, edge events, then done event (see Real-Time API below)

Analytics API

GET /api/analytics/queries?date_range=30d&team_id=...
├─ Response: {
│    query_count: 1243,
│    unique_users: 243,
│    avg_response_time_ms: 1200,
│    success_rate: 0.76,
│    trend: { data: [{date, count}] }
│  }

GET /api/analytics/knowledge-health
├─ Response: {
│    overall_score: 7.2,
│    coverage: 0.68,
│    freshness: 0.82,
│    accuracy: 0.76,
│    accessibility: 0.71,
│    gaps: [{ topic: "ORM patterns", queries: 12, solutions: 0 }]
│  }

GET /api/analytics/dependencies
├─ Response: {
│    dependencies: [{name, current_version, latest_version, breaking_changes}],
│    alerts: 3
│  }

Admin API

POST /api/admin/sources
├─ Request: { type, config, rbac_level }
├─ Response: { id, status, test_result }
└─ Triggers background sync

GET /api/admin/sources
├─ Response: [{ id, type, status, last_sync, record_count }]

PATCH /api/admin/sources/{id}
├─ Request: { name, config, rbac_level }
├─ Response: { updated_source }

DELETE /api/admin/sources/{id}
├─ Soft delete; preserves audit trail

---

POST /api/admin/users/invite
├─ Request: { emails: ["alice@..."], role, team_id }
├─ Response: { invitations: [{ email, invitation_id, expires_at }] }
└─ Sends email invite

GET /api/admin/users
├─ Response: [{ id, email, role, team_id, status, created_at }]

DELETE /api/admin/users/{user_id}
├─ Deactivates user (no hard delete for compliance)

---

POST /api/admin/rbac
├─ Request: { name, description, teams, sources, filters }
├─ Response: { id, policy }
└─ Returns doc count matching policy

GET /api/admin/rbac
├─ Response: [{ id, name, doc_count }]

PATCH /api/admin/rbac/{id}
├─ Update existing policy

---

POST /api/admin/api-keys
├─ Request: { name, permissions, rate_limits, expiry }
├─ Response: { key: "sk_...", created_at }
└─ Only returned once

GET /api/admin/api-keys
├─ Response: [{ name, created_at, last_used, permissions }]
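
The "only returned once" behaviour of POST /api/admin/api-keys is commonly achieved by persisting only a hash of the key. The sketch below illustrates that pattern under the assumption that this is how keys are stored; the source does not describe the storage scheme, and the function names are hypothetical:

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def create_api_key() -> Tuple[str, str]:
    """Return (plaintext_key, sha256_hex_digest).

    The plaintext is shown to the admin once; only the digest is persisted.
    """
    key = "sk_" + secrets.token_urlsafe(32)
    return key, hashlib.sha256(key.encode("utf-8")).hexdigest()


def key_matches(presented: str, stored_digest: str) -> bool:
    """Check a presented key against the stored digest in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

Because the plaintext is never stored, GET /api/admin/api-keys can list only metadata, which is consistent with the response shape above.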

Bash Development Testing

Use these instead of Swagger UI when you need to test streaming behaviour from the terminal.

Test SSE query stream (replaces Swagger, which can't stream SSE):

#!/usr/bin/env bash
# test_query.sh: prints the SSE response to stdout line by line as events arrive

BASE_URL="${GODSPEED_API:-http://localhost:8000}"

curl -N -s \
  -X POST "${BASE_URL}/agent/query" \
  -H "Content-Type: application/json" \
  -d '{"query":"What is the auth service?","team_id":"team-1","session_id":"test-001"}' \
| while IFS= read -r line; do
    echo "$line"
  done

Test graph REST endpoints:

BASE_URL="${GODSPEED_API:-http://localhost:8000}"

# List all graph nodes
curl -s "${BASE_URL}/graph/nodes?limit=20" | python3 -m json.tool

# Traverse from a service
curl -s "${BASE_URL}/graph/traverse?type=service&name=auth-service&team_id=team-1" \
  | python3 -m json.tool

# Ingest chunks into graph
curl -s -X POST "${BASE_URL}/graph/ingest" \
  -H "Content-Type: application/json" \
  -d '{"chunk_ids":["chunk-abc123"],"team_id":"team-1"}' \
  | python3 -m json.tool

Test WebSocket graph stream (requires wscat; install with npm i -g wscat):

BASE_URL="${GODSPEED_WS:-ws://localhost:8000}"
wscat -c "${BASE_URL}/graph/stream"
# Prints node/edge/done events as they arrive

Test Jira webhook signature (bash + openssl):

BASE_URL="${GODSPEED_API:-http://localhost:8000}"
BODY='{"webhookEvent":"jira:issue_created","issue":{"id":"TEST-1","fields":{"summary":"Auth service down"}}}'
SECRET="your_jira_webhook_secret"
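# printf avoids the non-portable `echo -n`; $NF takes the hex digest whatever prefix openssl prints
SIG="sha256=$(printf '%s' "${BODY}" | openssl dgst -sha256 -hmac "${SECRET}" | awk '{print $NF}')"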

curl -s -X POST "${BASE_URL}/webhooks/jira" \
  -H "Content-Type: application/json" \
  -H "X-Atlassian-Webhook-Signature: ${SIG}" \
  -d "${BODY}"

Test file upload:

BASE_URL="${GODSPEED_API:-http://localhost:8000}"
curl -s -X POST "${BASE_URL}/files/upload" \
  -F "file=@/path/to/doc.pdf" \
  -F "team_id=team-1"

Real-Time API

There are three distinct real-time channels (two implemented, one planned); do not conflate them:

Channel 1: Query streaming (SSE)

POST /agent/query   →   Content-Type: text/event-stream

Emits events in order:
  event: plan_ready        data: { tasks: [...], reasoning: "..." }
  event: agent_started     data: { agent: "doc_search" }
  event: agent_done        data: { agent: "doc_search", chunks: [...], confidence: "high" }
  event: synthesis_started data: {}
  event: answer_chunk      data: { chunk: "token text" }   ← repeats per token
  event: guardrail_result  data: { score: 0.92, escalate: false }
  event: done              data: {}
  event: error             data: { message: "..." }        ← on failure

Request body: { query: string, team_id: string, session_id: string }

Channel 2: Knowledge graph visualization (WebSocket)

WS /graph/stream

Emits in order (50ms delay between each):
  { event: "node", id: "...", label: "Service", name: "auth-service" }
  { event: "edge", from: "...", to: "...", rel: "DEPENDS_ON" }
  ...
  { event: "done", nodes_count: 42, edges_count: 87 }
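
On the client side, the node/edge/done messages from /graph/stream can be folded into an in-memory graph. A minimal sketch, assuming each WebSocket frame is one JSON object with the fields shown above; `build_graph` is an illustrative name, and the count check only applies when the `done` message carries counts:

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_graph(messages: Iterable[str]) -> Tuple[Dict[str, dict], Dict[str, List[Tuple[str, str]]], bool]:
    """Fold node/edge/done frames into (nodes, adjacency, complete)."""
    nodes: Dict[str, dict] = {}
    adjacency: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    complete = False
    for raw in messages:
        msg = json.loads(raw)
        kind = msg["event"]
        if kind == "node":
            nodes[msg["id"]] = {"label": msg["label"], "name": msg["name"]}
        elif kind == "edge":
            adjacency[msg["from"]].append((msg["rel"], msg["to"]))
        elif kind == "done":
            # If counts are present, use them to detect dropped frames.
            edge_total = sum(len(v) for v in adjacency.values())
            complete = (msg.get("nodes_count", len(nodes)) == len(nodes)
                        and msg.get("edges_count", edge_total) == edge_total)
            break
    return nodes, dict(adjacency), complete
```

A renderer could draw each node as it arrives and use the returned `complete` flag to decide whether to request the stream again.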

Channel 3: System notifications (WebSocket)

WS /ws   (future; not yet implemented)

Will emit:
  event: "query_answered"  → { query_id, new_docs_count }
  event: "escalation_spike" → { topic, spike_rate }         (manager-only)
  event: "breaking_change"  → { dependency, version, url }  (admin-only)
  event: "data_sync_failed" → { source, error }             (admin-only)
  event: "knowledge_gap"    → { topic, query_count }        (all users)

Data Flow

Flow 1: Engineer Query → Answer

1. Engineer types query in SearchBox
   ├─ frontend sends POST /agent/query { query, team_id, session_id }
   └─ frontend simultaneously opens WS /graph/stream for parallel graph rendering

2. Backend receives the query and responds with an SSE stream
   ├─ LangGraph planner breaks query into AgentTask list → emits plan_ready
   ├─ Each agent runs (doc_search / ticket_lookup / live_docs) → emits agent_started + agent_done
   ├─ doc_search: BGE-M3 embed → Qdrant hybrid search (dense+sparse RRF) → top 50 → BGE reranker → top 5
   ├─ Synthesiser streams answer tokens → emits answer_chunk per token
   └─ Guardrail validates answer against source chunks → emits guardrail_result

3. Guardrail result
   ├─ guardrail_passed=true → done event
   ├─ guardrail_passed=false + escalate=true → warning banner shown in frontend
   └─ Citations come from agent_done chunks (already streamed in step 2)

4. Frontend connects to graph stream (parallel to query SSE)
   ├─ WS /graph/stream streams the pre-built Neo4j graph (query-scoped subgraph)
   ├─ Nodes arrive one-by-one with 50ms delays: { event:"node", label, name }
   ├─ Edges arrive after nodes: { event:"edge", from, to, rel }
   └─ { event:"done" } signals completion
   Note: The knowledge graph is pre-built at ingestion time by Gemini 2.5 Pro
   (graph_store/extractor.py), not extracted from the answer at query time.

5. Frontend receives stream
   ├─ Displays answer tokens as they arrive (no waiting for the full answer)
   ├─ Renders citations as they arrive
   ├─ Knowledge graph appears once the WebSocket connection is established
   ├─ Related docs populate as backend fetches
   └─ Full page interactive once final "done" event received

6. Feedback recorded
   ├─ Engineer clicks thumbs up/down
   ├─ Frontend POSTs /api/query/{id}/feedback
   ├─ Backend records sentiment + triggers analytics update
   └─ Feedback visible in query history + aggregated for managers
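
Step 2 of this flow fuses dense and sparse results with RRF (reciprocal rank fusion) before reranking. Qdrant performs the fusion server-side, so the sketch below only illustrates the scoring rule: each document scores the sum of 1 / (k + rank) over the lists it appears in. The constant k = 60 is the conventional default and is an assumption here, as is the function name:

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60, top_n: int = 50) -> List[str]:
    """Fuse several ranked lists of document IDs with reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top_n]


dense_hits = ["a", "b", "c"]   # illustrative IDs from the dense search
sparse_hits = ["b", "d", "a"]  # illustrative IDs from the sparse search
candidates = rrf_fuse([dense_hits, sparse_hits])
```

Documents ranked well in both lists rise to the top, which is why "b" and "a" lead `candidates` ahead of documents that appear in only one list. The fused top 50 would then go to the reranker, which keeps the top 5.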

Flow 2: Data Ingestion (Daily/Polling)

1. Ingestion task triggered
   ├─ Webhook from source (e.g., Notion) OR Celery periodic task

2. Fetch stage
   ├─ Adapter queries source API
   ├─ Detects new/updated items (via timestamps or ETags)
   ├─ Downloads content

3. Normalize stage (Docling)
   ├─ Converts PDF/HTML/markdown to clean markdown
   ├─ Extracts tables as markdown tables
   ├─ Detects code blocks + language

4. PII Mask stage (GLiNER, local)
   ├─ Scans text for PII (names, emails, IDs, etc.)
   ├─ Replaces PII with placeholders (e.g., [REDACTED_EMAIL])
   ├─ Logs redaction for audit trail

5. Chunk stage (Semantic)
   ├─ Splits by paragraph/sentence boundaries
   ├─ Never splits code blocks or lists
   ├─ 15% overlap between chunks
   ├─ 256–512 tokens per chunk

6. Tag stage (Metadata)
   ├─ Adds source_uri, source_type, ingested_at
   ├─ Adds RBAC tag (public / team / restricted)
   ├─ Computes content_hash (for change detection)
   ├─ Detects doc_type (SOP, API doc, PR, etc.)

7. Embed stage
   ├─ Sends chunks to BGE-M3
   ├─ Gets 1024-dim dense vectors
   ├─ Extracts sparse BM25-like vectors

8. Index stage
   ├─ Uploads dense vectors to Qdrant HNSW index
   ├─ Uploads sparse vectors to Qdrant sparse index
   ├─ Upserts metadata (PostgreSQL)
   ├─ Updates Redis cache (last_sync_timestamp)

9. Complete
   ├─ Backend records sync success in PostgreSQL
   ├─ Triggers webhook for frontend real-time update
   └─ Notifies admins if errors
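
The overlap rule in the chunk stage and the `content_hash` in the tag stage can be illustrated with a simplified sketch. It uses a fixed-size sliding window over whitespace tokens, not the boundary-aware semantic splitter described above, and SHA-256 over whitespace-normalised text as the hash; both choices and all names are assumptions made for illustration:

```python
import hashlib
from typing import List


def chunk_tokens(tokens: List[str], max_tokens: int = 512, overlap_ratio: float = 0.15) -> List[List[str]]:
    """Sliding-window chunks; each chunk repeats the tail of the previous one."""
    overlap = int(max_tokens * overlap_ratio)
    step = max_tokens - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


def content_hash(text: str) -> str:
    """Stable hash for change detection; whitespace-only edits do not change it."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

With the defaults, consecutive chunks share 76 tokens (15% of 512, rounded down). Comparing a freshly computed `content_hash` against the stored one lets the pipeline skip re-embedding unchanged content.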

Flow 3: Manager Views Analytics

1. Manager navigates to /analytics

2. Frontend loads analytics data
   ├─ GET /api/analytics/queries?date_range=30d
   ├─ GET /api/analytics/knowledge-health
   ├─ GET /api/analytics/dependencies

3. Backend aggregates from event logs
   ├─ Queries PostgreSQL (query_events table)
   ├─ Aggregates by date, team, topic
   ├─ Computes trends, success rates
   ├─ Identifies gaps from failed queries

4. Frontend renders dashboards
   ├─ Query trends (Recharts line chart)
   ├─ Topics (bar chart)
   ├─ Success rate (gauge)
   ├─ Escalations table
   ├─ Knowledge health heatmap

5. Optional: Manager exports report
   ├─ Frontend POSTs /api/analytics/export?format=pdf
   ├─ Backend generates PDF via ReportLab
   ├─ Streams PDF download to browser
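
The aggregation in step 3 can be sketched in plain Python. The real backend presumably does this in SQL against `query_events`; the row fields used here (`user_id`, `topic`, `success`) are hypothetical, since the table schema is not given in this document:

```python
from collections import Counter
from typing import Iterable


def summarize(events: Iterable[dict]) -> dict:
    """Compute query count, unique users, success rate and gap topics."""
    rows = list(events)
    total = len(rows)
    succeeded = sum(1 for row in rows if row["success"])
    # Topics with failed queries are treated as candidate knowledge gaps.
    failed_topics = Counter(row["topic"] for row in rows if not row["success"])
    return {
        "query_count": total,
        "unique_users": len({row["user_id"] for row in rows}),
        "success_rate": round(succeeded / total, 2) if total else 0.0,
        "gaps": [{"topic": topic, "queries": count}
                 for topic, count in failed_topics.most_common()],
    }
```

The output keys mirror a subset of the /api/analytics/queries and /api/analytics/knowledge-health responses.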

Deployment Architecture

Development (docker-compose)

services:
  postgres:
    image: postgres:15
    volumes: [./data/postgres:/var/lib/postgresql/data]
    ports: [5432:5432]

  redis:
    image: redis:7-alpine
    ports: [6379:6379]

  qdrant:
    image: qdrant/qdrant:latest
    volumes: [./data/qdrant:/qdrant/storage]
    ports: [6333:6333]

  backend:
    build: ./backend
    ports: [8000:8000]
    depends_on: [postgres, redis, qdrant]
    environment:
      SQLALCHEMY_DATABASE_URL: postgresql://user:pass@postgres:5432/godspeed
      REDIS_URL: redis://redis:6379
      QDRANT_URL: http://qdrant:6333

  frontend:
    build: ./frontend
    ports: [3000:3000]
    depends_on: [backend]
    environment:
      VITE_API_BASE_URL: http://localhost:8000

  neo4j:
    image: neo4j:5
    ports: ["7474:7474", "7687:7687"]
    volumes: [./data/neo4j:/data]
    environment:
      NEO4J_AUTH: neo4j/godspeed_dev
      NEO4J_PLUGINS: '["apoc"]'

  celery:
    build: ./backend
    command: celery -A src.celery_app worker -Q critical,default,polling -l info
    depends_on: [postgres, redis, qdrant, neo4j]
    environment:
      SQLALCHEMY_DATABASE_URL: postgresql://user:pass@postgres:5432/godspeed
      REDIS_URL: redis://redis:6379
      NEO4J_URI: bolt://neo4j:7687
      NEO4J_USERNAME: neo4j
      NEO4J_PASSWORD: godspeed_dev
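
The backend service in the compose file receives its connection strings through environment variables. A minimal sketch of reading them at startup and failing fast when one is missing; the `Settings` class and `load_settings` function are illustrative and not taken from the codebase:

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("SQLALCHEMY_DATABASE_URL", "REDIS_URL", "QDRANT_URL")


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    qdrant_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, reporting every missing variable at once."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["SQLALCHEMY_DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        qdrant_url=env["QDRANT_URL"],
    )
```

Passing a plain dict as `env` makes the loader easy to test without touching the process environment.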

Production (Kubernetes)

# Deployments
- backend (FastAPI, 3 replicas, HPA)
- frontend (Nginx, 2 replicas, CDN)
- celery-worker (5 replicas, autoscaling on queue depth)

# StatefulSets
- postgres (with backup via S3)
- redis (cluster mode)
- qdrant (with persistence)

# Services
- backend-svc (ClusterIP)
- frontend-svc (LoadBalancer)
- postgres-svc (ClusterIP)
- redis-svc (ClusterIP)
- qdrant-svc (ClusterIP)

# ConfigMaps & Secrets
- app-config (env vars)
- api-keys (AWS S3, Notion OAuth, etc.)
- tls-certs (HTTPS)

# Ingress
- Routes /api/* to backend
- Routes /* to frontend
- TLS termination

Self-Hosted (Single Server)

nginx (reverse proxy, static frontend)
  ├─ localhost:8000 (FastAPI backend)
  ├─ localhost:5432 (PostgreSQL)
  ├─ localhost:6379 (Redis)
  └─ localhost:6333 (Qdrant)

All services in systemd or Docker containers
Automated backups via Cron + S3
Monitoring via Prometheus + Grafana (optional)

Key Architectural Principles

  1. Separation of Concerns: Each layer (adapter, ingestion, retrieval, agent, API) has one responsibility.
  2. Stateless Backend: FastAPI scales horizontally; state lives in PostgreSQL/Redis.
  3. Async Everywhere: Celery for long-running tasks; FastAPI with asyncio for I/O.
  4. RBAC First: All queries filtered by user's team/permissions at retrieval time.
  5. Streaming Results: Don't wait for complete answer; stream chunks to frontend progressively.
  6. Local PII: GLiNER runs on-premises; zero data egress for compliance.
  7. Cacheable at Every Layer: Embeddings cached, searches cached, answers cached (with refresh policy).
  8. Observable: Structured logging, metrics, and traces (OpenTelemetry in phase 2).
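
Principle 4 (RBAC First) can be illustrated as a predicate over the RBAC tags applied at ingestion (public / team / restricted). This is a sketch only: in practice the filter would more likely be pushed into the vector search query than applied in Python, and the meaning given to "restricted" here (same team plus an explicit grant) is an assumption, as are all names:

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class User:
    team_id: str
    can_view_restricted: bool = False  # hypothetical per-user grant


def visible_chunks(chunks: Iterable[dict], user: User) -> List[dict]:
    """Keep only the chunks the user is allowed to retrieve."""
    allowed = []
    for chunk in chunks:
        level = chunk["rbac"]
        same_team = chunk.get("team_id") == user.team_id
        if level == "public":
            allowed.append(chunk)
        elif level == "team" and same_team:
            allowed.append(chunk)
        elif level == "restricted" and same_team and user.can_view_restricted:
            allowed.append(chunk)
    return allowed
```

Unknown tag values fall through every branch and are dropped, so the filter fails closed.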