# Complete System Architecture
> **Document purpose:** System-wide architecture covering backend, frontend, database, deployment, and all component interactions. Read this to understand how all pieces fit together.
---
## Table of Contents
1. [High-Level System Diagram](#high-level-system-diagram)
2. [Backend Architecture (src/)](#backend-architecture-src)
3. [Frontend Architecture (frontend/)](#frontend-architecture-frontend)
4. [API Contract](#api-contract)
5. [Data Flow](#data-flow)
6. [Deployment Architecture](#deployment-architecture)
---
## High-Level System Diagram
```
┌──────────────────────────────────────────────────────────────────────────────┐
│                            EXTERNAL DATA SOURCES                             │
│ ┌────────┐ ┌────────────┐ ┌────────┐ ┌───────┐ ┌──────┐ ┌──────────────────┐ │
│ │ Notion │ │ Confluence │ │ GitHub │ │ Slack │ │ Jira │ │ URLs + Firecrawl │ │
│ └────────┘ └────────────┘ └────────┘ └───────┘ └──────┘ └──────────────────┘ │
└──────────────────────────────────────┬───────────────────────────────────────┘
                                       │ (Webhooks + Polling)
                                       ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                           BACKEND (Python/FastAPI)                           │
│ ┌────────────────┐     ┌──────────────────┐     ┌──────────────────────────┐ │
│ │ Data Ingestion │     │ RAG + Retrieval  │     │ Analytics & Intelligence │ │
│ │ ├─ Adapters    │     │ ├─ Hybrid search │     │ ├─ Query events          │ │
│ │ ├─ Docling     │     │ ├─ BGE-M3        │     │ ├─ Knowledge graph       │ │
│ │ ├─ GLiNER PII  │     │ ├─ Qdrant        │     │ └─ Anomaly detection     │ │
│ │ └─ Chunking    │     │ └─ LLM agents    │     └──────────────────────────┘ │
│ └────────────────┘     └──────────────────┘                                  │
│         ▲                       ▲                            ▲               │
│         │                       │                            │               │
│ ┌───────┴───────────────────────┴────────────────────────────┴─────────────┐ │
│ │ FastAPI Backend (Uvicorn)                                                │ │
│ │ ├─ /api/query/*      (search + follow-up)                                │ │
│ │ ├─ /api/analytics/*  (dashboards)                                        │ │
│ │ ├─ /api/admin/*      (data source management)                            │ │
│ │ └─ /ws               (WebSocket for real-time alerts)                    │ │
│ └───────┬──────────────────────────────────────────────────────────────────┘ │
│         │                                                                    │
│ ┌───────▼──────────────────────────────────────────────────────────────────┐ │
│ │ Data Layer (PostgreSQL, Qdrant, Neo4j, Redis, S3)                        │ │
│ │ ├─ PostgreSQL: Metadata, RBAC, audit trails, queries                     │ │
│ │ ├─ Qdrant: Vector embeddings (dense + sparse)                            │ │
│ │ ├─ Neo4j: Knowledge graph (Service/Library/Incident/Team)                │ │
│ │ ├─ Redis: Cache, session state, pub/sub, task queues                     │ │
│ │ └─ S3: PDFs, user uploads, exports                                       │ │
│ └───────┬──────────────────────────────────────────────────────────────────┘ │
└─────────┼────────────────────────────────────────────────────────────────────┘
          │
          │ (REST API + WebSocket)
          ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                         FRONTEND (React/TypeScript)                          │
│ ┌──────────────────────┐   ┌───────────────────┐   ┌───────────────────────┐ │
│ │ Query Interface      │   │ Dashboards        │   │ Admin UI              │ │
│ │ ├─ Search box        │   │ ├─ Query trends   │   │ ├─ Data source mgmt   │ │
│ │ ├─ Results display   │   │ ├─ Knowledge      │   │ ├─ User management    │ │
│ │ ├─ Citations         │   │ │  health         │   │ ├─ RBAC editor        │ │
│ │ ├─ Follow-ups        │   │ ├─ Dependencies   │   │ ├─ API keys           │ │
│ │ └─ Knowledge graph   │   │ └─ Alerts         │   │ └─ System health      │ │
│ └──────────────────────┘   └───────────────────┘   └───────────────────────┘ │
│ ┌──────────────────────────────────────────────────────────────────────────┐ │
│ │ Component Layer (shadcn/ui + Tailwind)                                   │ │
│ │ ├─ Query & Search components                                             │ │
│ │ ├─ Chart & data table components (Recharts, TanStack Table)              │ │
│ │ ├─ Knowledge graph visualizer (Force-Graph)                              │ │
│ │ ├─ Authentication flow (JWT)                                             │ │
│ │ └─ Real-time notifications (WebSocket)                                   │ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────────────────────────────────┐ │
│ │ State Management (TanStack Query + Zustand)                              │ │
│ │ ├─ Server state: Queries, analytics, user data (TanStack Query)          │ │
│ │ └─ Client state: UI state, theme, filters (Zustand)                      │ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────────┘
```
---
## Backend Architecture (src/): Agent-Based Design
### Core Principle: Per-Source Agents
Rather than generic adapters flowing through a single pipeline, each data source is an **independent agent** with:
- Source-specific authentication & adapters
- Source-optimized chunking (preserves context like Confluence breadcrumbs, Jira comment threading)
- Independent Celery tasks (different polling cadences, priorities)
- Independent FastAPI routers (explicit webhooks like `/webhooks/jira`)
- Self-contained testing (`test_run.py` per agent)
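Because each agent owns its adapter, chunker, tasks, and routes, this design gives **scalability by source** (one source can be scaled or rescheduled without touching the others), **operational clarity** (a failure is attributable to one agent), and **production-grade maintainability**.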
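As a rough illustration of source-optimized chunking, the sketch below follows the layout this document gives for the Jira chunker: chunk 0 is the issue body and chunks 1..N are the comments in thread order. The `Chunk` dataclass, the function signature, and the bracketed prefix format are assumptions made for the example; they are not taken from the repository.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One retrievable unit. Field names are illustrative assumptions."""
    issue_key: str
    index: int   # 0 = issue body, 1..N = comments in thread order
    kind: str    # "body" or "comment"
    text: str


def chunk_jira_issue(issue_key: str, summary: str, description: str,
                     comments: list[str]) -> list[Chunk]:
    """Split an issue so each comment stays addressable on its own."""
    body = f"[{issue_key}] {summary}\n\n{description}".strip()
    chunks = [Chunk(issue_key, 0, "body", body)]
    total = len(comments)
    for position, comment in enumerate(comments, start=1):
        # The prefix keeps the thread position visible to later extraction steps.
        text = f"[{issue_key} comment {position}/{total}] {comment}"
        chunks.append(Chunk(issue_key, position, "comment", text))
    return chunks


if __name__ == "__main__":
    for chunk in chunk_jira_issue("AUTH-1", "Login fails", "500 on /login",
                                  ["Reproduced on staging", "Fixed in 1.2"]):
        print(chunk.index, chunk.kind, chunk.text.splitlines()[0])
```

Keeping the issue key and comment position inside the chunk text means a retrieved comment can still be traced back to its thread even when it is returned without its neighbours.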
### Directory Structure
> **Note:** The actual repo layout diverges from early plans. The implemented structure is below. `src/query_engine/` and `src/retrieval/` referenced in earlier design docs do not exist; that logic lives in `agent/`. Graph endpoints live in `graph_store/`, not `src/api/graph.py`.
```
agent/  # LangGraph multi-agent query engine (IMPLEMENTED)
├── api.py  # POST /agent/query: SSE streaming endpoint
├── graph.py  # LangGraph build: planner → [doc_search|ticket_lookup|live_docs|sql_query] → join → synthesiser → guardrail
├── models.py  # KnowledgeGraphState, QueryInput, ExecutionPlan, AgentResult, RetrievedChunk
├── config.py  # LLM + agent config
├── prompts.py  # Prompt templates
├── agents/
│   ├── planner.py  # Breaks query into AgentTask list
│   ├── synthesiser.py  # Streams answer tokens from top chunks
│   ├── guardrail.py  # Validates answer against sources; sets escalate flag
│   └── _gemini.py  # Gemini client helper (used in planner/synthesiser)
└── tools/
    ├── doc_search.py  # Qdrant hybrid dense+sparse search
    ├── ticket_lookup.py  # Jira-specific retrieval
    ├── live_docs.py  # Firecrawl real-time doc fetching
    ├── sql_query.py  # NL-to-SQL: translates query → validated SELECT → asyncpg execution
    └── summariser.py  # Context compression before synthesis
graph_store/  # Neo4j knowledge graph (IMPLEMENTED)
├── api.py  # GET /graph/nodes, POST /graph/ingest, GET /graph/traverse
├── stream.py  # WS /graph/stream: streams nodes+edges with 50ms delay
├── extractor.py  # Gemini 2.5 Pro entity+relationship extraction (4 types, whitelist rels)
├── writer.py  # Async Neo4j MERGE upserts, index creation
├── reader.py  # Cypher traversal: incident→service→library→chunks
├── models.py  # ExtractedEntity, ExtractedRelationship, ExtractionResult
└── config.py  # Neo4j connection settings
src/
├── agents_app.py  # Combined FastAPI app: all agent routers + Qdrant/Redis init
│
├── jira_agent/  # JIRA ingestion agent (IMPLEMENTED)
│   ├── __init__.py
│   ├── config.py  # JiraAgentConfig: JIRA_BASE_URL, JIRA_EMAIL, JIRA_API_TOKEN,
│   │              # JIRA_PROJECT_KEYS (csv), JIRA_WEBHOOK_SECRET, TEAM_ID
│   ├── adapter.py  # JiraAdapter: fetch_issue, fetch_all (JQL), fetch_incremental
│   │               # Basic auth (base64 email:api_token), ADF text extraction
│   ├── chunker.py  # chunk_jira_issue → chunk 0: issue body, chunks 1..N: comments
│   │               # Preserves thread structure for relation extraction
│   ├── pipeline.py  # ingest_issue / ingest_project → chunk → PII mask → embed → Qdrant
│   │                # Returns entity graph nodes for real-time streaming
│   ├── tasks.py  # Celery: jira_process_issue (queue=critical),
│   │             # jira_sync_project (queue=polling)
│   ├── router.py  # FastAPI: POST /webhooks/jira, POST /jira/sync/{project_key}
│   └── test_run.py  # Mock + real runthrough; works without credentials
│
├── confluence_agent/  # Confluence ingestion agent (IMPLEMENTED)
│   ├── __init__.py
│   ├── config.py  # ConfluenceAgentConfig: BASE_URL, TOKEN, EMAIL,
│   │              # CONFLUENCE_SPACES (csv), CONFLUENCE_WEBHOOK_SECRET, TEAM_ID
│   ├── adapter.py  # ConfluenceAdapter: fetch_page, fetch_space, fetch_incremental (CQL)
│   │               # REST v2 API with pagination
│   ├── chunker.py  # chunk_confluence_page: BeautifulSoup heading-split + breadcrumbs
│   │               # [Space > Ancestor > Page] prefix on every chunk; tables = 1 chunk each
│   │               # Preserves hierarchy for entity linking
│   ├── pipeline.py  # ingest_page / ingest_space → chunk → PII mask → embed → Qdrant
│   │                # Returns entity graph nodes
│   ├── tasks.py  # Celery: confluence_process_page (queue=critical),
│   │             # confluence_sync_space (queue=polling),
│   │             # confluence_periodic_sync (beat, 60 min incremental sync)
│   ├── router.py  # FastAPI: POST /webhooks/confluence, POST /confluence/sync/{space_key}
│   │              # POST /confluence/search (for admin dashboard)
│   └── test_run.py  # Mock + real runthrough; works without credentials
│
├── file_agent/  # File ingestion agent (IMPLEMENTED)
│   ├── __init__.py
│   ├── config.py  # FileAgentConfig: UPLOAD_DIR, MAX_FILE_SIZE, ALLOWED_TYPES
│   ├── adapter.py  # FileAdapter: handle PDFs, DOCX, PPTX, TXT
│   │               # Uses docling for multi-format parsing
│   ├── chunker.py  # chunk_file_document: respects document structure (sections, pages)
│   ├── pipeline.py  # ingest_file → chunk → PII mask → embed → Qdrant
│   ├── tasks.py  # Celery: file_process_upload (queue=critical)
│   ├── router.py  # FastAPI: POST /files/upload, GET /files/{file_id}
│   └── test_run.py
│
├── shared/  # Shared utilities (used by all agents)
│   ├── __init__.py
│   ├── pii_masker.py  # GLiNER-based PII detection (local, zero egress)
│   ├── embedder.py  # BGE-M3 embeddings (local inference)
│   ├── qdrant_client.py  # Qdrant connection + upsert helpers
│   ├── entity_extractor.py  # Extract entities/relationships from chunks (used per-agent)
│   ├── models.py  # Pydantic models (RawDocument, ChunkedDocument, Entity, Graph)
│   └── config.py  # Shared config (QDRANT_URL, REDIS_URL, etc.)
│
├── retrieval/  # T1, T2, T3 retrieval layers (from earlier design docs; not in the repo, logic lives in agent/)
│   ├── __init__.py
│   ├── hybrid_search.py  # T1: Dense + Sparse (RRF fusion), queries Qdrant
│   ├── reranker.py  # BGE-reranker-v2-m3 integration
│   ├── context_compressor.py  # Compress top-5 into LLM context
│   ├── cag_agent.py  # T2: Cache-Augmented Generation (recent syncs)
│   ├── live_doc_agent.py  # T3: Real-time doc fetching (Firecrawl)
│   └── models.py  # Pydantic models for retrieval
│
├── query_engine/  # Query execution, LangGraph-based (from earlier design docs; not in the repo, logic lives in agent/)
│   ├── __init__.py
│   ├── generator_agent.py  # Generator LLM agent (creates answer from context)
│   ├── critic_agent.py  # Critic LLM agent (validates against sources)
│   ├── orchestrator.py  # LangGraph: routes query through retrieval → generation → validation
│   ├── streaming.py  # Stream answer chunks + citations + graph to frontend
│   └── models.py  # Pydantic models for query responses
│
├── redis/  # Redis utilities (shared)
│   ├── __init__.py
│   ├── cache.py  # Caching layer (with TTL)
│   ├── queues.py  # Task queues (per-agent ingestion, webhook events)
│   ├── session_state.py  # Query session state
│   ├── locks.py  # Distributed locks (prevent concurrent agent syncs)
│   └── pubsub.py  # Pub/sub for real-time graph updates to frontend (query_id → node)
│
├── api/  # FastAPI main app + shared endpoints
│   ├── __init__.py
│   ├── auth.py  # POST /auth/login, /auth/logout, /auth/refresh
│   ├── query.py  # POST /api/query (streaming), /api/query/{id}/follow-up
│   ├── workspace.py  # GET/POST /api/workspace/queries, /saved
│   ├── admin.py  # GET /api/admin/agents (show all agent statuses)
│   └── graph.py  # GET /api/graph/entities, /api/graph/query/{query_id} (from earlier design docs; graph endpoints live in graph_store/)
│
├── db/  # Database models & utilities
│   ├── __init__.py
│   ├── models.py  # SQLAlchemy models (User, Query, Document, Entity, Graph)
│   ├── session.py  # Database session management
│   └── init_db.py  # Schema initialization
│
├── auth/  # Authentication & authorization
│   ├── __init__.py
│   ├── jwt_handler.py  # JWT encode/decode, token refresh
│   ├── oauth.py  # OAuth2 + SSO integration (phase 2)
│   ├── rbac.py  # Role-based access control decorator
│   ├── permissions.py  # Permission checks
│   └── models.py  # User, Role, Permission models
│
├── utils/  # Shared utilities
│   ├── __init__.py
│   ├── logger.py  # Structured logging (JSON)
│   ├── metrics.py  # Prometheus metrics
│   ├── telemetry.py  # OpenTelemetry (phase 2)
│   └── exceptions.py  # Custom exceptions
│
└── tests/  # Comprehensive test suite
    ├── __init__.py
    ├── agents/  # Per-agent tests (JIRA, Confluence, File)
    ├── retrieval/  # Retrieval pipeline tests
    ├── query_engine/  # Query generation + validation tests
    ├── fixtures/  # Pytest fixtures (mock data)
    └── integration/  # End-to-end scenarios
```
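Every agent's `pipeline.py` in the tree above is described with the same stage order: chunk → PII mask → embed → Qdrant. A minimal sketch of that ordering is shown here, with each stage injected as a plain callable so the example runs without GLiNER, BGE-M3, or Qdrant installed. The function name `ingest` and the callable signatures are assumptions for illustration only.

```python
from typing import Callable, Sequence

Vector = list[float]


def ingest(texts: Sequence[str],
           mask_pii: Callable[[str], str],
           embed: Callable[[Sequence[str]], list[Vector]],
           upsert: Callable[[Sequence[str], list[Vector]], None]) -> int:
    """Run already-chunked texts through mask -> embed -> upsert, in that order."""
    # Masking happens first so neither the embedder nor the vector store
    # ever sees the raw PII.
    masked = [mask_pii(text) for text in texts]
    vectors = embed(masked)
    upsert(masked, vectors)
    return len(masked)


if __name__ == "__main__":
    store: list[tuple[str, Vector]] = []
    count = ingest(
        ["mail bob@example.com"],
        mask_pii=lambda t: t.replace("bob@example.com", "[EMAIL]"),  # stand-in masker
        embed=lambda ts: [[float(len(t))] for t in ts],              # stand-in embedder
        upsert=lambda ts, vs: store.extend(zip(ts, vs)),             # stand-in index
    )
    print(count, store)
```

Passing the stages in as callables is one way to let each agent's `test_run.py` exercise the pipeline with mocks; whether the repository does this is not stated in this document.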
### Key Backend Design Decisions
1. **Per-Source Agents:** Each source (Jira, Confluence, File) is an independent module with its own adapter, chunker, pipeline, and Celery tasks. This enables source-specific optimization and independent scaling.
2. **Source-Optimized Chunking:**
- Confluence: Preserves `[Space > Ancestor > Page]` hierarchy for entity linking
- Jira: Preserves comment threading for relation extraction
- File: Respects document structure (sections, pages)
- Each source extracts its own entity relationships
3. **Independent Celery Scheduling:**
- `jira_sync_project` → configurable interval (often 1 hour)
- `confluence_periodic_sync` → beat scheduler (60 min incremental)
- `file_process_upload` → immediate (queue=critical)
- Each agent controls its own cadence
4. **PII Masking First:** GLiNER runs in `shared/pii_masker.py`. It is local and zero-egress, and it runs before Qdrant indexing.
5. **Entity Extraction Per-Agent:** Each pipeline returns a graph of entities + relationships (e.g., Jira: issue→linked_issue, Confluence: page→linked_page). Frontend streams these nodes as they're extracted.
6. **Real-Time Graph Streaming:** Via Redis pub/sub (`query_id → {nodes, edges}`), so the frontend doesn't wait for full completion.
7. **Redis Everywhere:** Cache, queues, session state, distributed locks, and pub/sub all via Redis.
8. **Hybrid Retrieval (T1):** Dense (BGE-M3) + Sparse (BM25) via RRF, queried from Qdrant.
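
Reciprocal Rank Fusion (decision 8) combines the dense and sparse result lists using only ranks, so the two scorers do not need comparable score scales. The sketch below is a plain-Python version of the standard formula, score(d) = Σ 1 / (k + rank). The function name `rrf_fuse`, the constant `k = 60`, and the `top_n` default are assumptions for the example; in practice the fusion may be done inside Qdrant rather than in application code.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60,
             top_n: int = 5) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document contributes 1 / (k + rank) for every list it appears in.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_n]


if __name__ == "__main__":
    dense = ["a", "b", "c"]    # ids ranked by dense (embedding) similarity
    sparse = ["b", "d", "a"]   # ids ranked by sparse (lexical) score
    for doc_id, score in rrf_fuse([dense, sparse]):
        print(doc_id, round(score, 5))
```

Documents that appear in both lists ("a" and "b" here) outrank documents that appear in only one, which is the behaviour hybrid search relies on.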
---
## Frontend Architecture (frontend/)
### Directory Structure
```
frontend/
├── index.html  # Entry HTML (Vite serves this)
├── vite.config.ts  # Vite build config
├── tsconfig.json  # TypeScript config
├── tailwind.config.ts  # Tailwind design tokens + dark mode
├── postcss.config.js  # PostCSS + Tailwind plugins
├── package.json  # Dependencies + scripts
├── .env.example  # Required environment variables
│
├── src/
│   ├── main.tsx  # React app entry point
│   ├── App.tsx  # Root component + routing
│   │
│   ├── components/
│   │   ├── common/  # Reusable components
│   │   │   ├── Header.tsx  # Top nav bar
│   │   │   ├── Sidebar.tsx  # Left navigation
│   │   │   ├── Footer.tsx  # Footer
│   │   │   ├── Button.tsx  # Button variants (from shadcn)
│   │   │   ├── Input.tsx  # Text input (from shadcn)
│   │   │   ├── Card.tsx  # Card container
│   │   │   ├── Modal.tsx  # Modal/dialog
│   │   │   ├── Badge.tsx  # Status badges
│   │   │   ├── Tooltip.tsx  # Tooltips
│   │   │   ├── Toast.tsx  # Toast notifications
│   │   │   └── Loading.tsx  # Loading skeleton
│   │   │
│   │   ├── query/  # Query interface (Engineer primary)
│   │   │   ├── SearchBox.tsx  # Main search input (Cmd+K support)
│   │   │   ├── QueryModal.tsx  # Modal for new query
│   │   │   ├── QueryHistory.tsx  # Query history panel
│   │   │   ├── SuggestedTopics.tsx  # Related queries
│   │   │   └── QueryFeedback.tsx  # Thumbs up/down
│   │   │
│   │   ├── results/  # Results display + knowledge graph
│   │   │   ├── ResultsPage.tsx  # Main results container
│   │   │   ├── Answer.tsx  # Generated answer with citations
│   │   │   ├── Citations.tsx  # Cited source chunks
│   │   │   ├── FollowUp.tsx  # Follow-up prompt
│   │   │   ├── KnowledgeGraph.tsx  # Knowledge graph visualization
│   │   │   ├── GraphNode.tsx  # Individual node component
│   │   │   ├── RelatedDocs.tsx  # Related document snippets
│   │   │   └── ShareResults.tsx  # Share/export options
│   │   │
│   │   ├── analytics/  # Dashboards (Manager primary)
│   │   │   ├── AnalyticsDashboard.tsx  # Main analytics page
│   │   │   ├── QueryTrendChart.tsx  # Line chart for query volume
│   │   │   ├── TopicsChart.tsx  # Bar chart for topics
│   │   │   ├── SuccessRateGauge.tsx  # Gauge chart
│   │   │   ├── KnowledgeHealthDashboard.tsx  # Health metrics
│   │   │   ├── DependencyTracker.tsx  # Breaking changes table
│   │   │   ├── EscalationTable.tsx  # Unresolved queries
│   │   │   ├── TeamSettings.tsx  # Team configuration
│   │   │   └── AnalyticsExport.tsx  # Export reports
│   │   │
│   │   ├── admin/  # Admin UI (Admin primary)
│   │   │   ├── AdminDashboard.tsx  # Main admin page
│   │   │   ├── SystemHealth.tsx  # Health status cards
│   │   │   ├── DataSourceManager.tsx  # Add/edit sources
│   │   │   ├── DataSourceForm.tsx  # Source configuration wizard
│   │   │   ├── UserManager.tsx  # User list + invite
│   │   │   ├── RBACEditor.tsx  # RBAC policy editor
│   │   │   ├── APIKeyManager.tsx  # Generate/revoke keys
│   │   │   └── SystemLogs.tsx  # View logs + alerts
│   │   │
│   │   └── auth/  # Authentication UI
│   │       ├── LoginPage.tsx  # Login form (SSO + fallback)
│   │       ├── SSORedirect.tsx  # OAuth callback handler
│   │       └── ProtectedRoute.tsx  # Route guard
│   │
│   ├── pages/  # Route pages (using TanStack Router)
│   │   ├── Home.tsx  # Dashboard home
│   │   ├── QueryPage.tsx  # Query results page
│   │   ├── AnalyticsPage.tsx  # Analytics dashboards
│   │   ├── AdminPage.tsx  # Admin dashboards
│   │   ├── WorkspacePage.tsx  # Personal/team workspace
│   │   ├── NotFoundPage.tsx  # 404 page
│   │   └── ErrorPage.tsx  # Error boundary
│   │
│   ├── hooks/  # Custom React hooks
│   │   ├── useSSEStream.ts  # SSE consumer for POST /agent/query; manages fetch + ReadableStream parsing
│   │   ├── useGraphStream.ts  # WebSocket consumer for WS /graph/stream; feeds Force-Graph 2D progressively
│   │   ├── useNotifications.ts  # WebSocket consumer for WS /ws system notifications (future)
│   │   ├── useAnalytics.ts  # Fetch analytics data
│   │   ├── useAuth.ts  # Authentication state
│   │   ├── useTheme.ts  # Dark mode toggle
│   │   ├── useLocalStorage.ts  # Persist state to localStorage
│   │   ├── usePagination.ts  # Pagination logic
│   │   └── useDebounce.ts  # Debounce search input
│   │
│   ├── stores/  # Zustand state management
│   │   ├── authStore.ts  # User + auth state
│   │   ├── uiStore.ts  # UI state (theme, sidebar open, etc.)
│   │   ├── filterStore.ts  # Dashboard filters
│   │   └── workspaceStore.ts  # Workspace selections
│   │
│   ├── lib/
│   │   ├── api.ts  # TanStack Query setup + HTTP client
│   │   ├── http.ts  # HTTP client wrapper (JWT refresh)
│   │   ├── auth.ts  # JWT helpers, localStorage auth
│   │   ├── websocket.ts  # WebSocket manager for alerts
│   │   ├── utils.ts  # General utilities (debounce, etc.)
│   │   ├── validators.ts  # Input validation (Zod)
│   │   ├── constants.ts  # App-wide constants
│   │   ├── error-handler.ts  # Centralized error handling
│   │   └── date.ts  # Date formatting helpers
│   │
│   ├── types/
│   │   ├── index.ts  # Re-export all types
│   │   ├── api.ts  # API response types
│   │   ├── user.ts  # User + auth types
│   │   ├── query.ts  # Query + results types
│   │   ├── analytics.ts  # Analytics types
│   │   ├── components.ts  # Component prop types
│   │   └── errors.ts  # Error types
│   │
│   ├── styles/
│   │   ├── globals.css  # Global styles + Tailwind imports
│   │   ├── design-tokens.css  # Design tokens (terracotta, white, dark mode)
│   │   ├── animations.css  # Custom animations (optional)
│   │   └── responsive.css  # Responsive utility classes
│   │
│   └── config/
│       ├── routes.ts  # TanStack Router configuration
│       ├── env.ts  # Environment variables + validation
│       └── queryClient.ts  # TanStack Query client config
│
├── public/  # Static assets
│   ├── logo.svg  # Logo
│   ├── favicon.ico  # Favicon
│   └── assets/  # Images, icons
│
├── tests/
│   ├── __mocks__/  # Mock data + API responses
│   ├── components/  # Component tests (Vitest + RTL)
│   ├── hooks/  # Hook tests
│   ├── utils/  # Utility tests
│   └── setup.ts  # Vitest + RTL setup
│
├── .eslintrc.json  # ESLint config
├── .prettierrc  # Prettier config
└── README.md  # Frontend development guide
```
### Frontend Design Decisions
1. **Vite + React 18:** Fast dev, instant HMR, minimal config. No SSR needed for SPA.
2. **TanStack Router:** Fully typed routing; better DX than React Router v6.
3. **TanStack Query:** Server state management with automatic caching/refetching.
4. **Zustand:** Lightweight client state (theme, UI, filters); no Redux boilerplate.
5. **shadcn/ui + Tailwind:** Copy-paste components, full control, design tokens system.
6. **Responsive Design:** Mobile (320px), Tablet (768px), Desktop (1024px+).
7. **WebSocket:** Native API for real-time alerts; no Socket.io overhead.
8. **JWT + httpOnly Cookies:** Secure auth; backend validates on every request.
---
## API Contract
### Authentication
```
POST /api/auth/login
├─ Request: { email, password } or { sso_provider, sso_token }
├─ Response: { access_token, refresh_token, user: { id, email, role, team_id } }
├─ Sets httpOnly cookie: __auth_token
└─ Bearer token in Authorization header for all subsequent requests
POST /api/auth/refresh
├─ Request: { refresh_token }
├─ Response: { access_token }
└─ Auto-called by frontend before token expires
POST /api/auth/logout
├─ Clears httpOnly cookie
└─ Backend invalidates refresh token in Redis
```
### Agent Webhook Endpoints (Per-Source)
```
POST /webhooks/jira
├─ Validates Jira webhook signature (X-Atlassian-Webhook-Signature)
├─ Extracts issue_created, issue_updated, comment_created events
├─ Routes to jira_process_issue Celery task (queue=critical)
└─ Returns immediately (202 Accepted)
POST /webhooks/confluence
├─ Validates Confluence webhook signature
├─ Extracts page_created, page_updated, page_trashed events
├─ Routes to confluence_process_page Celery task (queue=critical)
└─ Returns immediately (202 Accepted)
POST /files/upload
├─ Accepts multipart/form-data with file + team_id
├─ Routes to file_process_upload Celery task (queue=critical)
├─ Returns file_id immediately; processing async
└─ Frontend polls /files/{file_id} for status
POST /jira/sync/{project_key}
├─ Manual trigger; requires admin role
├─ Routes to jira_sync_project Celery task (queue=polling)
└─ Returns job_id for polling
POST /confluence/sync/{space_key}
├─ Manual trigger; requires admin role
├─ Routes to confluence_sync_space Celery task (queue=polling)
└─ Returns job_id for polling
```
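The webhook endpoints validate a signature before queueing work. The sketch below shows one way to do that check with the standard library, assuming the scheme used by the Jira test command in this document: the header value is `sha256=` followed by the hex HMAC-SHA256 of the raw request body. The function names `sign` and `verify_signature` are hypothetical, and the actual router code may differ.

```python
import hashlib
import hmac


def sign(body: bytes, secret: str) -> str:
    """Return the header value expected for this body: 'sha256=<hex digest>'."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(body: bytes, secret: str, header_value: str) -> bool:
    """Compare the received header against the expected one in constant time."""
    expected = sign(body, secret).encode()
    return hmac.compare_digest(expected, header_value.encode())


if __name__ == "__main__":
    payload = b'{"webhookEvent":"jira:issue_created"}'
    header = sign(payload, "your_jira_webhook_secret")
    print(verify_signature(payload, "your_jira_webhook_secret", header))
    print(verify_signature(payload + b" ", "your_jira_webhook_secret", header))
```

The check has to run on the raw request bytes, before any JSON parsing, because re-serialising the body can change whitespace or key order and invalidate the digest.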
### Query API (Streaming SSE)
```
POST /agent/query
├─ Request: { query: string, team_id: string, session_id: string }
├─ Response: Content-Type: text/event-stream
│  ├─ event: plan_ready → { tasks: [AgentTask], reasoning: string }
│  ├─ event: agent_started → { agent: "doc_search"|"ticket_lookup"|"live_docs"|"sql_query"|"summariser" }
│  ├─ event: agent_done → { agent: string, chunks: [RetrievedChunk], confidence: "high"|"medium"|"low" }
│  ├─ event: synthesis_started → {}
│  ├─ event: answer_chunk → { chunk: string } (repeats, one per token)
│  ├─ event: guardrail_result → { score: float, escalate: bool }
│  ├─ event: done → {}
│  └─ event: error → { message: string }
└─ Headers: Cache-Control: no-cache, X-Accel-Buffering: no
POST /api/query/{query_id}/feedback
├─ Request: { sentiment: "helpful"|"not_helpful"|"hallucinated", text?: string }
└─ Response: { success: true }
```
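A client consuming `POST /agent/query` has to split the response into frames and dispatch on the event name. The sketch below parses the standard `event:` / `data:` line format and yields one `(event, payload)` pair per frame. It assumes each `data:` field carries a JSON object, which the payload shapes above suggest but do not guarantee; the function name `parse_sse` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, payload) for each blank-line-terminated SSE frame."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the frame; frames without data are skipped.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # Only a single leading space after the colon is stripped.
            data.append(line[len("data:"):].removeprefix(" "))
        # Comment lines (starting with ":") and other fields are ignored.


if __name__ == "__main__":
    sample = [
        "event: answer_chunk\n", 'data: {"chunk": "Hel"}\n', "\n",
        "event: answer_chunk\n", 'data: {"chunk": "lo"}\n', "\n",
        "event: done\n", "data: {}\n", "\n",
    ]
    answer = ""
    for name, payload in parse_sse(sample):
        if name == "answer_chunk":
            answer += payload["chunk"]
    print(answer)
```

Because the function takes any iterable of lines, it can be fed from a streaming HTTP response or from a recorded fixture in tests.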
### Knowledge Graph API
```
GET /graph/nodes?limit=50
└─ Response: { count: int, nodes: [{ label: string, name: string }] }
   (excludes Chunk and Document nodes; returns Service/Library/Incident/Team only)
POST /graph/ingest
├─ Request: { chunk_ids: [string], team_id: string }
└─ Response: { ingested: int }
   (fetches chunks from Supabase, runs Gemini extraction, upserts to Neo4j)
GET /graph/traverse?type=incident|service|library&name=string&team_id=string
└─ Response: { type, name, team_id, chunks: [string] }
   (multi-hop Cypher traversal; returns text chunks for context augmentation)
WS /graph/stream
└─ Streams: node events, edge events, then done event (see Real-Time API below)
```
### Analytics API
```
GET /api/analytics/queries?date_range=30d&team_id=...
├─ Response: {
│    query_count: 1243,
│    unique_users: 243,
│    avg_response_time_ms: 1200,
│    success_rate: 0.76,
│    trend: { data: [{date, count}] }
│  }
GET /api/analytics/knowledge-health
├─ Response: {
│    overall_score: 7.2,
│    coverage: 0.68,
│    freshness: 0.82,
│    accuracy: 0.76,
│    accessibility: 0.71,
│    gaps: [{ topic: "ORM patterns", queries: 12, solutions: 0 }]
│  }
GET /api/analytics/dependencies
├─ Response: {
│    dependencies: [{name, current_version, latest_version, breaking_changes}],
│    alerts: 3
│  }
```
### Admin API
```
POST /api/admin/sources
├─ Request: { type, config, rbac_level }
├─ Response: { id, status, test_result }
└─ Triggers background sync
GET /api/admin/sources
└─ Response: [{ id, type, status, last_sync, record_count }]
PATCH /api/admin/sources/{id}
├─ Request: { name, config, rbac_level }
└─ Response: { updated_source }
DELETE /api/admin/sources/{id}
└─ Soft delete; preserves audit trail
---
POST /api/admin/users/invite
├─ Request: { emails: ["alice@..."], role, team_id }
├─ Response: { invitations: [{ email, invitation_id, expires_at }] }
└─ Sends email invite
GET /api/admin/users
└─ Response: [{ id, email, role, team_id, status, created_at }]
DELETE /api/admin/users/{user_id}
└─ Deactivates user (no hard delete for compliance)
---
POST /api/admin/rbac
├─ Request: { name, description, teams, sources, filters }
├─ Response: { id, policy }
└─ Returns doc count matching policy
GET /api/admin/rbac
└─ Response: [{ id, name, doc_count }]
PATCH /api/admin/rbac/{id}
└─ Update existing policy
---
POST /api/admin/api-keys
├─ Request: { name, permissions, rate_limits, expiry }
├─ Response: { key: "sk_...", created_at }
└─ Only returned once
GET /api/admin/api-keys
└─ Response: [{ name, created_at, last_used, permissions }]
```
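The API key is "only returned once", which usually means the server keeps a hash rather than the key itself. The sketch below shows that pattern under stated assumptions: the `sk_` prefix comes from the response example above, while the 32-byte key length, SHA-256 hashing, and the function names `create_api_key` and `key_matches` are illustrative choices, not documented behaviour of this system.

```python
import hashlib
import hmac
import secrets


def create_api_key() -> tuple[str, str]:
    """Return (plaintext_key, stored_hash). Only the hash would be persisted."""
    key = "sk_" + secrets.token_urlsafe(32)
    return key, hashlib.sha256(key.encode()).hexdigest()


def key_matches(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored hash in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)


if __name__ == "__main__":
    key, stored = create_api_key()
    print(key[:3], len(stored))          # the prefix and the hex digest length
    print(key_matches(key, stored))      # a correct key verifies
    print(key_matches(key + "x", stored))
```

A fast hash is a reasonable choice here because the key is long and random; password-style slow hashing is aimed at low-entropy secrets.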
### Bash Development Testing
Use these instead of Swagger UI when you need to test streaming behaviour from the terminal.
**Test SSE query stream (replaces Swagger UI, which can't stream SSE):**
```bash
#!/usr/bin/env bash
# test_query.sh: streams the SSE response token-by-token to stdout
BASE_URL="${GODSPEED_API:-http://localhost:8000}"
curl -N -s \
-X POST "${BASE_URL}/agent/query" \
-H "Content-Type: application/json" \
-d '{"query":"What is the auth service?","team_id":"team-1","session_id":"test-001"}' \
| while IFS= read -r line; do
echo "$line"
done
```
**Test graph REST endpoints:**
```bash
BASE_URL="${GODSPEED_API:-http://localhost:8000}"
# List all graph nodes
curl -s "${BASE_URL}/graph/nodes?limit=20" | python3 -m json.tool
# Traverse from a service
curl -s "${BASE_URL}/graph/traverse?type=service&name=auth-service&team_id=team-1" \
| python3 -m json.tool
# Ingest chunks into graph
curl -s -X POST "${BASE_URL}/graph/ingest" \
-H "Content-Type: application/json" \
-d '{"chunk_ids":["chunk-abc123"],"team_id":"team-1"}' \
| python3 -m json.tool
```
**Test WebSocket graph stream (requires `wscat`; install with `npm i -g wscat`):**
```bash
BASE_URL="${GODSPEED_WS:-ws://localhost:8000}"
wscat -c "${BASE_URL}/graph/stream"
# Prints node/edge/done events as they arrive
```
**Test Jira webhook signature (bash + openssl):**
```bash
BASE_URL="${GODSPEED_API:-http://localhost:8000}"
BODY='{"webhookEvent":"jira:issue_created","issue":{"id":"TEST-1","fields":{"summary":"Auth service down"}}}'
SECRET="your_jira_webhook_secret"
SIG="sha256=$(printf '%s' "${BODY}" | openssl dgst -sha256 -hmac "${SECRET}" | awk '{print $2}')"
curl -s -X POST "${BASE_URL}/webhooks/jira" \
-H "Content-Type: application/json" \
-H "X-Atlassian-Webhook-Signature: ${SIG}" \
-d "${BODY}"
```
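For reference, the receiving side of the signature test above could verify the header roughly as follows. This is a sketch under assumptions: `verify_signature` is a hypothetical helper, and the `sha256=<hex>` header format is taken from the bash example rather than from the backend source.

```python
import hashlib
import hmac


def verify_signature(body: bytes, header_value: str, secret: str) -> bool:
    """Recompute the HMAC-SHA256 of the raw body and compare in constant time."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # Compare as bytes so a non-ASCII header value cannot raise TypeError.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

The check must run on the raw request bytes; re-serialising the parsed JSON can change whitespace and break the signature.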
**Test file upload:**
```bash
BASE_URL="${GODSPEED_API:-http://localhost:8000}"
curl -s -X POST "${BASE_URL}/files/upload" \
-F "file=@/path/to/doc.pdf" \
-F "team_id=team-1"
```
---
### Real-Time API
There are two implemented real-time channels, plus a third that is planned. Do not conflate them:
**Channel 1: Query streaming (SSE)**
```
POST /agent/query → Content-Type: text/event-stream
Emits events in order:
event: plan_ready data: { tasks: [...], reasoning: "..." }
event: agent_started data: { agent: "doc_search" }
event: agent_done data: { agent: "doc_search", chunks: [...], confidence: "high" }
event: synthesis_started data: {}
event: answer_chunk data: { chunk: "token text" } ← repeats per token
event: guardrail_result data: { score: 0.92, escalate: false }
event: done data: {}
event: error data: { message: "..." } ← on failure
Request body: { query: string, team_id: string, session_id: string }
```
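A client consuming Channel 1 has to split the stream into `event:` / `data:` pairs separated by blank lines. The parser below is a minimal sketch of that framing, assuming each `data:` payload is JSON as in the event list above; `parse_sse` is a hypothetical name and the sketch ignores the `id:` and `retry:` fields.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, payload) for each complete SSE event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
```

The full answer text can then be rebuilt by concatenating the `chunk` field of every `answer_chunk` event until `done` or `error` arrives.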
**Channel 2: Knowledge graph visualization (WebSocket)**
```
WS /graph/stream
Emits in order (50ms delay between each):
{ event: "node", id: "...", label: "Service", name: "auth-service" }
{ event: "edge", from: "...", to: "...", rel: "DEPENDS_ON" }
...
{ event: "done", nodes_count: 42, edges_count: 87 }
```
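On the consuming side, the Channel 2 messages can be folded into an in-memory graph as they arrive. The following is a sketch only: `GraphState` is a hypothetical class, and it assumes every message is a JSON object with the fields shown above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class GraphState:
    """Accumulates node/edge messages from the graph stream."""
    nodes: dict[str, dict] = field(default_factory=dict)
    edges: list[tuple[str, str, str]] = field(default_factory=list)
    complete: bool = False

    def apply(self, raw: str) -> None:
        msg = json.loads(raw)
        kind = msg.get("event")
        if kind == "node":
            self.nodes[msg["id"]] = {"label": msg["label"], "name": msg["name"]}
        elif kind == "edge":
            self.edges.append((msg["from"], msg["to"], msg["rel"]))
        elif kind == "done":
            self.complete = True
        # Unknown event types are ignored so new events do not break clients.
```

Keying nodes by `id` makes a re-sent node idempotent, which is convenient if the client reconnects mid-stream.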
**Channel 3: System notifications (WebSocket)**
```
WS /ws (future, not yet implemented)
Will emit:
event: "query_answered" → { query_id, new_docs_count }
event: "escalation_spike" → { topic, spike_rate } (manager-only)
event: "breaking_change" → { dependency, version, url } (admin-only)
event: "data_sync_failed" → { source, error } (admin-only)
event: "knowledge_gap" → { topic, query_count } (all users)
```
---
## Data Flow
### Flow 1: Engineer Query → Answer
```
1. Engineer types query in SearchBox
├─ frontend sends POST /agent/query { query, team_id, session_id }
└─ frontend simultaneously opens WS /graph/stream for parallel graph rendering
2. Backend receives the query and responds over an SSE stream
├─ LangGraph planner breaks query into AgentTask list → emits plan_ready
├─ Each agent runs (doc_search / ticket_lookup / live_docs) → emits agent_started + agent_done
├─ doc_search: BGE-M3 embed → Qdrant hybrid search (dense+sparse RRF) → top 50 → BGE reranker → top 5
├─ Synthesiser streams answer tokens → emits answer_chunk per token
└─ Guardrail validates answer against source chunks → emits guardrail_result
3. Guardrail result
├─ guardrail_passed=true → done event
├─ guardrail_passed=false + escalate=true → warning banner shown in frontend
└─ Citations come from agent_done chunks (already streamed in step 2)
4. Frontend connects to graph stream (parallel to query SSE)
├─ WS /graph/stream streams the pre-built Neo4j graph (query-scoped subgraph)
├─ Nodes arrive one-by-one with 50ms delays: { event:"node", label, name }
├─ Edges arrive after nodes: { event:"edge", from, to, rel }
└─ { event:"done" } signals completion
Note: The knowledge graph is pre-built at ingestion time by Gemini 2.5 Pro
(graph_store/extractor.py), not extracted from the answer at query time.
5. Frontend receives stream
├─ Displays answer immediately (no waiting)
├─ Renders citations as they arrive
├─ Knowledge graph appears once first connection established
├─ Related docs populate as backend fetches
└─ Full page interactive once final "done" event received
6. Feedback recorded
├─ Engineer clicks thumbs up/down
├─ Frontend POSTs /api/query/{id}/feedback
├─ Backend records sentiment + triggers analytics update
└─ Feedback visible in query history + aggregated for managers
```
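The "dense+sparse RRF" step in Flow 1 merges two ranked lists by reciprocal rank fusion. In this architecture the fusion is attributed to Qdrant, so the function below is purely illustrative: `rrf_fuse` is a hypothetical name, and `k=60` is the conventional RRF constant, assumed here rather than taken from the project configuration.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60, top_n: int = 50) -> list[str]:
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]


dense = ["a", "b", "c"]   # ranking from the dense vector search
sparse = ["b", "d", "a"]  # ranking from the sparse vector search
fused = rrf_fuse([dense, sparse])
```

A document that appears in both lists (`a`, `b`) outranks one that appears in only one, even if that one was ranked higher in its single list.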
### Flow 2: Data Ingestion (Daily/Polling)
```
1. Ingestion task triggered
├─ Webhook from source (e.g., Notion) OR Celery periodic task
2. Fetch stage
├─ Adapter queries source API
├─ Detects new/updated items (via timestamps or ETags)
├─ Downloads content
3. Normalize stage (Docling)
├─ Converts PDF/HTML/markdown to clean markdown
├─ Extracts tables as markdown tables
├─ Detects code blocks + language
4. PII Mask stage (GLiNER, local)
├─ Scans text for PII (names, emails, IDs, etc.)
├─ Replaces PII with placeholders (e.g., [REDACTED_EMAIL])
├─ Logs redaction for audit trail
5. Chunk stage (Semantic)
├─ Splits by paragraph/sentence boundaries
├─ Never splits code blocks or lists
├─ 15% overlap between chunks
├─ 256–512 tokens per chunk
6. Tag stage (Metadata)
├─ Adds source_uri, source_type, ingested_at
├─ Adds RBAC tag (public / team / restricted)
├─ Computes content_hash (for change detection)
├─ Detects doc_type (SOP, API doc, PR, etc.)
7. Embed stage
├─ Sends chunks to BGE-M3
├─ Gets 1024-dim dense vectors
├─ Extracts sparse BM25-like vectors
8. Index stage
├─ Uploads dense vectors to Qdrant HNSW index
├─ Uploads sparse vectors to Qdrant sparse index
├─ Upserts metadata (PostgreSQL)
├─ Updates Redis cache (last_sync_timestamp)
9. Complete
├─ Backend records sync success in PostgreSQL
├─ Triggers webhook for frontend real-time update
└─ Notifies admins if errors
```
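The overlap arithmetic in the Chunk stage can be made concrete with a sliding window. This sketch deliberately simplifies: it works on a flat token list and ignores the paragraph, code-block, and list boundaries the real chunker respects, and the final window may fall below the 256-token floor. `chunk_tokens` is a hypothetical name.

```python
def chunk_tokens(tokens: list, max_tokens: int = 512,
                 overlap_ratio: float = 0.15) -> list[list]:
    """Split tokens into windows of max_tokens that share ~15% with the next."""
    overlap = int(max_tokens * overlap_ratio)  # 76 tokens for 512 at 15%
    step = max_tokens - overlap                # 436 new tokens per window
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


chunks = chunk_tokens(list(range(1000)))
```

For a 1000-token document this yields windows starting at tokens 0, 436, and 872, so each consecutive pair shares 76 tokens.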
### Flow 3: Manager Views Analytics
```
1. Manager navigates to /analytics
2. Frontend loads analytics data
├─ POST /api/analytics/queries?date_range=30d
├─ POST /api/analytics/knowledge-health
├─ POST /api/analytics/dependencies
3. Backend aggregates from event logs
├─ Queries PostgreSQL (query_events table)
├─ Aggregates by date, team, topic
├─ Computes trends, success rates
├─ Identifies gaps from failed queries
4. Frontend renders dashboards
├─ Query trends (Recharts line chart)
├─ Topics (bar chart)
├─ Success rate (gauge)
├─ Escalations table
├─ Knowledge health heatmap
5. Optional: Manager exports report
├─ Frontend POSTs /api/analytics/export?format=pdf
├─ Backend generates PDF via ReportLab
├─ Streams PDF download to browser
```
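The aggregation in step 3 amounts to a group-by over `query_events`. In practice this would most likely be a SQL `GROUP BY`; the in-memory version below only illustrates the computation. The `created_at` and `success` field names are assumptions, since the table schema is not shown here.

```python
from collections import defaultdict


def success_rate_by_day(events: list[dict]) -> dict[str, float]:
    """Map each ISO date to the fraction of queries that succeeded that day."""
    totals: dict[str, int] = defaultdict(int)
    successes: dict[str, int] = defaultdict(int)
    for event in events:
        day = event["created_at"][:10]  # "YYYY-MM-DD" prefix of an ISO timestamp
        totals[day] += 1
        if event["success"]:
            successes[day] += 1
    return {day: successes[day] / totals[day] for day in sorted(totals)}
```

Days with a low rate are natural candidates for the "gaps from failed queries" analysis.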
---
## Deployment Architecture
### Development (docker-compose)
```yaml
services:
  postgres:
    image: postgres:15
    volumes: [./data/postgres:/var/lib/postgresql/data]
    ports: ["5432:5432"]
  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]
  qdrant:
    image: qdrant/qdrant:latest
    volumes: [./data/qdrant:/qdrant/storage]
    ports: ["6333:6333"]
  backend:
    build: ./backend
    ports: ["8000:8000"]
    depends_on: [postgres, redis, qdrant]
    environment:
      SQLALCHEMY_DATABASE_URL: postgresql://user:pass@postgres:5432/godspeed
      REDIS_URL: redis://redis:6379
      QDRANT_URL: http://qdrant:6333
  frontend:
    build: ./frontend
    ports: ["3000:3000"]
    depends_on: [backend]
    environment:
      VITE_API_BASE_URL: http://localhost:8000
  neo4j:
    image: neo4j:5
    ports: ["7474:7474", "7687:7687"]
    volumes: [./data/neo4j:/data]
    environment:
      NEO4J_AUTH: neo4j/godspeed_dev
      NEO4J_PLUGINS: '["apoc"]'
  celery:
    build: ./backend
    command: celery -A src.celery_app worker -Q critical,default,polling -l info
    depends_on: [postgres, redis, qdrant, neo4j]
    environment:
      SQLALCHEMY_DATABASE_URL: postgresql://user:pass@postgres:5432/godspeed
      REDIS_URL: redis://redis:6379
      NEO4J_URI: bolt://neo4j:7687
      NEO4J_USERNAME: neo4j
      NEO4J_PASSWORD: godspeed_dev
```
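The backend container receives its connection strings through the environment variables declared in the compose file. A fail-fast loader for them might look like the sketch below; `Settings` and `load_settings` are hypothetical names, and the actual backend may read its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("SQLALCHEMY_DATABASE_URL", "REDIS_URL", "QDRANT_URL")


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    qdrant_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three backend variables, failing early if any is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["SQLALCHEMY_DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        qdrant_url=env["QDRANT_URL"],
    )
```

Passing `env` explicitly keeps the loader testable without touching the real process environment.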
### Production (Kubernetes)
```yaml
# Deployments
- backend (FastAPI, 3 replicas, HPA)
- frontend (Nginx, 2 replicas, CDN)
- celery-worker (5 replicas, autoscaling on queue depth)
# StatefulSets
- postgres (with backup via S3)
- redis (cluster mode)
- qdrant (with persistence)
# Services
- backend-svc (ClusterIP)
- frontend-svc (LoadBalancer)
- postgres-svc (ClusterIP)
- redis-svc (ClusterIP)
- qdrant-svc (ClusterIP)
# ConfigMaps & Secrets
- app-config (env vars)
- api-keys (AWS S3, Notion OAuth, etc.)
- tls-certs (HTTPS)
# Ingress
- Routes /api/* to backend
- Routes /* to frontend
- TLS termination
```
### Self-Hosted (Single Server)
```
nginx (reverse proxy, static frontend)
├─ localhost:8000 (FastAPI backend)
├─ localhost:5432 (PostgreSQL)
├─ localhost:6379 (Redis)
└─ localhost:6333 (Qdrant)
All services in systemd or Docker containers
Automated backups via Cron + S3
Monitoring via Prometheus + Grafana (optional)
```
---
## Key Architectural Principles
1. **Separation of Concerns:** Each layer (adapter, ingestion, retrieval, agent, API) has one responsibility.
2. **Stateless Backend:** FastAPI scales horizontally; state lives in PostgreSQL/Redis.
3. **Async Everywhere:** Celery for long-running tasks; FastAPI with asyncio for I/O.
4. **RBAC First:** All queries filtered by user's team/permissions at retrieval time.
5. **Streaming Results:** Don't wait for complete answer; stream chunks to frontend progressively.
6. **Local PII:** GLiNER runs on-premises; zero data egress for compliance.
7. **Cacheable at Every Layer:** Embeddings cached, searches cached, answers cached (with refresh policy).
8. **Observable:** Structured logging, metrics, traces (OpenTelemetry phase 2).
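
As an illustration of principle 4, retrieval-time filtering on the `public / team / restricted` tag can be expressed as a predicate applied to every candidate chunk. This is a sketch under assumptions: the `User` and `Chunk` types are hypothetical, and treating `restricted` as an explicit per-chunk grant is a guess at the semantics, not documented behaviour. In production the same condition would more likely be pushed into the vector store query as a payload filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    team_id: str
    restricted_grants: frozenset = frozenset()  # assumed: chunk IDs granted explicitly


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    rbac: str      # "public" | "team" | "restricted"
    team_id: str


def can_read(user: User, chunk: Chunk) -> bool:
    """Decide visibility from the RBAC tag; unknown tags are denied."""
    if chunk.rbac == "public":
        return True
    if chunk.rbac == "team":
        return chunk.team_id == user.team_id
    if chunk.rbac == "restricted":
        return chunk.chunk_id in user.restricted_grants
    return False


def filter_chunks(user: User, chunks: list[Chunk]) -> list[Chunk]:
    return [c for c in chunks if can_read(user, c)]
```

Denying unknown tags by default keeps a mistagged document from leaking across teams.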