GodSpeed / Docs /TECHSTACK.md
Ananth Shyam
feat: implement NL-to-SQL agent with PostgreSQL integration and enhance related documentation
825e852

Technical Stack

Document purpose: Complete tech stack decisions for backend, frontend, infrastructure, and deployment. Rationale for each choice included.


Table of Contents

  1. Backend Stack
  2. Frontend Stack
  3. Infrastructure & DevOps
  4. Data Storage & Retrieval
  5. ML / NLP Components
  6. Deployment Targets

Backend Stack

Core Runtime & API Framework

Component Technology Rationale Version
Runtime Python 3.11+ Standard for ML/data pipelines; excellent async support (asyncio); rich data science ecosystem 3.11+
API Framework FastAPI + Uvicorn Native async, automatic OpenAPI docs, built-in dependency injection, <100ms request overhead 0.104+
Async Task Queue Celery + Redis broker Battle-tested for background jobs; Redis for pub/sub and state; supports task retries and rate limiting Celery 5.3+
HTTP Client httpx Async-first, mirrors requests API; used for external doc fetching (Firecrawl, Tavily) 0.25+
Type Hints Pydantic v2 Runtime validation, JSON schema generation, strong typing for API contracts 2.5+

Data Processing & Ingestion

Component Technology Rationale Purpose
Document Parsing Docling Handles multi-format docs (PDF, HTML, markdown); preserves tables and code blocks PDF/HTML/markdown normalization
PII Detection GLiNER (local) Zero egress; runs on-device; detects names, emails, IDs, addresses; GDPR/DPDP compliant Pre-ingestion masking
Chunking Custom semantic chunker (src/ingestion/) Respects paragraph/code block boundaries; 15% overlap; prevents mid-function splits Semantic chunking
Embedding Model BGE-M3 (BAAI) Multilingual dense + sparse in one pass; 8,192 token context; 384-dim output Dense embeddings
Reranker BGE-reranker-v2-m3 Reranks top 50 to top 5; multilingual; efficient Re-ranking after fusion

Vector Store & Semantic Search

Component Technology Rationale Notes
Vector Database Qdrant (self-hosted or cloud) Horizontal scalable; batch upload API; RBAC support; integrated sparse vector index (Rust backend) Production-grade RAG
Sparse Indexing Qdrant's native sparse index BM25-equivalent without separate Elasticsearch; reduces infrastructure Hybrid search
Full-Text Search Qdrant sparse vectors + RRF Reciprocal Rank Fusion merges dense + sparse scores; deterministic, no tuning needed Hybrid RAG

Agentic System & Orchestration

Component Technology Rationale Purpose
Agent Framework LangGraph Stateful graph execution; explicit node definitions; ReAct pattern built-in Multi-agent coordination
LLM Integration langchain-google-genai Unified Gemini interface via LangChain; all agents use Gemini models β€” no OpenAI dependency in production Generator + Critic agents
Primary LLM Gemini 2.5 Pro (Google) Used for planner, synthesiser, and CAG; long context window; best reasoning quality Planner, Synthesiser (PLANNER_MODEL, SYNTHESISER_MODEL)
Fast LLM Gemini 2.5 Flash (Google) Cheap, low-latency inference for light tasks Summariser, Guardrail, Graph extraction, NL-to-SQL translation (SUMMARISER_MODEL, GUARDRAIL_MODEL, GRAPH_EXTRACTION_MODEL)
Extraction LLM Gemini 2.5 Flash (Google) Entity/relationship extraction from chunks at ingestion time; strict output schema graph_store/extractor.py
Local LLM Fallback Ollama + Mistral-7B For air-gapped deployments; lower quality but zero API dependency HIPAA/FedRAMP compliance

Data Storage

Component Technology Rationale Purpose
Primary DB PostgreSQL 15+ (Supabase) ACID guarantees; full-text search via pg_trgm; JSON support; RBAC audit tables; direct connection via asyncpg for NL-to-SQL agent Metadata, audit trails, RBAC, sql_query tool
PostgreSQL Driver asyncpg 0.29+ Native async PostgreSQL driver; used by sql_query tool for direct parameterized query execution NL-to-SQL agent (DATABASE_URL)
Cache Layer Redis (self-hosted or Upstash) Sub-millisecond reads; sorted sets for leaderboards; pub/sub for real-time notifications Session state, caching
Knowledge Graph DB Neo4j (self-hosted) Native graph traversal; Cypher query language; MERGE for idempotent upserts; indexes on chunk_id, doc_id, team_id Entity/relationship graph, multi-hop traversal for context augmentation
Document Storage S3-compatible (AWS S3 or MinIO) Durable long-term storage for PDFs, uploaded files, exports PDFs, user uploads
Time-Series DB (Phase 2) TimescaleDB (PostgreSQL extension) Query volume trends, latency metrics, user activity patterns; built on PostgreSQL Analytics time-series

Integrations & Connectors

Component Technology Rationale Purpose
Notion notion-sdk-py Official SDK; handles pagination, auth refresh Notion workspace sync
Confluence atlassian-python-api Community-maintained; REST v2 support Confluence space sync
GitHub PyGithub Community standard; handles GraphQL fallback; rate limit management GitHub repo + PR sync
Slack slack-sdk Official SDK; event subscriptions, bolt framework for webhooks Slack DM queries, alerts
Jira atlassian-python-api Same maintainer as Confluence; REST v3 Jira issue sync
External Docs Firecrawl (primary) + BeautifulSoup4 (fallback) Firecrawl handles JS-rendered SPAs; BS4 fallback for static sites Live doc fetching
Web Search Tavily API Specialized for RAG; better results than Google for documentation Web search fallback

Development & Testing

Component Technology Rationale Purpose
Testing Framework pytest + pytest-asyncio De-facto standard; strong fixture support; async test support Unit + integration tests
Test Coverage Coverage.py Track line and branch coverage; integrates with pytest Coverage reporting
Linting Ruff Ultra-fast (Rust-based); replaces flake8 + isort Code quality
Type Checking pyright Strict mode by default; catches more errors than mypy Type safety
Code Formatting Black Deterministic; removes style debates Code style

Frontend Stack

Core Framework & Build

Component Technology Rationale Version
Framework React 18 + TypeScript Component-driven; strong typing; ecosystem; learning curve justified by team size 18.2+
Build Tool Vite 10x faster dev startup than webpack; near-instant HMR; minimal config 5.0+
Type Safety TypeScript 5 End-to-end type safety; integrates with Vite; strict mode enforced 5.3+
Routing TanStack Router Fully typed routing; nested routes; TypeScript-first; smaller bundle than React Router v6 1.28+
HTTP Client TanStack Query (React Query) Server state management; automatic caching/invalidation; background refetching 5.28+
Client State Zustand Lightweight; no boilerplate; TypeScript-first; DevTools support 4.4+
Forms React Hook Form + Zod Minimal re-renders; schema-driven validation; TypeScript support; integrates with shadcn 7.0+ / 3.22+

UI Components & Styling

Component Technology Rationale Purpose
Component Library shadcn/ui (Radix primitives) Unstyled accessibility-first primitives; copy-paste model (full control); Tailwind-based UI components
Styling Tailwind CSS 3 Utility-first; design tokens system; dark mode support; minimal CSS bloat Styling system
Design Tokens Tailwind config + CSS variables Terracotta/White primary palette; dark mode support; custom spacing/colors Design consistency
Icons Lucide React 1,500+ modern icons; tree-shakeable; TypeScript support Icon set
Animations Framer Motion (opt-in) Smooth transitions where needed; not required for MVP; zero-config for basics Micro-interactions

Data Visualization & Tables

Component Technology Rationale Purpose
Charts Recharts React-first; responsive; TypeScript support; integrates well with Tailwind Query analytics graphs
Data Tables TanStack React Table (headless) Headless table library; sorting/filtering/pagination built-in; zero styling opinions; composable Query history, data tables
Real-Time Graph Force-Graph 2D (canvas-based) WebGL rendering; 100k+ nodes performant; knowledge graph visualization Knowledge graph display
Time-Series Recharts (LineChart) Sufficient for phase 1; can swap to Apache ECharts if needed Query volume trends

Authentication & Security

Component Technology Rationale Purpose
Auth Flow JWT + refresh tokens (backend-issued) Stateless; supports mobile/SPA; backend validates on each request User authentication
Storage httpOnly cookies (primary) + sessionStorage fallback Protects against XSS (httpOnly); CORS-aware Token storage
RBAC Enforcement Client-side + backend validation Client-side for UX (show/hide features); backend for security (always validate) Role-based access
HTTPS TLS 1.3+ Enforced in production; self-signed in dev (certbot for self-hosted) Transport security

Real-Time Communication

Component Technology Rationale Purpose
SSE (Server-Sent Events) Native EventSource / fetch with text/event-stream Used for query streaming: plan_ready, agent_started, agent_done, answer_chunk, done, error β€” each event arrives as tokens stream from the LLM Query result streaming from POST /agent/query
WebSocket Client Native WebSocket API (no Socket.io for MVP) Two uses: (1) knowledge graph visualization β€” streams nodes/edges from WS /graph/stream as Neo4j traversal results arrive; (2) real-time system notifications Graph visualization + alerts
Message Format JSON over SSE/WebSocket SSE events: data: {...}\n\n; WebSocket: {event, id, label, name} / {event, from, to, rel} Real-time updates
Reconnection Exponential backoff (client-side) Handles network hiccups; prevents thundering herd on backend Connection resilience

Build & Development

Component Technology Rationale Purpose
Package Manager pnpm Faster than npm; lower disk usage; monorepo-friendly Dependency management
Linting ESLint + @typescript-eslint Enforce code quality; catch common mistakes; TypeScript-aware Code quality
Formatting Prettier Opinionated; removes style debates; integrates with ESLint Code formatting
Testing Vitest + React Testing Library Vitest is Vite-native (instant); RTL tests real behavior, not implementation Unit + integration tests
E2E Testing Playwright No JavaScript setup required; cross-browser; fast; integrates with CI End-to-end tests
Environment Management .env.local + .env.example Vite built-in support; clear documentation of required vars Config management

Deployment

Component Technology Rationale Purpose
Build Output Static SPA (dist/) No SSR needed; served by CDN or simple HTTP server; cacheable assets Production build
Hosting (Self-Hosted) Nginx reverse proxy + Docker Minimal footprint; serves static assets + proxies /api to backend Self-hosted deployment
Hosting (Cloud) Vercel or Netlify (optional) One-command deploy; auto-staging; analytics; functions for edge logic Cloud deployment
Docker Node.js 20-alpine (builder) + nginx:alpine (runtime) Multi-stage build; minimal final image size; nginx for efficient static serving Containerized deployment

Infrastructure & DevOps

Containerization

Component Technology Rationale Purpose
Container Runtime Docker Industry standard; Dockerfile for reproducible builds Containerization
Orchestration Kubernetes (optional for scale) Docker Compose for dev/single-server; K8s for multi-region scale Container orchestration
Image Registry Docker Hub or private ECR Public for OSS; private for enterprise images Image storage

CI/CD Pipeline

Component Technology Rationale Purpose
CI Platform GitHub Actions Free for public repos; integrated with GitHub; sufficient for MVP Automation
Stages Lint β†’ Test β†’ Build β†’ Deploy Standard pipeline; catches issues early; reproducible builds Code quality gates
Artifact Storage GitHub Artifacts (CI) / Docker Hub (images) Temporary CI artifacts; persistent Docker image registry Build artifacts

Monitoring & Observability (Phase 2)

Component Technology Rationale Purpose
Logs Structured JSON logs (backend) + browser console (frontend) ELK stack optional; JSON enables machine parsing Log aggregation
Metrics Prometheus-compatible endpoint (backend) Counterpart Grafana dashboards; industry standard Metrics collection
Tracing OpenTelemetry (optional) Distributed tracing for multi-service requests Request tracing
Alerts Prometheus AlertManager + PagerDuty (optional) Route critical alerts to on-call Alert routing

Data Storage & Retrieval

Vector Embeddings Pipeline

Stage Technology Input Output
Encode Dense BGE-M3 (BAAI) Text chunks (up to 8,192 tokens) 384-dim dense vectors
Encode Sparse BGE-M3 built-in Text chunks Sparse BM25-like vectors
Index Dense Qdrant HNSW 384-dim vectors HNSW graph (fast ANN search)
Index Sparse Qdrant sparse index Sparse vectors Inverted index
Fuse Rankings RRF (Reciprocal Rank Fusion) Top 50 dense + Top 50 sparse Top 50 merged by score
Rerank BGE-reranker-v2-m3 Top 50 fused results + query Top 5 re-ranked
Compress Context LLMContextCompress (custom) Top 5 chunks (2–3 pages) Summarised, deduplicated context

Similarity Metrics

Query Type Metric Notes
Semantic search (dense) Cosine similarity Default; works well for queries with paraphrasing
Keyword search (sparse) BM25-equivalent Catches exact phrase matches
Hybrid (both) RRF Merges dense + sparse scores deterministically

ML / NLP Components

Transformer Models (Hosted Locally or via API)

Model Purpose Provider Size Cost
BGE-M3 Dense + Sparse embedding Hugging Face (self-host) 568M Free
BGE-reranker-v2-m3 Re-ranking Hugging Face (self-host) 1.2B Free
GLiNER (base) PII detection Hugging Face (self-host) 333M Free
Gemini 2.5 Pro Planner + Synthesiser LLM Google AI API (GOOGLE_API_KEY) Hosted Pay-per-token
Gemini 2.5 Flash Summariser + Guardrail + Graph extraction Google AI API (GOOGLE_API_KEY) Hosted Pay-per-token (cheaper)
Mistral-7B Air-gapped fallback Ollama (self-host) 7B Free (inference)

NLP Pipelines (Python)

Component Library Purpose
Tokenization spaCy v3 Sentence tokenization, POS tagging for chunking logic
Text Preprocessing NLTK / spaCy Stop word removal, lemmatization for sparse indexing
Named Entity Recognition GLiNER + spaCy PII detection (GLiNER), domain entities (spaCy)
Semantic Similarity BGE-M3 locally Chunk-to-chunk similarity for deduplication

Deployment Targets

Target 1: Self-Hosted (Single Server)

Stack:

  • PostgreSQL 15 (primary DB)
  • Redis (cache + queues)
  • Qdrant (vector DB)
  • FastAPI backend (single process + Gunicorn)
  • React SPA frontend (Nginx)
  • Ollama (optional, for air-gapped)

Deployment:

docker-compose up -d  # Prod-grade docker-compose.yml with all services

Hardware: 8 vCPU, 32GB RAM, 500GB SSD (scales to 4TB for large deployments)

Target 2: Kubernetes Multi-Region

Services:

  • PostgreSQL (managed RDS / Cloud SQL) + read replicas
  • Redis Cluster (Upstash / ElastiCache)
  • Qdrant cluster (Helm chart, S3 backup)
  • FastAPI service (HPA auto-scaling)
  • React frontend (CDN + load balancer)

Deployment: Helm charts (phase 2)

Target 3: Hybrid Air-Gapped (No External APIs)

Fallbacks:

  • Ollama (Mistral-7B) instead of Gemini API
  • Local embeddings (BGE-M3 on CPU or GPU)
  • Self-hosted Firecrawl instance (optional)
  • No external search (Tavily fallback disabled)

Data Compliance: GDPR/HIPAA/FedRAMP ready


Summary: MVP Tech Stack Checklist

  • Backend: FastAPI + PostgreSQL + Redis + Qdrant
  • LLM: Gemini 2.5 Pro (planner + synthesiser) + Gemini 2.5 Flash (summariser + guardrail + graph extraction)
  • Frontend: React 18 + TypeScript + Vite + shadcn/ui
  • Ingestion: Docling + GLiNER + semantic chunking
  • Retrieval: BGE-M3 + RRF + BGE-reranker-v2-m3
  • NL-to-SQL: sql_query agent (Gemini Flash β†’ validated SELECT β†’ asyncpg)
  • Validation: Generator + Critic (LangGraph)
  • Real-Time: SSE (query streaming) + WebSocket (graph visualization + notifications) + Redis pub/sub
  • Knowledge Graph: Neo4j + Gemini 2.5 Pro extraction (graph_store/)
  • Auth: JWT + httpOnly cookies
  • Deployment: Docker + docker-compose (MVP) + Kubernetes (scale)

Phase 2 Additions

  • Monitoring: Prometheus + Grafana
  • Tracing: OpenTelemetry + Jaeger
  • Advanced Auth: OAuth2 + SSO (OIDC)
  • Caching Strategy: Edge caching via CDN
  • Rate Limiting: Redis-based token bucket
  • Webhook Signing: HMAC-SHA256 (Slack, GitHub, Jira) β€” already implemented in src/integrations/webhooks.py