joseph njoroge kariuki

Senti AI - Architecture

Overview

Senti AI is a Kenya-focused financial intelligence platform that combines deterministic Rust computation with AI-powered language understanding. It answers financial questions, performs tax calculations, manages budgets, and provides financial advice, all in English and Swahili.

Core principle: Financial math is never delegated to the LLM. All calculations go through a deterministic Rust engine. The LLM is used only for language understanding and response formatting.

Architecture Diagram

┌─────────────┐    ┌─────────────┐    ┌─────────────────────────────┐
│  Frontend   │───▶│  FastAPI    │───▶│  Pipeline Router            │
│  (React)    │    │  Backend    │    │  (core/pipeline/router.py)  │
└─────────────┘    └─────────────┘    └──────────────┬──────────────┘
                                                     │
                               ┌─────────────────────┼─────────────────────┐
                               │                     │                     │
                        ┌──────▼──────┐       ┌──────▼──────┐       ┌──────▼──────┐
                        │  Safety     │       │  Intent     │       │  Tier       │
                        │  Check      │       │  Classify   │       │  Route      │
                        │  (empathy)  │       │ (determin.) │       │  (A/B/C)    │
                        └─────────────┘       └──────┬──────┘       └─────────────┘
                                                     │
                               ┌─────────────────────┼─────────────────────┐
                               │                     │                     │
                        ┌──────▼──────┐       ┌──────▼──────┐       ┌──────▼──────┐
                        │  CAG        │       │  MoE        │       │  KAG        │
                        │  (context   │       │  (expert    │       │  (knowledge │
                        │  packing)   │       │  routing)   │       │  graph)     │
                        └─────────────┘       └──────┬──────┘       └─────────────┘
                                                     │
                               ┌─────────────────────┼─────────────────────┐
                               │                     │                     │
                        ┌──────▼──────┐       ┌──────▼──────┐       ┌──────▼──────┐
                        │  Compute    │       │  Inference  │       │  RAG        │
                        │  (Rust      │       │  Engine     │       │  (vector    │
                        │  engine)    │       │  (3 levels) │       │  search)    │
                        └──────┬──────┘       └──────┬──────┘       └─────────────┘
                               │                     │
                               │              ┌──────▼──────┐
                               │              │  LLM Layer  │
                               │              │ (formatting)│
                               │              └──────┬──────┘
                               │                     │
                               └──────────┬──────────┘
                                     ┌────▼────┐
                                     │Validator│
                                     │(output  │
                                     │ check)  │
                                     └────┬────┘
                                          │
                                     ┌────▼────┐
                                     │  Audit  │
                                     │  Log    │
                                     └─────────┘

Module Map

core/ - Domain Logic

| Module | Path | Purpose |
| --- | --- | --- |
| Auth | `core/auth.py` | JWT authentication, password hashing |
| Intent Classifier | `core/brain/routing/intent_classifier.py` | Deterministic intent detection (rules + keywords) |
| MoE Router | `core/brain/moe/router.py` | Routes to specialist experts (Tax, Investment, Credit, Planning, Safety, Language) |
| CAG | `core/brain/memory/cag.py` | Context-Augmented Generation: warm context packing |
| RAG | `core/brain/retrieval/rag.py` | Retrieval-Augmented Generation: vector/keyword search |
| KAG | `core/brain/knowledge_graph/` | Knowledge graph reasoning (finance_kg) |
| Inference Engine | `core/brain/inference/engine.py` | 3-level reasoning depth (fast/standard/deep) |
| Safety | `core/brain/safety/empathy_interceptor.py` | Crisis/distress detection with helpline referral |
| Output Validator | `core/brain/llm/validator.py` | Verifies LLM output matches Rust computation |
| Memory Agent | `core/brain/memory/agent.py` | MemGPT-inspired user memory (core + archival) |
| Compute Engine | `core/pipeline/compute.py` | Bridge to Rust senti_calc formulas |
| Pipeline Router | `core/pipeline/router.py` | Main orchestrator (the "brain") |
| Formula Registry | `core/engines/formulas/registry.py` | Single source of truth for all finance math |
| Scheduler | `core/workflow/scheduler.py` | APScheduler-based recurring jobs |
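
To make "deterministic intent detection (rules + keywords)" concrete, here is a minimal sketch of a keyword-based classifier. The rule table, the `Intent` type, and the `classify` function are hypothetical illustrations; the actual rules in `intent_classifier.py` are not shown in this document. The point is only that the same query always yields the same intent, with no model call involved.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword rules (English and Swahili). The real rule set is an assumption.
RULES: dict[str, tuple[str, ...]] = {
    "tax": ("paye", "tax", "vat", "kodi"),
    "loan": ("loan", "emi", "mkopo"),
    "budget": ("budget", "spending", "bajeti"),
    "savings": ("savings", "save", "akiba"),
}


@dataclass(frozen=True)
class Intent:
    name: str
    matched: tuple[str, ...]


def classify(query: str) -> Intent:
    """Return the intent whose keywords match the most tokens in the query.

    Ties go to the rule listed first in RULES, so the result depends only
    on the input text and the rule table.
    """
    tokens = set(re.findall(r"[a-z]+", query.lower()))
    best = Intent("general", ())
    for name, keywords in RULES.items():
        hits = tuple(k for k in keywords if k in tokens)
        if len(hits) > len(best.matched):
            best = Intent(name, hits)
    return best
```

For example, `classify("How much PAYE tax do I pay?")` matches both `paye` and `tax` and returns the `tax` intent, while a query with no known keyword falls back to `general`.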

backend/ - API and Infrastructure

| Module | Path | Purpose |
| --- | --- | --- |
| API | `backend/api/main.py` | FastAPI app, routes, middleware |
| Config | `backend/config/settings.py` | Environment-driven configuration |
| Postgres | `backend/database/postgres/` | SQLAlchemy models, session management |
| Redis | `backend/database/redis/` | Cache, rate limiting, sessions |
| Qdrant | `backend/database/vector/` | Vector store for RAG |
| Hermes | `backend/api/hermes/` | Payment intelligence endpoints |
| Institutional | `backend/api/institutional/` | Enterprise/tenant API |

senti_calc/ - Rust Math Engine

PyO3 Rust extension for deterministic financial calculations. Compiles to a native Python extension module (.pyd on Windows, .so on Linux and macOS). Includes:

  • Kenya PAYE 2024 brackets
  • Turnover Tax (TOT)
  • VAT, NSSF, SHA, Housing Levy
  • Loan EMI (reducing balance)
  • NPV, IRR, CAGR, ROI
  • Monte Carlo simulation (10,000 scenarios in <1s)
  • Business valuation, working capital, breakeven
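
As a reference for what "deterministic" means here, the sketch below writes three of the listed formulas in Python using their standard textbook definitions. It is not the `senti_calc` source: the function names and signatures are assumptions, and the Rust engine may use a different numeric type (for example fixed-point decimals) and different rounding.

```python
def loan_emi(principal: float, annual_rate: float, months: int) -> float:
    """Equal monthly instalment on a reducing-balance loan.

    EMI = P * r * (1 + r)^n / ((1 + r)^n - 1), with r the monthly rate.
    """
    if months <= 0:
        raise ValueError("months must be positive")
    r = annual_rate / 12
    if r == 0:
        # Interest-free loan: repay the principal in equal parts.
        return principal / months
    growth = (1 + r) ** months
    return principal * r * growth / (growth - 1)


def npv(rate: float, cashflows: list[float]) -> float:
    """Net present value; cashflows[0] occurs at t = 0 and is not discounted."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))


def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1
```

A loan of 100,000 at 12% per year over 12 months gives `loan_emi(100_000, 0.12, 12)` of about 8,884.88 per month. The same inputs always produce the same output, which is the property the validator later relies on.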

frontend/ - React UI

Vite + React + TypeScript with Tailwind CSS. Features:

  • Chat-centric interface
  • 3-pane institutional dashboard
  • Context-aware right panel

Request Flow

  1. HTTP Request → FastAPI endpoint
  2. Auth → JWT validation via get_current_user
  3. Safety Check → EmpathyInterceptor (crisis/distress/illegal detection)
  4. Intent Classification → Deterministic rules-based classifier
  5. Tier Routing → A (fast), B (standard), C (RAG-augmented)
  6. CAG → Pack warm context (user profile, financial state)
  7. MoE → Select specialist expert(s)
  8. Compute → Dispatch to Rust engine for calculations
  9. Inference → Level 1/2/3 reasoning depth
  10. LLM → Format response (if USE_LM=true)
  11. Validator → Check numbers match, add disclaimers
  12. Audit → Log request/response to AuditLog
  13. Memory → Update core memory with new facts
  14. Response → Return to user
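
The ordering above can be sketched as a list of stage functions applied to a shared context, with the safety check able to short-circuit everything after it. This is an illustration only: `Context`, the stage functions, the trigger word, and the intent-to-tier mapping are all hypothetical and much simpler than whatever `core/pipeline/router.py` actually does. Only four of the fourteen steps are modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Context:
    query: str
    intent: Optional[str] = None
    tier: Optional[str] = None
    response: Optional[str] = None
    trace: list[str] = field(default_factory=list)  # stages that ran, in order


Stage = Callable[[Context], None]


def safety_check(ctx: Context) -> None:
    # Hypothetical trigger word; a real interceptor would be far more careful.
    if "hopeless" in ctx.query.lower():
        ctx.response = "It sounds like things are hard right now. Please reach out to a helpline."


def classify_intent(ctx: Context) -> None:
    ctx.intent = "tax" if "paye" in ctx.query.lower() else "general"


def route_tier(ctx: Context) -> None:
    # Assumed mapping: calculation intents go to tier B, everything else to tier A.
    ctx.tier = "B" if ctx.intent == "tax" else "A"


def respond(ctx: Context) -> None:
    ctx.response = f"[tier {ctx.tier}] handled intent '{ctx.intent}'"


STAGES: list[Stage] = [safety_check, classify_intent, route_tier, respond]


def run_pipeline(query: str) -> Context:
    """Run stages in order and stop as soon as one of them sets a response."""
    ctx = Context(query=query)
    for stage in STAGES:
        stage(ctx)
        ctx.trace.append(stage.__name__)
        if ctx.response is not None:
            break
    return ctx
```

With this shape, a distress message never reaches intent classification or compute: `run_pipeline("I feel hopeless about my debt").trace` contains only `safety_check`. Keeping the stages as plain functions in a list also makes the order easy to audit against the numbered steps.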

Data Flow

                    Financial Truth
                    ───────────────
User Query ──▶ Rust Engine ──▶ Numbers (deterministic)
                                    │
                                    ▼
                              LLM Formatting ──▶ Natural Language Response
                                    │
                              Validator Check ──▶ Numbers in response match Rust?

Rule: The LLM never invents financial numbers. All numbers must originate from the Rust engine.
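
One way to enforce this rule is to extract every number from the formatted response and confirm that each one matches a value the engine produced. The sketch below assumes that approach; the regular expression, the `extract_numbers` and `unverified_numbers` names, and the tolerance are hypothetical and are not taken from `core/brain/llm/validator.py`.

```python
import re

# Matches integers and decimals with optional thousands separators, e.g. "8,884.88".
NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def extract_numbers(text: str) -> list[float]:
    """Pull numeric literals out of free text, ignoring thousands separators."""
    return [float(match.replace(",", "")) for match in NUMBER.findall(text)]


def unverified_numbers(
    response: str, engine_values: list[float], tol: float = 0.005
) -> list[float]:
    """Return numbers in the response that no engine value accounts for.

    An empty list means every number in the text traces back to the engine.
    """
    return [
        n
        for n in extract_numbers(response)
        if not any(abs(n - v) <= tol for v in engine_values)
    ]
```

A response is accepted only when `unverified_numbers` returns an empty list. A check this simple also flags harmless numbers such as years or list indices, so a real validator would need an allow-list or more context about which figures are financial.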