# Senti AI: Architecture
## Overview
Senti AI is a **Kenya-focused financial intelligence platform** that combines deterministic Rust computation with AI-powered language understanding. It answers financial questions, performs tax calculations, manages budgets, and provides financial advice, all in English and Swahili.
**Core principle**: Financial math is never delegated to the LLM. All calculations go through a deterministic Rust engine. The LLM is used only for language understanding and response formatting.
## Architecture Diagram
```
┌─────────────┐    ┌─────────────┐    ┌─────────────────────────────┐
│  Frontend   │───▶│   FastAPI   │───▶│       Pipeline Router       │
│   (React)   │    │   Backend   │    │  (core/pipeline/router.py)  │
└─────────────┘    └─────────────┘    └──────────┬──────────────────┘
                                                 │
                         ┌───────────────────────┼───────────────────────┐
                         │                       │                       │
                   ┌─────▼─────┐          ┌──────▼──────┐          ┌─────▼─────┐
                   │  Safety   │          │   Intent    │          │   Tier    │
                   │   Check   │          │  Classify   │          │   Route   │
                   │ (empathy) │          │ (determin.) │          │  (A/B/C)  │
                   └───────────┘          └──────┬──────┘          └───────────┘
                                                 │
                   ┌─────────────────────────────┼─────────────────────────┐
                   │                             │                         │
             ┌─────▼─────┐               ┌───────▼───────┐          ┌──────▼──────┐
             │    CAG    │               │      MoE      │          │     KAG     │
             │ (context  │               │    (expert    │          │ (knowledge  │
             │  packing) │               │    routing)   │          │   graph)    │
             └───────────┘               └───────┬───────┘          └─────────────┘
                                                 │
                    ┌────────────────────────────┼────────────────────────┐
                    │                            │                        │
              ┌─────▼─────┐               ┌──────▼──────┐          ┌──────▼──────┐
              │  Compute  │               │  Inference  │          │     RAG     │
              │   (Rust   │               │   Engine    │          │   (vector   │
              │  engine)  │               │ (3 levels)  │          │   search)   │
              └─────┬─────┘               └──────┬──────┘          └─────────────┘
                    │                            │
                    │                    ┌───────▼───────┐
                    │                    │   LLM Layer   │
                    │                    │ (formatting)  │
                    │                    └───────┬───────┘
                    │                            │
                    └────────────┬───────────────┘
                            ┌────▼────┐
                            │Validator│
                            │(output  │
                            │ check)  │
                            └────┬────┘
                                 │
                            ┌────▼────┐
                            │  Audit  │
                            │   Log   │
                            └─────────┘
```
## Module Map
### `core/`: Domain Logic
| Module | Path | Purpose |
|---|---|---|
| **Auth** | `core/auth.py` | JWT authentication, password hashing |
| **Intent Classifier** | `core/brain/routing/intent_classifier.py` | Deterministic intent detection (rules + keywords) |
| **MoE Router** | `core/brain/moe/router.py` | Routes to specialist experts (Tax, Investment, Credit, Planning, Safety, Language) |
| **CAG** | `core/brain/memory/cag.py` | Context-Augmented Generation: warm context packing |
| **RAG** | `core/brain/retrieval/rag.py` | Retrieval-Augmented Generation: vector/keyword search |
| **KAG** | `core/brain/knowledge_graph/` | Knowledge graph reasoning (`finance_kg`) |
| **Inference Engine** | `core/brain/inference/engine.py` | 3-level reasoning depth (fast/standard/deep) |
| **Safety** | `core/brain/safety/empathy_interceptor.py` | Crisis/distress detection with helpline referral |
| **Output Validator** | `core/brain/llm/validator.py` | Verifies LLM output matches Rust computation |
| **Memory Agent** | `core/brain/memory/agent.py` | MemGPT-inspired user memory (core + archival) |
| **Compute Engine** | `core/pipeline/compute.py` | Bridge to Rust `senti_calc` formulas |
| **Pipeline Router** | `core/pipeline/router.py` | Main orchestrator (the "brain") |
| **Formula Registry** | `core/engines/formulas/registry.py` | Single source of truth for all finance math |
| **Scheduler** | `core/workflow/scheduler.py` | APScheduler-based recurring jobs |
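
To make "deterministic intent detection (rules + keywords)" concrete, the sketch below shows one way such a classifier can be written. The intent names, keyword lists, and the `classify_intent` function are hypothetical illustrations, not the contents of `intent_classifier.py`.

```python
import re

# Hypothetical keyword rules, checked in order; the first match wins.
# Swahili keywords (kodi, mkopo, bajeti) sit next to the English ones.
INTENT_RULES = [
    ("tax_calculation", ("paye", "vat", "turnover tax", "kodi")),
    ("loan_calculation", ("loan", "emi", "mkopo")),
    ("budgeting", ("budget", "bajeti")),
]


def classify_intent(query: str) -> str:
    """Return the first intent whose keyword appears as a whole word or phrase."""
    text = query.lower()
    for intent, keywords in INTENT_RULES:
        for keyword in keywords:
            # Word boundaries stop "emi" from matching inside "premium".
            if re.search(rf"\b{re.escape(keyword)}\b", text):
                return intent
    return "general_question"
```

Because the rules are plain data evaluated in a fixed order, the same query always maps to the same intent, which is the property the table row describes.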
### `backend/`: API and Infrastructure
| Module | Path | Purpose |
|---|---|---|
| **API** | `backend/api/main.py` | FastAPI app, routes, middleware |
| **Config** | `backend/config/settings.py` | Environment-driven configuration |
| **Postgres** | `backend/database/postgres/` | SQLAlchemy models, session management |
| **Redis** | `backend/database/redis/` | Cache, rate limiting, sessions |
| **Qdrant** | `backend/database/vector/` | Vector store for RAG |
| **Hermes** | `backend/api/hermes/` | Payment intelligence endpoints |
| **Institutional** | `backend/api/institutional/` | Enterprise/tenant API |
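
As a rough illustration of environment-driven configuration, the sketch below reads a boolean flag from the process environment. `USE_LM` is the flag named in the request flow; the `env_flag` helper, the `Settings` class, and `load_settings` are assumptions made for this example and may not match `backend/config/settings.py`.

```python
import os
from dataclasses import dataclass

TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment, falling back to a default."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object holding only the USE_LM flag."""
    use_lm: bool


def load_settings() -> Settings:
    # Assumption: LLM formatting is off unless USE_LM is set explicitly.
    return Settings(use_lm=env_flag("USE_LM"))
```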
### `senti_calc/`: Rust Math Engine
A Rust extension built with PyO3 for deterministic financial calculations. It compiles to a native Python extension module (`.pyd` on Windows, `.so` on Linux and macOS). It includes:
- Kenya PAYE 2024 brackets
- Turnover Tax (TOT)
- VAT, NSSF, SHA, Housing Levy
- Loan EMI (reducing balance)
- NPV, IRR, CAGR, ROI
- Monte Carlo simulation (10,000 scenarios in <1s)
- Business valuation, working capital, breakeven
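
For readers unfamiliar with the formulas in this list, here is a minimal Python sketch of three of them using their standard textbook definitions. It is not the `senti_calc` implementation: the function names and signatures are assumptions, and floats are used only to keep the example short.

```python
def loan_emi(principal: float, annual_rate: float, months: int) -> float:
    """Equal monthly instalment on a reducing-balance loan."""
    if months <= 0:
        raise ValueError("months must be positive")
    r = annual_rate / 12  # monthly rate as a fraction
    if r == 0:
        return principal / months
    growth = (1 + r) ** months
    return principal * r * growth / (growth - 1)


def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value; cash_flows[0] occurs today (t = 0)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction."""
    return (end_value / start_value) ** (1 / years) - 1


# A 100,000 loan at 12% a year over 12 months costs about 8,884.88 a month.
monthly_payment = round(loan_emi(100_000, 0.12, 12), 2)
```

Each function is pure: the same inputs always produce the same output, which is what lets a downstream check compare response text against the computed values.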
### `frontend/`: React UI
Vite + React + TypeScript with Tailwind CSS. Features:
- Chat-centric interface
- 3-pane institutional dashboard
- Context-aware right panel
## Request Flow
1. **HTTP Request** → FastAPI endpoint
2. **Auth** → JWT validation via `get_current_user`
3. **Safety Check** → `EmpathyInterceptor` (crisis/distress/illegal detection)
4. **Intent Classification** → Deterministic rules-based classifier
5. **Tier Routing** → A (fast), B (standard), C (RAG-augmented)
6. **CAG** → Pack warm context (user profile, financial state)
7. **MoE** → Select specialist expert(s)
8. **Compute** → Dispatch to Rust engine for calculations
9. **Inference** → Level 1/2/3 reasoning depth
10. **LLM** → Format response (if `USE_LM=true`)
11. **Validator** → Check numbers match, add disclaimers
12. **Audit** → Log request/response to `AuditLog`
13. **Memory** → Update core memory with new facts
14. **Response** → Return to user
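
The steps above can be read as an ordered list of stages that share one context object. The sketch below models a few of them under stated assumptions: every name in it (`RequestContext`, the stage functions, `handle`) is hypothetical, the computed value is a stand-in for a Rust engine result, and both the early exit after the safety check and the unconditional audit step are design guesses, not confirmed behaviour of `core/pipeline/router.py`.

```python
from dataclasses import dataclass, field


@dataclass
class RequestContext:
    """Mutable state handed from stage to stage (hypothetical shape)."""
    query: str
    intent: str = ""
    numbers: dict[str, float] = field(default_factory=dict)
    response: str = ""
    halted: bool = False
    trace: list[str] = field(default_factory=list)


def safety_check(ctx: RequestContext) -> None:
    ctx.trace.append("safety")
    if "hopeless" in ctx.query.lower():  # placeholder distress rule
        ctx.response = "That sounds really hard. Please consider calling a helpline."
        ctx.halted = True  # assumption: distress short-circuits the pipeline


def classify(ctx: RequestContext) -> None:
    ctx.trace.append("intent")
    ctx.intent = "vat" if "vat" in ctx.query.lower() else "general"


def compute(ctx: RequestContext) -> None:
    ctx.trace.append("compute")
    if ctx.intent == "vat":
        ctx.numbers["vat"] = 160.0  # stand-in for a value returned by the Rust engine


def format_response(ctx: RequestContext) -> None:
    ctx.trace.append("format")
    parts = ", ".join(f"{name} = {value:,.2f}" for name, value in ctx.numbers.items())
    ctx.response = f"Result: {parts}" if parts else "No calculation was needed."


def audit(ctx: RequestContext) -> None:
    ctx.trace.append("audit")  # a real system would persist the request and response


PIPELINE = [safety_check, classify, compute, format_response]


def handle(query: str) -> RequestContext:
    ctx = RequestContext(query=query)
    for stage in PIPELINE:
        stage(ctx)
        if ctx.halted:
            break
    audit(ctx)  # assumption: every request is audited, even halted ones
    return ctx
```

Keeping the stages in a plain list makes the order explicit and easy to test: `handle("What is the VAT on 1000?").trace` records which stages ran and in what sequence.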
## Data Flow
```
Financial Truth
──────────────
User Query ──▶ Rust Engine ──▶ Numbers (deterministic)
                                  │
                                  ▼
                           LLM Formatting ──▶ Natural Language Response
                                  │
                           Validator Check ──▶ Numbers in response match Rust?
```
**Rule**: The LLM never invents financial numbers. All numbers must originate from the Rust engine.
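
One simple way to enforce this rule is to extract every number from the LLM's text and require each to equal a value the engine produced. The sketch below does that with a regular expression. The function names, the tolerance, and the sample figures are illustrative assumptions; the real `core/brain/llm/validator.py` may work differently.

```python
import re

# Matches figures such as 1200, 12,345.67 or 0.16 (digits, optional commas and decimals).
NUMBER_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")


def extract_numbers(text: str) -> list[float]:
    """Pull every numeric literal out of free text, ignoring thousands separators."""
    return [float(match.replace(",", "")) for match in NUMBER_RE.findall(text)]


def numbers_match(response: str, computed: dict[str, float], tol: float = 0.005) -> bool:
    """True if every number in the response equals some computed value within tol."""
    allowed = list(computed.values())
    return all(
        any(abs(number - value) <= tol for value in allowed)
        for number in extract_numbers(response)
    )


# Illustrative figures only, not real engine output.
computed = {"monthly_payment": 8884.88, "total_repaid": 106618.56}
ok = numbers_match("You repay KES 8,884.88 a month, 106,618.56 in total.", computed)
bad = numbers_match("You repay about KES 9,000 a month.", computed)
```

A check this strict also rejects harmless numbers such as years or percentages that the engine never returned, so a practical validator would need an allow-list or a narrower pattern for those.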