Agentic-RagBot / docs /ARCHITECTURE.md
Nikhil Pravin Pise
docs: update all documentation to reflect current codebase state
aefac4f

RagBot System Architecture

Overview

RagBot is a Multi-Agent RAG (Retrieval-Augmented Generation) system for medical biomarker analysis. It combines large language models with a specialized medical knowledge base to provide evidence-based insights on patient biomarker readings.

System Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                     User Interfaces                          β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       β”‚
β”‚  β”‚  CLI Chat    β”‚  β”‚  REST API    β”‚  β”‚   Web UI     β”‚       β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
          β”‚                  β”‚                  β”‚
          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚
          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
          β”‚    Workflow Orchestrator            β”‚
          β”‚        (LangGraph)                  β”‚
          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                         β”‚
      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
      β”‚                  β”‚                  β”‚
      β–Ό                  β–Ό                  β–Ό
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚  Extraction β”‚  β”‚   Analysis   β”‚  β”‚  Knowledge   β”‚
  β”‚   Agent     β”‚  β”‚   Agents     β”‚  β”‚  Retrieval   β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
      β”‚                  β”‚                  β”‚
      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                         β”‚
          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
          β”‚    LLM Provider             β”‚
          β”‚    (Groq - LLaMA 3.3-70B)   β”‚
          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                         β”‚
          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
          β”‚    Medical Knowledge Base   β”‚
          β”‚    (FAISS Vector Store)     β”‚
          β”‚    (750 pages, 2,609 docs)  β”‚
          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Core Components

1. Biomarker Extraction & Validation (src/biomarker_validator.py, src/biomarker_normalization.py)

  • Parses user input for blood test results
  • Normalizes biomarker names via 80+ alias mappings to 24 canonical names
  • Validates values against established reference ranges (with clinically appropriate critical thresholds)
  • Generates safety alerts for critical values
  • Flags all out-of-range values (no suppression threshold)

2. Multi-Agent Workflow (src/workflow.py using LangGraph)

The system processes each patient case through 6 specialist agents:

Agent 1: Biomarker Analyzer

  • Validates each biomarker against reference ranges
  • Identifies out-of-range values
  • Generates immediate clinical alerts
  • Predicts disease relevance (baseline diagnostic)

Agent 2: Disease Explainer (RAG)

  • Retrieves medical literature on predicted disease
  • Explains pathophysiological mechanisms
  • Provides evidence-based disease context
  • Sources: medical PDFs (anemia, diabetes, heart disease, thrombocytopenia)

Agent 3: Biomarker-Disease Linker (RAG)

  • Maps patient biomarkers to disease indicators
  • Identifies key drivers of the predicted condition
  • Retrieves lab-specific guidelines
  • Explains biomarker significance in disease context

Agent 4: Clinical Guidelines Agent (RAG)

  • Retrieves evidence-based clinical guidelines
  • Provides immediate recommendations
  • Suggests monitoring parameters
  • Offers lifestyle and medication guidance

Agent 5: Confidence Assessor

  • Evaluates prediction reliability
  • Assesses evidence strength
  • Identifies limitations in analysis
  • Provides confidence score with reasoning

Agent 6: Response Synthesizer

  • Consolidates findings from all agents
  • Generates comprehensive patient summary
  • Produces actionable recommendations
  • Creates structured final report

3. Knowledge Base (src/pdf_processor.py)

  • Source: 8 medical PDF documents (750 pages total)
  • Storage: FAISS vector database (2,609 document chunks)
  • Embeddings: Google Gemini (default, free) or HuggingFace sentence-transformers (local, offline)
  • Format: Chunked with 1000 char overlap for context preservation

4. LLM Configuration (src/llm_config.py)

  • Primary LLM: Groq LLaMA 3.3-70B (fast, free)
  • Alternative LLM: Google Gemini 2.0 Flash (free)
  • Local LLM: Ollama (for offline use)
    • Fast inference (~1-2 sec per agent output)
    • Free API tier available
    • No rate limiting for reasonable usage
  • Embedding Model: HuggingFace sentence-transformers/all-MiniLM-L6-v2
    • 384-dimensional embeddings
    • Fast similarity search
    • Runs locally (no API dependency)

Data Flow

User Input
    ↓
[Extraction] β†’ Normalized Biomarkers
    ↓
[Prediction] β†’ Disease Hypothesis (85% confidence)
    ↓
[RAG Retrieval] β†’ Medical Literature (5-10 relevant docs)
    ↓
[Analysis] β†’ All 6 Agents Process in Parallel
    ↓
[Synthesis] β†’ Comprehensive Report
    ↓
[Output] β†’ Recommendations + Safety Alerts + Evidence

Key Design Decisions

  1. Cloud Embeddings: Google Gemini embeddings (free tier) with HuggingFace fallback for offline use
  2. Groq LLM: Free, fast inference for real-time interaction
  3. Multiple Providers: Support for Groq, Google Gemini, and Ollama (local)
  4. LangGraph: Manages complex multi-agent workflows with state management
  5. FAISS: Efficient similarity search on large medical document collection
  6. Modular Agents: Each agent has clear responsibility, enabling parallel execution
  7. RAG Integration: Medical knowledge grounds responses in evidence
  8. Biomarker Normalization: 80+ aliases ensure robust input handling

Technologies Used

Component Technology Purpose
Orchestration LangGraph Workflow management
LLM Groq API / Google Gemini Fast inference
Embeddings Google Gemini / HuggingFace Vector representations
Vector DB FAISS Similarity search
Data Validation Pydantic V2 Type safety & schemas
REST API FastAPI Web interface

Performance Characteristics

  • Response Time: 15-25 seconds (6 agents + RAG retrieval)
  • Knowledge Base Size: 750 pages, 2,609 chunks
  • Embedding Dimensions: 384
  • Inference Cost: Free (local embeddings + Groq free tier)
  • Scalability: Easily extends to more medical domains

Extensibility

Adding New Biomarkers

  1. Update config/biomarker_references.json with reference ranges
  2. Add aliases to src/biomarker_normalization.py (NORMALIZATION_MAP)
  3. Medical guidelines automatically handle via RAG

Adding New Medical Domains

  1. Add PDF documents to data/medical_pdfs/
  2. Run python scripts/setup_embeddings.py
  3. Vector store rebuilds automatically
  4. Agents inherit new knowledge through RAG

Custom Analysis Rules

  1. Create new agent in src/agents/
  2. Register in workflow graph (src/workflow.py)
  3. Insert into processing pipeline

Security & Privacy

  • All processing runs locally
  • No personal data sent to APIs (except LLM inference)
  • Vector store derived from public medical PDFs
  • Embeddings computed locally or cached
  • Can operate completely offline after setup

For setup instructions, see QUICKSTART.md For API documentation, see API.md For development guide, see DEVELOPMENT.md