# RagBot System Architecture
## Overview
RagBot is a Multi-Agent RAG (Retrieval-Augmented Generation) system for medical biomarker analysis. It combines large language models with a specialized medical knowledge base to provide evidence-based insights on patient biomarker readings.
## System Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                       User Interfaces                       │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│  │   CLI Chat   │    │   REST API   │    │    Web UI    │   │
│  └──────┬───────┘    └──────┬───────┘    └──────┬───────┘   │
└─────────┼───────────────────┼───────────────────┼───────────┘
          │                   │                   │
          └───────────────────┼───────────────────┘
                              │
           ┌──────────────────▼──────────────────┐
           │        Workflow Orchestrator        │
           │             (LangGraph)             │
           └──────────────────┬──────────────────┘
                              │
          ┌───────────────────┼───────────────────┐
          │                   │                   │
          ▼                   ▼                   ▼
   ┌─────────────┐     ┌──────────────┐    ┌──────────────┐
   │ Extraction  │     │   Analysis   │    │  Knowledge   │
   │    Agent    │     │    Agents    │    │  Retrieval   │
   └─────────────┘     └──────────────┘    └──────────────┘
          │                   │                   │
          └───────────────────┼───────────────────┘
                              │
               ┌──────────────▼──────────────┐
               │        LLM Provider         │
               │   (Groq - LLaMA 3.3-70B)    │
               └──────────────┬──────────────┘
                              │
               ┌──────────────▼──────────────┐
               │   Medical Knowledge Base    │
               │    (FAISS Vector Store)     │
               │   (750 pages, 2,609 docs)   │
               └─────────────────────────────┘
```
## Core Components
### 1. **Biomarker Extraction & Validation** (`src/biomarker_validator.py`, `src/biomarker_normalization.py`)
- Parses user input for blood test results
- Normalizes biomarker names via 80+ alias mappings to 24 canonical names
- Validates values against established reference ranges (with clinically appropriate critical thresholds)
- Generates safety alerts for critical values
- Flags all out-of-range values (no suppression threshold)
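
The normalization and validation steps above can be sketched as follows. This is a minimal illustration, not the code in `src/biomarker_normalization.py` or `src/biomarker_validator.py`: the alias table, the range values, and the names `ALIASES`, `REFERENCE_RANGES`, `Flag`, `normalize_name`, and `validate` are all assumptions made for the example, and the numeric ranges are placeholders rather than the project's reference data.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical alias table; the real mapping covers 80+ aliases.
ALIASES = {
    "hb": "Hemoglobin",
    "hgb": "Hemoglobin",
    "hemoglobin": "Hemoglobin",
    "glucose": "Glucose",
    "fasting glucose": "Glucose",
    "blood sugar": "Glucose",
}

# Placeholder ranges for illustration only:
# canonical name -> (low, high, critical_low, critical_high)
REFERENCE_RANGES = {
    "Hemoglobin": (12.0, 17.5, 7.0, 20.0),
    "Glucose": (70.0, 100.0, 50.0, 400.0),
}


@dataclass
class Flag:
    name: str    # canonical biomarker name
    value: float
    status: str  # NORMAL, LOW, HIGH, CRITICAL_LOW, or CRITICAL_HIGH


def normalize_name(raw: str) -> Optional[str]:
    """Map a user-supplied name to its canonical form, or None if unknown."""
    return ALIASES.get(raw.strip().lower())


def validate(raw_name: str, value: float) -> Optional[Flag]:
    """Classify a reading against its reference and critical thresholds."""
    name = normalize_name(raw_name)
    if name is None:
        return None
    low, high, critical_low, critical_high = REFERENCE_RANGES[name]
    if value <= critical_low:
        status = "CRITICAL_LOW"
    elif value < low:
        status = "LOW"
    elif value >= critical_high:
        status = "CRITICAL_HIGH"
    elif value > high:
        status = "HIGH"
    else:
        status = "NORMAL"
    return Flag(name, value, status)
```

In this sketch every non-`NORMAL` status is reported, matching the "no suppression threshold" behaviour, and the `CRITICAL_*` statuses are the ones that would trigger a safety alert.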
### 2. **Multi-Agent Workflow** (`src/workflow.py` using LangGraph)
The system processes each patient case through 6 specialist agents:
#### Agent 1: Biomarker Analyzer
- Validates each biomarker against reference ranges
- Identifies out-of-range values
- Generates immediate clinical alerts
- Predicts disease relevance (baseline diagnostic)
#### Agent 2: Disease Explainer (RAG)
- Retrieves medical literature on predicted disease
- Explains pathophysiological mechanisms
- Provides evidence-based disease context
- Sources: medical PDFs (anemia, diabetes, heart disease, thrombocytopenia)
#### Agent 3: Biomarker-Disease Linker (RAG)
- Maps patient biomarkers to disease indicators
- Identifies key drivers of the predicted condition
- Retrieves lab-specific guidelines
- Explains biomarker significance in disease context
#### Agent 4: Clinical Guidelines Agent (RAG)
- Retrieves evidence-based clinical guidelines
- Provides immediate recommendations
- Suggests monitoring parameters
- Offers lifestyle and medication guidance
#### Agent 5: Confidence Assessor
- Evaluates prediction reliability
- Assesses evidence strength
- Identifies limitations in analysis
- Provides confidence score with reasoning
#### Agent 6: Response Synthesizer
- Consolidates findings from all agents
- Generates comprehensive patient summary
- Produces actionable recommendations
- Creates structured final report
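
The hand-off between the six agents can be sketched in plain Python. This is not the LangGraph graph defined in `src/workflow.py`; it is a simplified stand-in using a shared state dictionary. The execution order shown (analyzer first, the three RAG agents concurrently, then confidence assessment and synthesis) is an assumption, as are all function names, state keys, and the stubbed agent bodies.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Agent = Callable[[State], State]  # each agent returns a partial state update


def biomarker_analyzer(state: State) -> State:
    # Flag every value outside its (low, high) reference range.
    flagged = [
        name
        for name, (value, low, high) in state["biomarkers"].items()
        if not low <= value <= high
    ]
    return {"flagged": flagged}


def disease_explainer(state: State) -> State:
    return {"explanation": f"Background on {state['predicted_disease']}"}


def biomarker_linker(state: State) -> State:
    return {"key_drivers": list(state["flagged"])}


def clinical_guidelines(state: State) -> State:
    return {"recommendations": [f"Recheck {name}" for name in state["flagged"]]}


def confidence_assessor(state: State) -> State:
    # Toy score: fraction of RAG outputs that are non-empty.
    keys = ("explanation", "key_drivers", "recommendations")
    return {"confidence": sum(bool(state.get(k)) for k in keys) / len(keys)}


def response_synthesizer(state: State) -> State:
    report_keys = ("predicted_disease", "flagged", "explanation",
                   "key_drivers", "recommendations", "confidence")
    return {"report": {k: state[k] for k in report_keys}}


def run_pipeline(initial: State) -> State:
    state = dict(initial)
    state.update(biomarker_analyzer(state))

    rag_agents: List[Agent] = [disease_explainer, biomarker_linker, clinical_guidelines]
    snapshot = dict(state)  # the concurrent agents only read this copy
    with ThreadPoolExecutor(max_workers=len(rag_agents)) as pool:
        updates = list(pool.map(lambda agent: agent(snapshot), rag_agents))
    for update in updates:
        state.update(update)

    state.update(confidence_assessor(state))
    state.update(response_synthesizer(state))
    return state
```

Having each agent return only the keys it owns keeps the merge step trivial and makes it safe to run independent agents side by side.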
### 3. **Knowledge Base** (`src/pdf_processor.py`)
- **Source**: 8 medical PDF documents (750 pages total)
- **Storage**: FAISS vector database (2,609 document chunks)
- **Embeddings**: Google Gemini (default, free) or HuggingFace sentence-transformers (local, offline)
- **Format**: Chunked with 1000 char overlap for context preservation
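
Overlapping chunking can be illustrated with a short sketch. The function name `chunk_text` and its parameters are hypothetical, and the sizes are left to the caller because the actual settings live in `src/pdf_processor.py`; this only shows the sliding-window idea, not the project's splitter.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

Because consecutive chunks share their boundary text, a sentence that straddles a chunk boundary still appears intact in at least one chunk, provided it is shorter than the overlap.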
### 4. **LLM Configuration** (`src/llm_config.py`)
- **Primary LLM**: Groq LLaMA 3.3-70B (fast, free)
  - Fast inference (~1-2 sec per agent output)
  - Free API tier available
  - No rate limiting for reasonable usage
- **Alternative LLM**: Google Gemini 2.0 Flash (free)
- **Local LLM**: Ollama (for offline use)
- **Embedding Model** (local option): HuggingFace sentence-transformers/all-MiniLM-L6-v2
  - 384-dimensional embeddings
  - Fast similarity search
  - Runs locally (no API dependency)
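
One way to choose among the three LLM providers is a simple preference order with a local fallback. The sketch below is an assumption about how such a selection could work, not the logic in `src/llm_config.py`; the function name `select_provider` and the environment variable names `GROQ_API_KEY` and `GOOGLE_API_KEY` are illustrative.

```python
import os
from typing import Mapping, Optional

# Preference order follows the list above; the variable names are assumptions.
HOSTED_PROVIDERS = [
    ("groq", "GROQ_API_KEY"),
    ("gemini", "GOOGLE_API_KEY"),
]


def select_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first hosted provider with a key set, else the local one."""
    env = os.environ if env is None else env
    for name, key_var in HOSTED_PROVIDERS:
        if env.get(key_var):
            return name
    return "ollama"  # local fallback needs no API key
```

Passing the environment as a parameter keeps the function easy to test without modifying `os.environ`.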
## Data Flow
```
User Input
↓
[Extraction] → Normalized Biomarkers
↓
[Prediction] → Disease Hypothesis (85% confidence)
↓
[RAG Retrieval] → Medical Literature (5-10 relevant docs)
↓
[Analysis] → All 6 Agents Process in Parallel
↓
[Synthesis] → Comprehensive Report
↓
[Output] → Recommendations + Safety Alerts + Evidence
```
## Key Design Decisions
1. **Cloud Embeddings**: Google Gemini embeddings (free tier) with HuggingFace fallback for offline use
2. **Groq LLM**: Free, fast inference for real-time interaction
3. **Multiple Providers**: Support for Groq, Google Gemini, and Ollama (local)
4. **LangGraph**: Manages complex multi-agent workflows with state management
5. **FAISS**: Efficient similarity search on large medical document collection
6. **Modular Agents**: Each agent has a clear responsibility, enabling parallel execution
7. **RAG Integration**: Medical knowledge grounds responses in evidence
8. **Biomarker Normalization**: 80+ aliases ensure robust input handling
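
The retrieval step behind decisions 5 and 7 comes down to nearest-neighbour search over embedding vectors. The sketch below is a brute-force stand-in written in plain Python to show the idea; it does not use FAISS, and the names `cosine` and `top_k` and the toy three-dimensional vectors are assumptions for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          index: Dict[str, Sequence[float]],
          k: int = 5) -> List[Tuple[str, float]]:
    """Return the k chunk ids most similar to the query, best first."""
    scored = [(chunk_id, cosine(query, vector)) for chunk_id, vector in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

This linear scan compares the query against every stored vector; a vector index such as FAISS exists to make the same lookup efficient on large collections.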
## Technologies Used
| Component | Technology | Purpose |
|-----------|-----------|---------|
| Orchestration | LangGraph | Workflow management |
| LLM | Groq API / Google Gemini | Fast inference |
| Embeddings | Google Gemini / HuggingFace | Vector representations |
| Vector DB | FAISS | Similarity search |
| Data Validation | Pydantic V2 | Type safety & schemas |
| REST API | FastAPI | Web interface |
## Performance Characteristics
- **Response Time**: 15-25 seconds (6 agents + RAG retrieval)
- **Knowledge Base Size**: 750 pages, 2,609 chunks
- **Embedding Dimensions**: 384 (HuggingFace all-MiniLM-L6-v2)
- **Inference Cost**: Free (Gemini free-tier or local embeddings + Groq free tier)
- **Scalability**: Easily extends to more medical domains
## Extensibility
### Adding New Biomarkers
1. Update `config/biomarker_references.json` with reference ranges
2. Add aliases to `src/biomarker_normalization.py` (NORMALIZATION_MAP)
3. Medical guidelines for the new biomarker are picked up automatically via RAG
### Adding New Medical Domains
1. Add PDF documents to `data/medical_pdfs/`
2. Run `python scripts/setup_embeddings.py`
3. Vector store rebuilds automatically
4. Agents inherit new knowledge through RAG
### Custom Analysis Rules
1. Create new agent in `src/agents/`
2. Register in workflow graph (`src/workflow.py`)
3. Insert into processing pipeline
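
The three steps above can be sketched with a small ordered registry. This is a simplified stand-in for the workflow graph, not the LangGraph API or the contents of `src/workflow.py`; the `Pipeline` class, the `multi_abnormality_rule` agent, and the state keys are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]
Agent = Callable[[State], State]  # returns a partial state update


class Pipeline:
    """Ordered list of named agents."""

    def __init__(self) -> None:
        self._steps: List[Tuple[str, Agent]] = []

    def names(self) -> List[str]:
        return [name for name, _ in self._steps]

    def register(self, name: str, agent: Agent, before: Optional[str] = None) -> None:
        """Append an agent, or insert it ahead of an existing one."""
        names = self.names()
        if name in names:
            raise ValueError(f"agent {name!r} is already registered")
        if before is None:
            self._steps.append((name, agent))
        elif before in names:
            self._steps.insert(names.index(before), (name, agent))
        else:
            raise KeyError(f"unknown agent {before!r}")

    def run(self, state: State) -> State:
        state = dict(state)
        for _, agent in self._steps:
            state.update(agent(state))
        return state


def multi_abnormality_rule(state: State) -> State:
    # Hypothetical custom rule: ask for review when two or more values are flagged.
    return {"needs_review": len(state.get("flagged", [])) >= 2}


def synthesizer(state: State) -> State:
    return {"report": {"flagged": state.get("flagged", []),
                       "needs_review": state.get("needs_review", False)}}


pipeline = Pipeline()
pipeline.register("synthesizer", synthesizer)
pipeline.register("multi_abnormality_rule", multi_abnormality_rule, before="synthesizer")
result = pipeline.run({"flagged": ["Glucose", "Hemoglobin"]})
```

Inserting the custom rule ahead of the synthesizer means its output is already in the state when the final report is assembled.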
## Security & Privacy
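- Orchestration, biomarker validation, and vector search run locally
- Patient input leaves the machine only for LLM inference when a hosted provider (Groq or Gemini) is used
- Vector store derived from public medical PDFs
- Embeddings computed locally or cached
- Can operate completely offline after setup when using Ollama and the local HuggingFace embeddings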
---
For setup instructions, see [QUICKSTART.md](../QUICKSTART.md)
For API documentation, see [API.md](API.md)
For development guide, see [DEVELOPMENT.md](DEVELOPMENT.md)