RagBot System Architecture
Overview
RagBot is a Multi-Agent RAG (Retrieval-Augmented Generation) system for medical biomarker analysis. It combines large language models with a specialized medical knowledge base to provide evidence-based insights on patient biomarker readings.
System Architecture
┌─────────────────────────────────────────────────────────────┐
│                       User Interfaces                       │
│    ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │
│    │   CLI Chat   │   │   REST API   │   │    Web UI    │   │
│    └──────┬───────┘   └──────┬───────┘   └──────┬───────┘   │
└───────────┼──────────────────┼──────────────────┼───────────┘
            │                  │                  │
            └──────────────────┼──────────────────┘
                               │
            ┌──────────────────▼──────────────────┐
            │        Workflow Orchestrator        │
            │             (LangGraph)             │
            └──────────────────┬──────────────────┘
                               │
            ┌──────────────────┼──────────────────┐
            │                  │                  │
            ▼                  ▼                  ▼
     ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
     │ Extraction  │    │  Analysis   │    │  Knowledge  │
     │    Agent    │    │   Agents    │    │  Retrieval  │
     └─────────────┘    └─────────────┘    └─────────────┘
            │                  │                  │
            └──────────────────┼──────────────────┘
                               │
                ┌──────────────▼──────────────┐
                │        LLM Provider         │
                │   (Groq - LLaMA 3.3-70B)    │
                └──────────────┬──────────────┘
                               │
                ┌──────────────▼──────────────┐
                │   Medical Knowledge Base    │
                │    (FAISS Vector Store)     │
                │   (750 pages, 2,609 docs)   │
                └─────────────────────────────┘
Core Components
1. Biomarker Extraction & Validation (src/biomarker_validator.py, src/biomarker_normalization.py)
- Parses user input for blood test results
- Normalizes biomarker names via 80+ alias mappings to 24 canonical names
- Validates values against established reference ranges (with clinically appropriate critical thresholds)
- Generates safety alerts for critical values
- Flags all out-of-range values (no suppression threshold)
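The normalization and validation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the alias entries are a hypothetical excerpt of NORMALIZATION_MAP, the ReferenceRange class and helper names are invented for the example, and the numeric ranges are placeholders rather than clinical values.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical excerpt; the real map is said to hold 80+ aliases.
NORMALIZATION_MAP: Dict[str, str] = {
    "hb": "Hemoglobin",
    "hgb": "Hemoglobin",
    "haemoglobin": "Hemoglobin",
    "hemoglobin": "Hemoglobin",
    "blood sugar": "Glucose",
    "glucose": "Glucose",
}


@dataclass(frozen=True)
class ReferenceRange:
    low: float
    high: float
    critical_low: float
    critical_high: float
    unit: str


# Placeholder numbers for illustration only, NOT clinical guidance.
REFERENCE_RANGES: Dict[str, ReferenceRange] = {
    "Hemoglobin": ReferenceRange(12.0, 17.5, 7.0, 20.0, "g/dL"),
    "Glucose": ReferenceRange(70.0, 100.0, 40.0, 400.0, "mg/dL"),
}


def normalize_name(raw: str) -> Optional[str]:
    """Map a user-supplied alias to its canonical name, or None if unknown."""
    key = " ".join(raw.strip().lower().split())  # trim and collapse whitespace
    return NORMALIZATION_MAP.get(key)


def classify(name: str, value: float) -> str:
    """Return 'critical', 'out_of_range', or 'normal' for a canonical biomarker."""
    ref = REFERENCE_RANGES[name]
    if value <= ref.critical_low or value >= ref.critical_high:
        return "critical"
    if value < ref.low or value > ref.high:
        return "out_of_range"  # every out-of-range value is flagged
    return "normal"


def validate(readings: Dict[str, float]) -> Dict[str, str]:
    """Normalize each reading's name and classify its value."""
    results: Dict[str, str] = {}
    for raw, value in readings.items():
        name = normalize_name(raw)
        if name is None:
            results[raw] = "unknown_biomarker"
        else:
            results[name] = classify(name, value)
    return results
```

Calling `validate({"Hb": 6.5, "Blood  Sugar": 95})` would classify the first reading as critical and the second as normal under these placeholder ranges.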
2. Multi-Agent Workflow (src/workflow.py using LangGraph)
The system processes each patient case through 6 specialist agents:
Agent 1: Biomarker Analyzer
- Validates each biomarker against reference ranges
- Identifies out-of-range values
- Generates immediate clinical alerts
- Predicts disease relevance (baseline diagnostic)
Agent 2: Disease Explainer (RAG)
- Retrieves medical literature on predicted disease
- Explains pathophysiological mechanisms
- Provides evidence-based disease context
- Sources: medical PDFs (anemia, diabetes, heart disease, thrombocytopenia)
Agent 3: Biomarker-Disease Linker (RAG)
- Maps patient biomarkers to disease indicators
- Identifies key drivers of the predicted condition
- Retrieves lab-specific guidelines
- Explains biomarker significance in disease context
Agent 4: Clinical Guidelines Agent (RAG)
- Retrieves evidence-based clinical guidelines
- Provides immediate recommendations
- Suggests monitoring parameters
- Offers lifestyle and medication guidance
Agent 5: Confidence Assessor
- Evaluates prediction reliability
- Assesses evidence strength
- Identifies limitations in analysis
- Provides confidence score with reasoning
Agent 6: Response Synthesizer
- Consolidates findings from all agents
- Generates comprehensive patient summary
- Produces actionable recommendations
- Creates structured final report
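To make the hand-off between the six agents concrete, here is a plain-Python stand-in for the shared-state pattern. It deliberately does not use LangGraph, and it runs the agents strictly in sequence for simplicity; the function names, state keys, and toy logic are all assumptions made for illustration. Each agent returns a partial update that is merged into the shared state, which is similar in spirit to how LangGraph nodes update graph state.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Agent = Callable[[State], State]


def biomarker_analyzer(state: State) -> State:
    # Flag anything not marked "normal" and emit a placeholder hypothesis.
    flags = sorted(n for n, s in state["biomarkers"].items() if s != "normal")
    return {"flags": flags, "prediction": "example-condition"}


def disease_explainer(state: State) -> State:
    # A real agent would retrieve literature here; this only echoes the input.
    return {"explanation": f"background on {state['prediction']}"}


def biomarker_linker(state: State) -> State:
    return {"drivers": state["flags"]}


def guidelines_agent(state: State) -> State:
    return {"recommendations": [f"recheck {name}" for name in state["flags"]]}


def confidence_assessor(state: State) -> State:
    # Toy heuristic: more flagged biomarkers means more supporting evidence.
    return {"confidence": min(1.0, 0.5 + 0.1 * len(state["flags"]))}


def response_synthesizer(state: State) -> State:
    keys = ("prediction", "drivers", "recommendations", "confidence")
    return {"report": {k: state[k] for k in keys}}


PIPELINE: List[Tuple[str, Agent]] = [
    ("biomarker_analyzer", biomarker_analyzer),
    ("disease_explainer", disease_explainer),
    ("biomarker_linker", biomarker_linker),
    ("guidelines_agent", guidelines_agent),
    ("confidence_assessor", confidence_assessor),
    ("response_synthesizer", response_synthesizer),
]


def run_pipeline(initial: State) -> State:
    """Run every agent in order, merging each partial update into the state."""
    state = dict(initial)
    for name, agent in PIPELINE:
        state.update(agent(state))
        state.setdefault("trace", []).append(name)
    return state
```

Keeping each agent a pure function of the state makes it straightforward to test agents in isolation and to reorder or extend the pipeline.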
3. Knowledge Base (src/pdf_processor.py)
- Source: 8 medical PDF documents (750 pages total)
- Storage: FAISS vector database (2,609 document chunks)
- Embeddings: Google Gemini (default, free) or HuggingFace sentence-transformers (local, offline)
- Format: Chunked with a 1000-character overlap between adjacent chunks to preserve context
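Overlapping chunking can be illustrated with a short sliding-window function. This is a generic sketch rather than the logic in src/pdf_processor.py; the function name and parameters are assumptions, and the source does not state the chunk size, so both values are left to the caller.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into windows of chunk_size that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

With small values the effect is easy to see: `chunk_text("abcdefghij", 4, 2)` yields windows in which each chunk repeats the last two characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.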
4. LLM Configuration (src/llm_config.py)
- Primary LLM: Groq LLaMA 3.3-70B
  - Fast inference (~1-2 sec per agent output)
  - Free API tier available
  - Free-tier rate limits are unlikely to be hit under reasonable usage
- Alternative LLM: Google Gemini 2.0 Flash (free)
- Local LLM: Ollama (for offline use)
- Embedding Model (local option): HuggingFace sentence-transformers/all-MiniLM-L6-v2
  - 384-dimensional embeddings
  - Fast similarity search
  - Runs locally (no API dependency)
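One way the primary, alternative, and local providers could be chosen at startup is sketched below. This is not taken from src/llm_config.py: the function name, the LLM_PROVIDER override, and the API-key variable names are assumptions for illustration, and the project may select providers differently.

```python
import os
from typing import Mapping, Optional

# Preference order from the list above: primary, alternative, local.
PROVIDER_ORDER = ("groq", "gemini", "ollama")

# Assumed environment variable names; adjust to the project's actual settings.
API_KEY_VARS = {"groq": "GROQ_API_KEY", "gemini": "GOOGLE_API_KEY"}


def choose_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick an LLM provider from an explicit override or the available API keys."""
    env = os.environ if env is None else env

    # An explicit (hypothetical) LLM_PROVIDER setting wins over auto-detection.
    requested = env.get("LLM_PROVIDER", "").strip().lower()
    if requested:
        if requested not in PROVIDER_ORDER:
            raise ValueError(f"unknown provider: {requested!r}")
        return requested

    # Otherwise use the first cloud provider that has a key configured.
    for provider in ("groq", "gemini"):
        if env.get(API_KEY_VARS[provider]):
            return provider

    # No keys found: fall back to the local model for offline use.
    return "ollama"
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.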
Data Flow
User Input
    ↓
[Extraction] → Normalized Biomarkers
    ↓
[Prediction] → Disease Hypothesis (85% confidence)
    ↓
[RAG Retrieval] → Medical Literature (5-10 relevant docs)
    ↓
[Analysis] → All 6 Agents Process in Parallel
    ↓
[Synthesis] → Comprehensive Report
    ↓
[Output] → Recommendations + Safety Alerts + Evidence
Key Design Decisions
- Cloud Embeddings: Google Gemini embeddings (free tier) with HuggingFace fallback for offline use
- Groq LLM: Free, fast inference for real-time interaction
- Multiple Providers: Support for Groq, Google Gemini, and Ollama (local)
- LangGraph: Manages complex multi-agent workflows with state management
- FAISS: Efficient similarity search on large medical document collection
- Modular Agents: Each agent has a clear responsibility, enabling parallel execution
- RAG Integration: Medical knowledge grounds responses in evidence
- Biomarker Normalization: 80+ aliases ensure robust input handling
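The retrieval step that FAISS accelerates amounts to a nearest-neighbour search over embedding vectors. The brute-force sketch below shows the idea using cosine similarity in pure Python; it is a stand-in for illustration, not the FAISS API, and the function names and toy 3-dimensional vectors are assumptions (the local embedding model described above produces 384 dimensions).

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    docs: Sequence[Tuple[str, Sequence[float]]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index: one made-up vector per document chunk.
DOCS = [
    ("anemia", [1.0, 0.0, 0.0]),
    ("diabetes", [0.0, 1.0, 0.0]),
    ("mixed", [1.0, 1.0, 0.0]),
]
```

A scan like this costs time proportional to the number of chunks per query, which is acceptable for a few thousand chunks; a dedicated index becomes more valuable as the collection grows.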
Technologies Used
| Component | Technology | Purpose |
|---|---|---|
| Orchestration | LangGraph | Workflow management |
| LLM | Groq API / Google Gemini | Fast inference |
| Embeddings | Google Gemini / HuggingFace | Vector representations |
| Vector DB | FAISS | Similarity search |
| Data Validation | Pydantic V2 | Type safety & schemas |
| REST API | FastAPI | Web interface |
Performance Characteristics
- Response Time: 15-25 seconds (6 agents + RAG retrieval)
- Knowledge Base Size: 750 pages, 2,609 chunks
- Embedding Dimensions: 384
- Inference Cost: Free (local embeddings + Groq free tier)
- Scalability: Easily extends to more medical domains
Extensibility
Adding New Biomarkers
- Update config/biomarker_references.json with reference ranges
- Add aliases to src/biomarker_normalization.py (NORMALIZATION_MAP)
- Medical guidelines are handled automatically via RAG
Adding New Medical Domains
- Add PDF documents to data/medical_pdfs/
- Run python scripts/setup_embeddings.py
- Vector store rebuilds automatically
- Agents inherit new knowledge through RAG
Custom Analysis Rules
- Create new agent in src/agents/
- Register in workflow graph (src/workflow.py)
- Insert into processing pipeline
Security & Privacy
- All processing runs locally, except for calls to the configured LLM provider
- No personal data is sent to external APIs other than what LLM inference requires
- Vector store derived from public medical PDFs
- Embeddings computed locally or cached
- Can operate completely offline after setup (using Ollama and local HuggingFace embeddings)
For setup instructions, see QUICKSTART.md.
For API documentation, see API.md.
For the development guide, see DEVELOPMENT.md.