MediGuard AI RAG-Helper - Project Context
Project Overview
MediGuard AI RAG-Helper is a self-improving multi-agent RAG system that provides explainable clinical predictions for patient self-assessment. The system takes raw blood test biomarker values and a disease prediction from a pre-trained ML model, then generates comprehensive, evidence-backed explanations using medical literature.
System Scope
Diseases Covered (5 conditions)
- Anemia
- Diabetes
- Thrombocytopenia
- Thalassemia
- Heart Disease
Input Biomarkers (24 clinical parameters)
- Glucose
- Cholesterol
- Hemoglobin
- Platelets
- White Blood Cells
- Red Blood Cells
- Hematocrit
- Mean Corpuscular Volume (MCV)
- Mean Corpuscular Hemoglobin (MCH)
- Mean Corpuscular Hemoglobin Concentration (MCHC)
- Insulin
- BMI
- Systolic Blood Pressure
- Diastolic Blood Pressure
- Triglycerides
- HbA1c
- LDL Cholesterol
- HDL Cholesterol
- ALT (Alanine Aminotransferase)
- AST (Aspartate Aminotransferase)
- Heart Rate
- Creatinine
- Troponin
- C-reactive Protein
Biomarker Reference Ranges
| Biomarker | Normal Range (Adults) | Unit | Critical Values |
|---|---|---|---|
| Glucose (Fasting) | 70-100 | mg/dL | <70 (hypoglycemia), ≥126 (diabetes) |
| Cholesterol (Total) | <200 | mg/dL | >240 (high risk) |
| Hemoglobin | M: 13.5-17.5, F: 12.0-15.5 | g/dL | <7 (severe anemia), >18 (polycythemia) |
| Platelets | 150,000-400,000 | cells/μL | <50,000 (critical), >1,000,000 (thrombocytosis) |
| White Blood Cells | 4,000-11,000 | cells/μL | <2,000 (critical), >30,000 (leukemia risk) |
| Red Blood Cells | M: 4.5-5.9, F: 4.0-5.2 | million/μL | <3.0 (severe anemia) |
| Hematocrit | M: 38.8-50.0, F: 34.9-44.5 | % | <25 (severe anemia), >60 (polycythemia) |
| MCV | 80-100 | fL | <80 (microcytic), >100 (macrocytic) |
| MCH | 27-33 | pg | <27 (hypochromic) |
| MCHC | 32-36 | g/dL | <32 (hypochromic) |
| Insulin (Fasting) | 2.6-24.9 | μIU/mL | >25 (insulin resistance) |
| BMI | 18.5-24.9 | kg/m² | <18.5 (underweight), >30 (obese) |
| Systolic BP | 90-120 | mmHg | <90 (hypotension), >140 (hypertension) |
| Diastolic BP | 60-80 | mmHg | <60 (hypotension), >90 (hypertension) |
| Triglycerides | <150 | mg/dL | >500 (pancreatitis risk) |
| HbA1c | <5.7 | % | 5.7-6.4 (prediabetes), ≥6.5 (diabetes) |
| LDL Cholesterol | <100 | mg/dL | >190 (very high risk) |
| HDL Cholesterol | M: >40, F: >50 | mg/dL | <40 (cardiac risk) |
| ALT | 7-56 | U/L | >200 (liver damage) |
| AST | 10-40 | U/L | >200 (liver/heart damage) |
| Heart Rate | 60-100 | bpm | <50 (bradycardia), >120 (tachycardia) |
| Creatinine | M: 0.7-1.3, F: 0.6-1.1 | mg/dL | >3.0 (kidney failure) |
| Troponin | <0.04 | ng/mL | >0.04 (myocardial injury) |
| C-reactive Protein | <3.0 | mg/L | >10 (acute inflammation) |
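A minimal sketch of how a value might be checked against the table above. The `REFERENCE_RANGES` dict and the `classify` function are hypothetical names, and only a few two-sided, non-gender-specific ranges are included to keep the example short.

```python
from typing import Dict, Tuple

# Subset of the reference table: name -> (low, high, unit), bounds inclusive.
REFERENCE_RANGES: Dict[str, Tuple[float, float, str]] = {
    "glucose": (70.0, 100.0, "mg/dL"),
    "platelets": (150_000.0, 400_000.0, "cells/μL"),
    "mcv": (80.0, 100.0, "fL"),
    "heart_rate": (60.0, 100.0, "bpm"),
}


def classify(name: str, value: float) -> str:
    """Return 'LOW', 'HIGH', or 'NORMAL' for a biomarker value."""
    low, high, _unit = REFERENCE_RANGES[name]
    if value < low:
        return "LOW"
    if value > high:
        return "HIGH"
    return "NORMAL"


print(classify("glucose", 185))       # fasting glucose above 100 mg/dL
print(classify("platelets", 220000))  # inside 150,000-400,000
```

One-sided ranges (for example HbA1c <5.7) and sex-specific ranges would need a richer structure than this tuple.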
System Architecture
Two-Loop Design (Adapted from Clinical Trials Architect)
INNER LOOP: Clinical Insight Guild
Multi-agent RAG pipeline that generates explainable clinical reports.
Agents:
- Planner Agent - Creates task execution plan
- Biomarker Analyzer Agent - Validates values against reference ranges, flags anomalies
- Disease Explainer Agent - Retrieves disease pathophysiology from medical PDFs
- Biomarker-Disease Linker Agent - Connects specific biomarker values to predicted disease
- Clinical Guidelines Agent - Retrieves evidence-based recommendations from PDFs
- Confidence Assessor Agent - Evaluates prediction reliability and evidence strength
- Response Synthesizer Agent - Compiles structured JSON output
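The agent hand-off above can be pictured as functions that each read and extend a shared state. The sketch below uses plain Python functions as a stand-in for the LangGraph workflow; every function name and state key is an assumption made for illustration, and only three of the seven agents are shown.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Agent = Callable[[State], State]


def planner(state: State) -> State:
    # Hypothetical plan: the names of the steps that follow.
    return {**state, "plan": ["analyze", "synthesize"]}


def biomarker_analyzer(state: State) -> State:
    # Placeholder check against the fasting glucose range (70-100 mg/dL).
    glucose = state["biomarkers"]["glucose"]
    flags = [] if 70 <= glucose <= 100 else ["glucose"]
    return {**state, "flags": flags}


def response_synthesizer(state: State) -> State:
    report = {
        "primary_disease": state["prediction"],
        "biomarker_flags": state["flags"],
    }
    return {**state, "report": report}


def run_pipeline(state: State, agents: List[Agent]) -> State:
    """Run the agents in order, passing the growing state along."""
    for agent in agents:
        state = agent(state)
    return state


result = run_pipeline(
    {"biomarkers": {"glucose": 185}, "prediction": "Diabetes"},
    [planner, biomarker_analyzer, response_synthesizer],
)
print(result["report"])
```

Returning a new dict from each agent, rather than mutating the input, keeps every step's contribution easy to inspect when tracing a run.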
OUTER LOOP: Clinical Explanation Director
Meta-learning system that improves explanation quality over time.
Components:
- Performance Diagnostician - Analyzes which dimensions need improvement
- SOP Architect - Evolves explanation strategies (prompts, retrieval params, agent configs)
- Gene Pool - Tracks all SOP versions and their performance
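As a rough illustration of the Gene Pool idea, the sketch below stores each SOP version with its scores and a pointer to its parent. `SOPRecord`, `GenePool`, and the ranking by unweighted mean score are all assumptions; the source does not specify how versions are compared.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SOPRecord:
    version: int
    config: Dict[str, object]
    scores: Dict[str, float]  # one score per evaluation dimension
    parent: Optional[int] = None  # version this SOP was derived from


@dataclass
class GenePool:
    records: List[SOPRecord] = field(default_factory=list)

    def add(self, config: Dict[str, object], scores: Dict[str, float],
            parent: Optional[int] = None) -> SOPRecord:
        record = SOPRecord(len(self.records) + 1, config, scores, parent)
        self.records.append(record)
        return record

    def best(self) -> SOPRecord:
        # Assumed ranking rule: highest unweighted mean across dimensions.
        return max(self.records,
                   key=lambda r: sum(r.scores.values()) / len(r.scores))


pool = GenePool()
base = pool.add({"disease_explainer_k": 5}, {"accuracy": 0.70, "grounding": 0.60})
pool.add({"disease_explainer_k": 7}, {"accuracy": 0.72, "grounding": 0.80},
         parent=base.version)
print(pool.best().version)
```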
Knowledge Infrastructure
Data Sources
Medical PDF Documents (User-provided)
- Disease-specific medical literature
- Clinical guidelines
- Biomarker interpretation guides
- Treatment protocols
Biomarker Reference Database (Structured)
- Normal ranges by age/gender
- Critical value thresholds
- Unit conversions
- Clinical significance flags
Disease-Biomarker Associations (Derived from PDFs)
- Which biomarkers are diagnostic for each disease
- Pathophysiological mechanisms
- Differential diagnosis criteria
Storage & Indexing
| Data Type | Storage | Access Method |
|---|---|---|
| Medical PDFs | FAISS Vector Store | Semantic search (embeddings) |
| Reference Ranges | DuckDB/JSON | SQL queries / Dict lookup |
| Disease Mappings | Python Dict/JSON | Key-value retrieval |
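To make the "semantic search (embeddings)" access method concrete, here is a dependency-free sketch that ranks chunks by cosine similarity. It is a brute-force stand-in for FAISS, not FAISS itself, and the three-dimensional vectors are hand-written toys rather than output from an embedding model.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float],
          chunks: List[Tuple[str, Sequence[float]]],
          k: int) -> List[str]:
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query, c[1]), reverse=True)
    return [text for text, _vec in ranked[:k]]


# Toy (text, embedding) pairs; real embeddings have hundreds of dimensions.
CHUNKS = [
    ("HbA1c reflects average glucose", [0.9, 0.1, 0.0]),
    ("Platelet counts and bleeding risk", [0.0, 0.2, 0.9]),
    ("Insulin resistance mechanism", [0.8, 0.3, 0.1]),
]

hits = top_k([1.0, 0.0, 0.0], CHUNKS, k=2)
print(hits)
```

The `k` argument plays the same role as `disease_explainer_k` in the evolvable configuration.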
Workflow
Patient Input
{
  "biomarkers": {
    "glucose": 185,
    "hba1c": 8.2,
    "hemoglobin": 11.5,
    "platelets": 220000,
    // ... all 24 biomarkers
  },
  "model_prediction": {
    "disease": "Diabetes",
    "confidence": 0.89,
    "probabilities": {
      "Diabetes": 0.89,
      "Heart Disease": 0.06,
      "Anemia": 0.03,
      "Thalassemia": 0.01,
      "Thrombocytopenia": 0.01
    }
  }
}
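Before the input reaches the agents, the `model_prediction` block could be sanity-checked. The sketch below is one possible check; `validate_prediction` is a hypothetical helper, and the rule that `confidence` equals the predicted disease's probability is an assumption inferred from the sample above.

```python
import math
from typing import Any, Dict, List

DISEASES = {"Anemia", "Diabetes", "Thrombocytopenia", "Thalassemia", "Heart Disease"}


def validate_prediction(pred: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the block looks consistent."""
    problems: List[str] = []
    probs = pred.get("probabilities", {})
    if set(probs) != DISEASES:
        problems.append("probabilities must cover exactly the five diseases")
    if not math.isclose(sum(probs.values()), 1.0, abs_tol=1e-6):
        problems.append("probabilities must sum to 1")
    if pred.get("disease") not in probs:
        problems.append("predicted disease missing from probabilities")
    elif not math.isclose(probs[pred["disease"]], pred.get("confidence", -1.0)):
        # Assumption: confidence mirrors the top-class probability.
        problems.append("confidence must equal the predicted disease's probability")
    return problems


sample = {
    "disease": "Diabetes",
    "confidence": 0.89,
    "probabilities": {
        "Diabetes": 0.89, "Heart Disease": 0.06, "Anemia": 0.03,
        "Thalassemia": 0.01, "Thrombocytopenia": 0.01,
    },
}
print(validate_prediction(sample))
```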
System Processing
- Biomarker Validation - Check all values against reference ranges
- RAG Retrieval - Query PDFs for disease mechanism + biomarker significance
- Explanation Generation - Link biomarkers to prediction with evidence
- Safety Checks - Flag critical values, missing data, low confidence
- Recommendation Synthesis - Provide actionable next steps from guidelines
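The safety-check step could look roughly like the following. The critical thresholds are taken from the reference table, while the function name, the alert strings, and the `min_confidence` default of 0.6 are hypothetical choices for illustration.

```python
from typing import Callable, Dict, List

# Critical-value rules copied from the reference table (subset).
CRITICAL_RULES: Dict[str, Callable[[float], bool]] = {
    "platelets": lambda v: v < 50_000,
    "hemoglobin": lambda v: v < 7,
    "troponin": lambda v: v > 0.04,
}


def safety_alerts(biomarkers: Dict[str, float], confidence: float,
                  expected: List[str], min_confidence: float = 0.6) -> List[str]:
    """Collect alerts for missing data, critical values, and low confidence."""
    alerts: List[str] = []
    for name in expected:
        if name not in biomarkers:
            alerts.append(f"MISSING: {name}")
    for name, is_critical in CRITICAL_RULES.items():
        if name in biomarkers and is_critical(biomarkers[name]):
            alerts.append(f"CRITICAL: {name}")
    if confidence < min_confidence:  # assumed cutoff, not from the source
        alerts.append("LOW_CONFIDENCE")
    return alerts


alerts = safety_alerts(
    {"platelets": 40_000, "hemoglobin": 11.5},
    confidence=0.5,
    expected=["platelets", "hemoglobin", "troponin"],
)
print(alerts)
```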
Output Structure
{
  "patient_summary": {
    "biomarker_flags": [...], // Out-of-range values with warnings
    "overall_risk_profile": "High metabolic risk"
  },
  "prediction_explanation": {
    "primary_disease": "Diabetes",
    "confidence": 0.89,
    "key_drivers": [
      {
        "biomarker": "HbA1c",
        "value": 8.2,
        "contribution": "45%",
        "explanation": "HbA1c of 8.2% indicates poor glycemic control...",
        "evidence": "ADA Guidelines 2024, Section 2.3: 'HbA1c ≥6.5% diagnostic'"
      }
    ],
    "mechanism_summary": "Type 2 Diabetes results from insulin resistance...",
    "pdf_references": ["diabetes_pathophysiology.pdf p.15", ...]
  },
  "clinical_recommendations": {
    "immediate_actions": ["Repeat fasting glucose", "Consult physician"],
    "lifestyle_changes": ["Reduce sugar intake", "Exercise 30min daily"],
    "monitoring": ["Check HbA1c every 3 months"],
    "guideline_citations": ["ADA Standards of Care 2024"]
  },
  "confidence_assessment": {
    "prediction_reliability": "HIGH",
    "evidence_strength": "STRONG",
    "limitations": ["Missing lipid panel data"],
    "recommendation": "High confidence diagnosis; seek medical consultation"
  },
  "safety_alerts": [
    {
      "severity": "HIGH",
      "biomarker": "Glucose",
      "message": "Fasting glucose 185 mg/dL significantly elevated",
      "action": "Urgent physician consultation recommended"
    }
  ]
}
Multi-Dimensional Evaluation (5D Quality Metrics)
The Outer Loop evaluates explanation quality across five dimensions:
Clinical Accuracy (LLM-as-Judge)
- Are biomarker interpretations medically correct?
- Is the disease mechanism explanation accurate?
Evidence Grounding (Programmatic + LLM)
- Are all claims backed by PDF citations?
- Are citations verifiable and accurate?
Clinical Actionability (LLM-as-Judge)
- Are recommendations safe and appropriate?
- Are next steps clear and guideline-aligned?
Explainability Clarity (Programmatic)
- Is language accessible for patient self-assessment?
- Are biomarker values clearly explained?
- Readability score check
Safety & Completeness (Programmatic)
- Are all out-of-range values flagged?
- Are critical alerts present?
- Are uncertainties acknowledged?
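The programmatic half of the Evidence Grounding dimension might be approximated as the share of key drivers that carry a citation. This is only a sketch: `evidence_grounding_score` is a hypothetical helper, and checking that a citation string exists says nothing about whether it is accurate, which the source assigns to the LLM part of the check.

```python
from typing import Any, Dict


def evidence_grounding_score(report: Dict[str, Any]) -> float:
    """Fraction of key drivers with a non-empty 'evidence' field (0.0 if none)."""
    drivers = report.get("prediction_explanation", {}).get("key_drivers", [])
    if not drivers:
        return 0.0
    cited = sum(1 for d in drivers if str(d.get("evidence", "")).strip())
    return cited / len(drivers)


report = {
    "prediction_explanation": {
        "key_drivers": [
            {"biomarker": "HbA1c", "evidence": "ADA Guidelines 2024, Section 2.3"},
            {"biomarker": "Glucose", "evidence": ""},
        ]
    }
}
print(evidence_grounding_score(report))
```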
Evolvable Configuration (ExplanationSOP)
The system's behavior is controlled by a dynamic configuration that evolves:
from typing import Literal

from pydantic import BaseModel


class ExplanationSOP(BaseModel):
    # Agent parameters
    biomarker_analyzer_threshold: float = 0.15  # fractional deviation to flag (0.15 = 15%)
    disease_explainer_k: int = 5  # Top-k PDF chunks
    linker_feature_importance: bool = True

    # Prompts (evolvable)
    synthesizer_prompt: str = "Synthesize in patient-friendly language..."
    explainer_detail_level: Literal["concise", "detailed"] = "detailed"

    # Feature flags
    use_guideline_agent: bool = True
    include_alternative_diagnoses: bool = True
    require_pdf_citations: bool = True

    # Safety settings
    critical_value_alert_mode: Literal["strict", "moderate"] = "strict"
The Director Agent automatically tunes these parameters based on performance feedback.
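One way to picture a tuning step: given per-dimension scores, propose a modified copy of the configuration. The sketch uses a frozen dataclass with a subset of the fields as a stand-in for the pydantic model, and both tuning rules and the 0.7 threshold are invented for illustration; the source does not describe the Director's actual policy.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class SOP:
    # Subset of ExplanationSOP fields, with the same defaults.
    disease_explainer_k: int = 5
    explainer_detail_level: str = "detailed"
    require_pdf_citations: bool = True


def propose_variant(sop: SOP, scores: Dict[str, float]) -> SOP:
    """Return a new SOP addressing the weakest dimension, or the same SOP."""
    # Hypothetical rule: weak evidence grounding -> retrieve more PDF chunks.
    if scores.get("evidence_grounding", 1.0) < 0.7:
        return replace(sop, disease_explainer_k=sop.disease_explainer_k + 2)
    # Hypothetical rule: poor clarity -> switch to shorter explanations.
    if scores.get("explainability_clarity", 1.0) < 0.7:
        return replace(sop, explainer_detail_level="concise")
    return sop


base = SOP()
variant = propose_variant(base, {"evidence_grounding": 0.55})
print(variant)
```

Because the dataclass is frozen, each variant is a new object, so earlier versions stay intact for comparison.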
Technology Stack
LLM Configuration
- Fast Agents (Analyzer, Planner): Qwen2:7B or Llama-3.1:8B
- RAG Agents (Explainer, Guidelines): Llama-3.1:8B
- Synthesizer: Llama-3.1:8B (upgradeable to 70B)
- Director (Outer Loop): Llama-3:70B
- Embeddings: nomic-embed-text or bio-clinical-bert
Infrastructure
- Framework: LangChain + LangGraph (state-based orchestration)
- Vector Store: FAISS (medical PDF chunks)
- Structured Data: DuckDB or JSON (reference ranges)
- Document Processing: pypdf, layout-parser
- Observability: LangSmith (agent tracing)
Development Phases
Phase 1: Core System (Current Focus)
- Set up project structure
- Ingest user-provided medical PDFs
- Build biomarker reference range database
- Implement Inner Loop agents
- Create LangGraph workflow
- Test with sample patient data
Phase 2: Evaluation System
- Define 5D evaluation metrics
- Implement LLM-as-judge evaluators
- Build safety checkers
- Test on diverse disease cases
Phase 3: Self-Improvement (Outer Loop)
- Implement Performance Diagnostician
- Build SOP Architect
- Set up evolution cycle
- Track SOP gene pool
Phase 4: Refinement
- Tune explanation quality
- Optimize PDF retrieval
- Add edge case handling
- Patient-friendly language review
Use Case: Patient Self-Assessment
Target User: Individual with blood test results seeking to understand their health status before or between doctor visits.
Key Features for Self-Assessment:
- Safety-first: Clear warnings for critical values ("Seek immediate medical attention")
- Educational: Explains what each biomarker means in simple terms
- Evidence-backed: Citations from medical literature build trust
- Actionable: Suggests lifestyle changes and when to see a doctor
- Uncertainty transparency: Clearly states when predictions are low-confidence
Disclaimer: The system emphasizes that it is NOT a replacement for professional medical advice.
Current Status
What's Built: Base architecture understanding from Clinical Trials system
What's Next:
- Create project structure
- Collect and process medical PDFs
- Implement biomarker validation
- Build specialist agents
- Set up RAG retrieval pipeline
External ML Model: Pre-trained disease prediction model (handled separately)
- Input: 24 biomarkers
- Output: Disease label + confidence scores for 5 diseases
Important Notes
- Medical Disclaimer: This is a self-assessment tool, not a diagnostic device
- Data Privacy: All processing happens locally (if using local LLMs)
- Evidence Quality: System quality depends on medical PDF content provided
- Evolving System: Explanation strategies improve automatically over time
- Human Oversight: Critical decisions should always involve healthcare professionals
Last Updated: November 22, 2025
Project: MediGuard AI RAG-Helper
Repository: RagBot