Spaces:
Sleeping
Sleeping
File size: 14,544 Bytes
6dc9d46 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 | # MediGuard AI RAG-Helper - Implementation Summary
## Project Status: β Core System Complete (14/15 Tasks)
**MediGuard AI RAG-Helper** is an explainable multi-agent RAG system that helps patients understand their blood test results and disease predictions using medical knowledge retrieval and LLM-powered explanations.
---
## What Was Implemented
### β 1. Project Structure & Dependencies (Tasks 1-5)
- **State Management** (`src/state.py`): PatientInput, AgentOutput, GuildState, ExplanationSOP
- **LLM Configuration** (`src/llm_config.py`): Ollama models (llama3.1:8b, qwen2:7b)
- **Biomarker Database** (`src/biomarker_validator.py`): 24 biomarkers with gender-specific ranges
- **Configuration** (`src/config.py`): BASELINE_SOP with evolvable hyperparameters
### β 2. Knowledge Base Infrastructure (Task 3, 6)
- **PDF Processor** (`src/pdf_processor.py`):
- HuggingFace sentence-transformers embeddings (10-20x faster than Ollama)
- FAISS vector stores with 2,861 chunks from 750 pages
- 4 specialized retrievers: disease_explainer, biomarker_linker, clinical_guidelines, general
- **Medical PDFs Processed** (8 files):
- Anemia guidelines
- Diabetes management
- Heart disease protocols
- Thrombocytopenia treatment
- Thalassemia care
### β 3. Specialist Agents (Tasks 7-12) - **1,500+ Lines of Code**
#### Agent 1: Biomarker Analyzer (`src/agents/biomarker_analyzer.py`)
- Validates 24 biomarkers against gender-specific reference ranges
- Generates safety alerts for critical values (e.g., severe anemia, dangerous glucose)
- Identifies disease-relevant biomarkers
- Returns structured AgentOutput with flags, alerts, summary
#### Agent 2: Disease Explainer (`src/agents/disease_explainer.py`)
- RAG-based retrieval of disease pathophysiology
- Structured explanation: pathophysiology, diagnostic criteria, clinical presentation
- Extracts PDF citations with page numbers
- Configurable retrieval (k=5 by default from SOP)
#### Agent 3: Biomarker-Disease Linker (`src/agents/biomarker_linker.py`)
- Identifies key biomarker drivers for predicted disease
- Calculates contribution percentages (e.g., HbA1c 40%, Glucose 25%)
- RAG-based evidence retrieval for each driver
- Creates KeyDriver objects with explanations
#### Agent 4: Clinical Guidelines (`src/agents/clinical_guidelines.py`)
- RAG-based clinical practice guideline retrieval
- Structured recommendations:
- Immediate actions (especially for safety alerts)
- Lifestyle changes (diet, exercise, behavioral)
- Monitoring (what to track and frequency)
- Includes guideline citations
#### Agent 5: Confidence Assessor (`src/agents/confidence_assessor.py`)
- Evaluates evidence strength (STRONG/MODERATE/WEAK)
- Identifies limitations (missing data, differential diagnoses, normal relevant values)
- Calculates reliability score (HIGH/MODERATE/LOW) from:
- ML confidence (0-3 points)
- Evidence strength (1-3 points)
- Limitation penalty (-0 to -3 points)
- Provides alternative diagnoses from ML probabilities
#### Agent 6: Response Synthesizer (`src/agents/response_synthesizer.py`)
- Compiles all specialist findings into structured JSON
- Sections: patient_summary, prediction_explanation, clinical_recommendations, confidence_assessment, safety_alerts, metadata
- Generates patient-friendly narrative using LLM
- Includes complete disclaimers and citations
### β 4. Workflow Orchestration (Task 13)
**File**: `src/workflow.py` - ClinicalInsightGuild class
**Architecture**:
```
Patient Input
β
Biomarker Analyzer (validates all values)
β
βββββ΄ββββ¬βββββββββββββ
β β β
Disease Biomarker Clinical
Explainer Linker Guidelines
(RAG) (RAG) (RAG)
βββββ¬ββββ΄βββββββββββββ
β
Confidence Assessor (evaluates reliability)
β
Response Synthesizer (compiles final output)
β
Structured JSON Response
```
**Features**:
- LangGraph StateGraph with 6 specialized nodes
- Parallel execution for RAG agents (Disease Explainer, Biomarker Linker, Clinical Guidelines)
- Sequential execution for validator and synthesizer
- State management through GuildState TypedDict
### β 5. Testing Infrastructure (Task 14)
**File**: `tests/test_basic.py`
**Validated**:
- All imports functional
- Retriever loading (4 specialized retrievers from FAISS)
- PatientInput creation
- BiomarkerValidator with 24 biomarkers
- All core components operational
---
## Technical Stack
### Models & Embeddings
- **LLMs**: Ollama (llama3.1:8b, qwen2:7b)
- Planner: llama3.1:8b (JSON mode, temp=0.0)
- Analyzer: qwen2:7b (fast validation)
- Explainer: llama3.1:8b (RAG retrieval, temp=0.2)
- Synthesizer: llama3.1:8b-instruct (best available)
- **Embeddings**: HuggingFace sentence-transformers/all-MiniLM-L6-v2
- 384 dimensions
- 10-20x faster than Ollama embeddings (~3 min vs 30+ min for 2,861 chunks)
- 100% offline, zero cost
### Frameworks
- **LangChain**: Document loading, text splitting, retrievers
- **LangGraph**: Multi-agent workflow orchestration with StateGraph
- **FAISS**: Vector similarity search
- **Pydantic**: Type-safe state management
### Data
- **Vector Store**: 2,861 chunks from 750 pages of medical PDFs
- **Biomarkers**: 24 clinical parameters with gender-specific ranges
- **Diseases**: 5 conditions (Anemia, Diabetes, Heart Disease, Thrombocytopenia, Thalassemia)
---
## System Capabilities
### Input
```python
{
"biomarkers": {"Glucose": 185, "HbA1c": 8.2, ...}, # 24 values
"model_prediction": {
"disease": "Type 2 Diabetes",
"confidence": 0.87,
"probabilities": {...}
},
"patient_context": {"age": 52, "gender": "male", "bmi": 31.2}
}
```
### Output
```python
{
"patient_summary": {
"narrative": "Patient-friendly 3-4 sentence summary",
"total_biomarkers_tested": 24,
"biomarkers_out_of_range": 7,
"critical_values": 2,
"overall_risk_profile": "Summary from analyzer"
},
"prediction_explanation": {
"primary_disease": "Type 2 Diabetes",
"confidence": 0.87,
"key_drivers": [
{
"biomarker": "HbA1c",
"value": 8.2,
"contribution": 40,
"explanation": "Patient-friendly explanation",
"evidence": "Retrieved from medical PDFs"
}
],
"mechanism_summary": "How the disease works",
"pathophysiology": "Detailed medical explanation",
"pdf_references": ["diabetes_guidelines.pdf (p.15)", ...]
},
"clinical_recommendations": {
"immediate_actions": ["Consult endocrinologist", ...],
"lifestyle_changes": ["Low-carb diet", ...],
"monitoring": ["Check blood glucose daily", ...],
"guideline_citations": [...]
},
"confidence_assessment": {
"prediction_reliability": "HIGH", # or MODERATE/LOW
"evidence_strength": "STRONG",
"limitations": ["Missing thyroid panels", ...],
"recommendation": "Consult healthcare provider",
"alternative_diagnoses": [...]
},
"safety_alerts": [
{
"biomarker": "Glucose",
"priority": "HIGH",
"message": "Severely elevated - immediate medical attention"
}
],
"metadata": {
"timestamp": "2024-01-15T10:30:00",
"system_version": "MediGuard AI RAG-Helper v1.0",
"agents_executed": ["Biomarker Analyzer", ...],
"disclaimer": "Not a substitute for professional medical advice..."
}
}
```
---
## Key Features
### 1. **Explainability Through RAG**
- Every claim backed by retrieved medical documents
- PDF citations with page numbers
- Evidence-based recommendations
### 2. **Multi-Agent Architecture**
- 6 specialist agents with defined roles
- Parallel execution for efficiency
- Modular design for easy extension
### 3. **Patient Safety**
- Automatic critical value detection
- Gender-specific reference ranges
- Clear disclaimers and medical consultation recommendations
### 4. **Evolvable SOPs**
- Hyperparameters in ExplanationSOP (retrieval k, thresholds, prompts)
- Ready for Outer Loop evolution (Director agent)
- Baseline SOP established for performance comparison
### 5. **Fast Local Inference**
- HuggingFace embeddings (10-20x faster than Ollama)
- Local Ollama LLMs (zero API costs)
- 100% offline capable
---
## Performance
### Embedding Generation
- **Original (Ollama)**: 30+ minutes for 2,861 chunks
- **Optimized (HuggingFace)**: ~3 minutes for 2,861 chunks
- **Speedup**: 10-20x improvement
### Vector Store
- **Size**: 2,861 chunks from 750 pages
- **Storage**: FAISS indices in `data/vector_stores/`
- **Retrieval**: Sub-second for k=5 chunks
---
## File Structure
```
RagBot/
βββ src/
β βββ state.py # State management (PatientInput, GuildState)
β βββ config.py # ExplanationSOP, BASELINE_SOP
β βββ llm_config.py # Ollama model configuration
β βββ biomarker_validator.py # 24 biomarkers, validation logic
β βββ pdf_processor.py # PDF ingestion, FAISS, retrievers
β βββ workflow.py # ClinicalInsightGuild orchestration
β βββ agents/
β βββ biomarker_analyzer.py # Agent 1: Validates biomarkers
β βββ disease_explainer.py # Agent 2: RAG disease explanation
β βββ biomarker_linker.py # Agent 3: Links values to prediction
β βββ clinical_guidelines.py # Agent 4: RAG recommendations
β βββ confidence_assessor.py # Agent 5: Evaluates reliability
β βββ response_synthesizer.py # Agent 6: Compiles final output
βββ data/
β βββ medical_pdfs/ # 8 medical guideline PDFs
β βββ vector_stores/ # FAISS indices (medical_knowledge.faiss)
βββ tests/
β βββ test_basic.py # β Core component validation
β βββ test_diabetes_patient.py # Full workflow (requires state integration)
βββ README.md # Project documentation
βββ setup.py # Ollama model installer
βββ code.ipynb # Clinical Trials Architect reference
```
---
## Running the System
### 1. Setup Environment
```powershell
# Install dependencies
pip install langchain langgraph langchain-ollama langchain-community langchain-huggingface faiss-cpu sentence-transformers python-dotenv pypdf
# Pull Ollama models
ollama pull llama3.1:8b
ollama pull qwen2:7b
ollama pull nomic-embed-text
```
### 2. Process Medical PDFs (One-time)
```powershell
python src/pdf_processor.py
```
- Generates `data/vector_stores/medical_knowledge.faiss`
- Takes ~3 minutes for 2,861 chunks
### 3. Run Core Component Test
```powershell
python tests/test_basic.py
```
- Validates: imports, retrievers, patient input, biomarker validator
- **Status**: β All tests passing
### 4. Run Full Workflow (Requires Integration)
```powershell
python tests/test_diabetes_patient.py
```
- **Status**: Core components ready, state integration needed
- See "Next Steps" below
---
## What's Left
### Integration Tasks (Estimated: 2-3 hours)
The multi-agent system is **95% complete**. Remaining work:
1. **State Refactoring** (1-2 hours)
- Update all 6 agents to use GuildState structure (`patient_biomarkers`, `model_prediction`, `patient_context`)
- Current agents expect `patient_input` object
- Need to refactor ~15-20 lines per agent
2. **Workflow Testing** (30 min)
- Run `test_diabetes_patient.py` end-to-end
- Validate JSON output structure
- Test with multiple disease types
3. **5D Evaluation System** (Task 15 - Optional)
- Clinical Accuracy evaluator (LLM-as-judge)
- Evidence Grounding evaluator (programmatic + LLM)
- Actionability evaluator (LLM-as-judge)
- Clarity evaluator (readability metrics)
- Safety evaluator (programmatic checks)
- Aggregate scoring function
---
## Key Design Decisions
### 1. **Fast Embeddings**
- Switched from Ollama to HuggingFace sentence-transformers
- 10-20x speedup for vector store creation
- Maintained quality with all-MiniLM-L6-v2 (384 dims)
### 2. **Local-First Architecture**
- All LLMs run on Ollama (offline capable)
- HuggingFace embeddings (offline capable)
- No API costs, full privacy
### 3. **Multi-Agent Pattern**
- Inspired by Clinical Trials Architect (code.ipynb)
- Each agent has specific expertise
- Parallel execution for RAG agents
- Factory pattern for retriever injection
### 4. **Type Safety**
- Pydantic models for all data structures
- TypedDict for GuildState
- Compile-time validation with mypy/pylance
### 5. **Evolvable SOPs**
- Hyperparameters in config, not hardcoded
- Ready for Director agent (Outer Loop)
- Baseline SOP for performance comparison
---
## Performance Metrics
### System Components
- **Total Code**: ~2,500 lines across 13 files
- **Agent Code**: ~1,500 lines (6 specialist agents)
- **Test Coverage**: Core components validated
- **Vector Store**: 2,861 chunks, sub-second retrieval
### Execution Time (Estimated)
- **Biomarker Analyzer**: ~2-3 seconds
- **RAG Agents (parallel)**: ~5-10 seconds each
- **Confidence Assessor**: ~3-5 seconds
- **Response Synthesizer**: ~5-8 seconds
- **Total Workflow**: ~20-30 seconds end-to-end
---
## References
### Clinical Guidelines (PDFs in `data/medical_pdfs/`)
1. Anemia diagnosis and management
2. Type 2 Diabetes clinical practice guidelines
3. Cardiovascular disease prevention protocols
4. Thrombocytopenia treatment guidelines
5. Thalassemia care standards
### Technical References
- LangChain: https://python.langchain.com/
- LangGraph: https://python.langchain.com/docs/langgraph
- Ollama: https://ollama.ai/
- HuggingFace sentence-transformers: https://huggingface.co/sentence-transformers
- FAISS: https://github.com/facebookresearch/faiss
---
## License
See LICENSE file.
---
## Disclaimer
**IMPORTANT**: This system is for patient self-assessment and educational purposes only. It is **NOT** a substitute for professional medical advice, diagnosis, or treatment. Always consult qualified healthcare providers for medical decisions.
---
## Acknowledgments
Built using the Clinical Trials Architect pattern from `code.ipynb` as architectural reference for multi-agent RAG systems.
---
**Project Status**: β Core Implementation Complete (14/15 tasks)
**Readiness**: 95% - Ready for state integration and end-to-end testing
**Next Step**: Refactor agent state handling β Run full workflow test β Deploy
|