Nikhil Pravin Pise
# MediGuard AI: Production Upgrade Plan
## From Prototype to Production-Grade MedTech RAG System
> **Generated**: 2026-02-23
> **Based on**: Deep review of production-agentic-rag-course (Weeks 1–7) + existing RagBot codebase
> **Goal**: Take the existing MediGuard AI (clinical biomarker analysis + RAG explanation system) to full production quality, applying every lesson from the arXiv Paper Curator course, adapted for the MedTech domain.
---
## Table of Contents
1. [Executive Summary](#1-executive-summary)
2. [Deep Review: Course vs. Your Codebase](#2-deep-review-course-vs-your-codebase)
3. [Architecture Gap Analysis](#3-architecture-gap-analysis)
4. [Phase 1: Infrastructure Foundation](#phase-1-infrastructure-foundation-week-1-equivalent)
5. [Phase 2: Medical Data Ingestion Pipeline](#phase-2-medical-data-ingestion-pipeline-week-2-equivalent)
6. [Phase 3: Production Search Foundation](#phase-3-production-search-foundation-week-3-equivalent)
7. [Phase 4: Hybrid Search & Intelligent Chunking](#phase-4-hybrid-search--intelligent-chunking-week-4-equivalent)
8. [Phase 5: Complete RAG Pipeline with Streaming](#phase-5-complete-rag-pipeline-with-streaming-week-5-equivalent)
9. [Phase 6: Monitoring, Caching & Observability](#phase-6-monitoring-caching--observability-week-6-equivalent)
10. [Phase 7: Agentic RAG & Messaging Bot](#phase-7-agentic-rag--messaging-bot-week-7-equivalent)
11. [Phase 8: MedTech-Specific Additions](#phase-8-medtech-specific-additions-beyond-course)
12. [Implementation Priority Matrix](#implementation-priority-matrix)
13. [Migration Strategy](#migration-strategy)
---
## 1. Executive Summary
Your RagBot is a **working prototype** with strong domain logic (biomarker validation, multi-agent clinical analysis, 5D evaluation, SOP evolution). The course teaches **production infrastructure** (Docker orchestration, OpenSearch hybrid search, Airflow pipelines, Redis caching, Langfuse observability, LangGraph agentic workflows, Telegram bot).
**The strategy**: Keep your excellent medical domain logic and multi-agent architecture, but rebuild the infrastructure layer to match production standards. Your domain is *harder* than arXiv papers: medical data demands stricter validation, HIPAA-aware patterns, and safety guardrails.
### What You Have (Strengths)
- ✅ 6 specialized medical agents (Biomarker Analyzer, Disease Explainer, Biomarker-Disease Linker, Clinical Guidelines, Confidence Assessor, Response Synthesizer)
- ✅ LangGraph orchestration with parallel execution
- ✅ Robust biomarker validation with 24 biomarkers, reference ranges, critical values
- ✅ 5D evaluation framework (Clinical Accuracy, Evidence Grounding, Actionability, Clarity, Safety)
- ✅ SOP evolution engine (Outer Loop optimization)
- ✅ Multi-provider LLM support (Groq, Gemini, Ollama)
- ✅ Basic FastAPI with analysis endpoints
- ✅ CLI chatbot with natural language biomarker extraction
### What You're Missing (Gaps)
- ❌ No Docker Compose orchestration (only minimal single-service Dockerfile)
- ❌ No production database (PostgreSQL): no patient/report persistence
- ❌ No production search engine: using FAISS (in-memory, single-file, no filtering)
- ❌ No chunking strategy: basic RecursiveCharacterTextSplitter only
- ❌ No hybrid search (BM25 + vector): vector-only retrieval
- ❌ No production embeddings: using local HuggingFace MiniLM (384d) or Google free tier
- ❌ No data ingestion pipeline (Airflow): manual PDF loading
- ❌ No caching layer (Redis): every query hits LLM
- ❌ No observability (Langfuse): no tracing, no cost tracking
- ❌ No streaming responses: synchronous only
- ❌ No Gradio interface: CLI only (besides basic API)
- ❌ No messaging bot (Telegram/WhatsApp): no mobile access
- ❌ No agentic RAG with guardrails, document grading, query rewriting
- ❌ No proper dependency injection pattern (FastAPI `Depends()`)
- ❌ No Pydantic Settings with env-nested config
- ❌ No factory pattern for service initialization
- ❌ No proper exception hierarchy
- ❌ No health checks for all services
- ❌ No Makefile / dev tooling (ruff, mypy, pre-commit)
- ❌ No proper test infrastructure (pytest fixtures, test containers)
---
## 2. Deep Review: Course vs. Your Codebase
### Course Architecture (What Production Looks Like)
```
┌──────────────────────────────────────────────────────────────┐
│                 Docker Compose Orchestration                 │
├──────────┬──────────┬──────────┬──────────┬──────────────────┤
│ FastAPI  │PostgreSQL│OpenSearch│  Ollama  │     Airflow      │
│  (8000)  │  (5432)  │  (9200)  │ (11434)  │      (8080)      │
├──────────┼──────────┼──────────┼──────────┼──────────────────┤
│  Redis   │ Langfuse │ClickHouse│  MinIO   │   Langfuse-PG    │
│  (6379)  │  (3001)  │          │          │      (5433)      │
├──────────┴──────────┴──────────┴──────────┴──────────────────┤
│        Gradio UI (7861)        │        Telegram Bot         │
└──────────────────────────────────────────────────────────────┘
```
**Key Patterns from Course:**
- **Pydantic Settings** with `env_nested_delimiter="__"` for hierarchical config
- **Factory pattern** (`make_*` functions) for every service
- **Dependency injection** via FastAPI `Depends()` with typed annotations
- **Lifespan context** for startup/shutdown with proper resource management
- **Service layer separation**: `routers/` → `services/` → `clients/`
- **Schema-driven**: Separate Pydantic schemas for API, database, embeddings, indexing
- **Exception hierarchy**: Domain-specific exceptions (`PDFParsingException`, `OllamaException`, etc.)
- **Context dataclass** for LangGraph runtime dependency injection
- **Structured LLM output** via `.with_structured_output(PydanticModel)`
### Your Codebase Architecture (Current State)
```
┌─────────────────────────────────────────────┐
│          Basic FastAPI (api/app/)           │
│     Single Dockerfile, no orchestration     │
├─────────────────────────────────────────────┤
│          src/ (Core Domain Logic)           │
│  ┌───────────────────────────────────────┐  │
│  │ workflow.py (LangGraph StateGraph)    │  │
│  │ 6 agents/ (parallel execution)        │  │
│  │ biomarker_validator.py (24 markers)   │  │
│  │ pdf_processor.py (FAISS + PyPDF)      │  │
│  │ evaluation/ (5D framework)            │  │
│  │ evolution/ (SOP optimization)         │  │
│  └───────────────────────────────────────┘  │
├─────────────────────────────────────────────┤
│      FAISS vector store (single file)       │
│   No PostgreSQL, No Redis, No OpenSearch    │
└─────────────────────────────────────────────┘
```
---
## 3. Architecture Gap Analysis
| Dimension | Course (Production) | Your Codebase (Prototype) | Gap Severity |
|-----------|-------------------|--------------------------|--------------|
| **Container Orchestration** | Docker Compose with 12+ services, health checks, networks | Single Dockerfile, manual startup | 🔴 Critical |
| **Database** | PostgreSQL 16 with SQLAlchemy models, repositories | None (in-memory only) | 🔴 Critical |
| **Search Engine** | OpenSearch 2.19 with BM25 + KNN hybrid, RRF fusion | FAISS (vector-only, no filtering) | 🔴 Critical |
| **Chunking** | Section-aware chunking (600w, 100w overlap, metadata) | Basic RecursiveCharacterTextSplitter (1000 char) | 🟡 Major |
| **Embeddings** | Jina AI v3 (1024d, passage/query differentiation) | HuggingFace MiniLM (384d) or Google free tier | 🟡 Major |
| **Data Pipeline** | Airflow DAGs (daily schedule, fetch→parse→chunk→index) | Manual PDF loading, one-time setup | 🟡 Major |
| **Caching** | Redis with TTL, exact-match, SHA256 keys | None | 🟡 Major |
| **Observability** | Langfuse v3 (traces, spans, generations, cost tracking) | None (print statements only) | 🟡 Major |
| **Streaming** | SSE streaming with Gradio UI | None (synchronous responses) | 🟡 Major |
| **Agentic RAG** | LangGraph with guardrails, grading, rewriting, context_schema | Basic LangGraph (no guardrails, no grading) | 🟡 Major |
| **Bot Integration** | Telegram bot with /search, Q&A, caching | None | 🟢 Enhancement |
| **Config Management** | Pydantic Settings, hierarchical env vars, frozen models | Basic os.getenv, dotenv | 🟡 Major |
| **Dependency Injection** | FastAPI Depends() with typed annotations | Manual global singletons | 🟡 Major |
| **Error Handling** | Domain exception hierarchy, graceful fallbacks | Basic try/except with prints | 🟡 Major |
| **Code Quality** | Ruff, MyPy, pre-commit, pytest with fixtures | Minimal pytest, no linting | 🟢 Enhancement |
| **API Design** | Versioned (/api/v1/), health checks for all services | Basic routes, minimal health check | 🟡 Major |
---
## Phase 1: Infrastructure Foundation (Week 1 Equivalent)
> **Goal**: Containerize everything, add PostgreSQL for persistence, set up OpenSearch, establish professional development environment.
### 1.1 Docker Compose Orchestration
Create a production `docker-compose.yml` with all services:
```yaml
# Target services for MediGuard AI:
services:
  api:                    # FastAPI application (port 8000)
  postgres:               # Patient reports, analysis history (port 5432)
  opensearch:             # Medical document search engine (port 9200)
  opensearch-dashboards:  # Search UI (port 5601)
  redis:                  # Response caching (port 6379)
  ollama:                 # Local LLM for privacy-sensitive medical data (port 11434)
  airflow:                # Medical literature pipeline (port 8080)
  langfuse-web:           # Observability dashboard (port 3001)
  langfuse-worker/postgres/redis/clickhouse/minio:  # Langfuse infra
```
**Tasks:**
- [ ] Create root `docker-compose.yml` adapting course pattern to MedTech services
- [ ] Create multi-stage `Dockerfile` using UV package manager (copy course pattern)
- [ ] Add health checks for every service (PostgreSQL, OpenSearch, Redis, Ollama)
- [ ] Set up Docker network `mediguard-network` with proper service dependencies
- [ ] Configure volume persistence for all data stores
- [ ] Create `.env.example` with all configuration variables documented
### 1.2 Pydantic Settings Configuration
Replace scattered `os.getenv()` calls with hierarchical Pydantic Settings:
```python
# New: src/config.py (course-inspired)
# Outline only: each class body is elided with `...`.
class MedicalPDFSettings(BaseConfigSettings): ...  # PDF parser config
class ChunkingSettings(BaseConfigSettings): ...    # Chunking parameters
class OpenSearchSettings(BaseConfigSettings): ...  # Search engine config
class LangfuseSettings(BaseConfigSettings): ...    # Observability config
class RedisSettings(BaseConfigSettings): ...       # Cache config
class TelegramSettings(BaseConfigSettings): ...    # Bot config
class BiomarkerSettings(BaseConfigSettings): ...   # Biomarker thresholds
class Settings(BaseConfigSettings): ...            # Root settings
```
**Tasks:**
- [ ] Rewrite `src/config.py`: keep `ExplanationSOP` but add infrastructure settings classes
- [ ] Use `env_nested_delimiter="__"` for hierarchical environment variables
- [ ] Add `frozen=True` for immutable configuration
- [ ] Move all hardcoded values to environment variables with sensible defaults
- [ ] Create `get_settings()` factory with `@lru_cache`
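
To make the hierarchical-variable idea concrete, here is a minimal stdlib-only sketch of what a `__` delimiter does. It is not the Pydantic Settings implementation, and the field names, defaults, and variables such as `REDIS__TTL_HOURS` are assumptions chosen for illustration.

```python
import os
from dataclasses import dataclass, field, fields
from functools import lru_cache


@dataclass(frozen=True)  # frozen=True makes instances immutable
class OpenSearchSettings:
    host: str = "http://localhost:9200"
    index_name: str = "medical-chunks"  # hypothetical index name


@dataclass(frozen=True)
class RedisSettings:
    host: str = "localhost"
    ttl_hours: int = 6


@dataclass(frozen=True)
class Settings:
    opensearch: OpenSearchSettings = field(default_factory=OpenSearchSettings)
    redis: RedisSettings = field(default_factory=RedisSettings)


def _load_section(cls, prefix, env):
    """Build one section from PREFIX__FIELD variables, falling back to defaults."""
    values = {}
    for f in fields(cls):
        raw = env.get(f"{prefix}__{f.name}".upper())
        if raw is not None:
            # Coerce using the type of the declared default (str or int here).
            values[f.name] = type(f.default)(raw)
    return cls(**values)


def load_settings(env=None):
    env = os.environ if env is None else env
    return Settings(
        opensearch=_load_section(OpenSearchSettings, "opensearch", env),
        redis=_load_section(RedisSettings, "redis", env),
    )


@lru_cache
def get_settings():
    """Cached factory: the environment is read once per process."""
    return load_settings()
```

With this shape, `REDIS__TTL_HOURS=1` overrides only `settings.redis.ttl_hours`, and every other field keeps its default.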
### 1.3 PostgreSQL Database Setup
Add persistent storage for analysis history, which is critical for a medical audit trail:
```python
# New models:
# Outline only: each class body is elided with `...`.
class PatientAnalysis(Base): ...     # Store each analysis run
class AnalysisReport(Base): ...      # Store final reports
class MedicalDocument(Base): ...     # Track ingested medical PDFs
class BiomarkerReference(Base): ...  # Biomarker reference ranges (currently JSON file)
```
**Tasks:**
- [ ] Create `src/db/` package mirroring course pattern (factory, interfaces, postgresql)
- [ ] Define SQLAlchemy models for analysis history and medical documents
- [ ] Create repository pattern for data access
- [ ] Set up Alembic for database migrations
- [ ] Migrate `biomarker_references.json` to database (keep JSON as seed data)
### 1.4 Project Structure Refactor
Reorganize to match production patterns:
```
src/
├── config.py                     # Pydantic Settings (hierarchical)
├── main.py                       # FastAPI app with lifespan
├── database.py                   # Database utilities
├── dependencies.py               # FastAPI dependency injection
├── exceptions.py                 # Domain exception hierarchy
├── middlewares.py                # Request logging, timing
├── db/                           # Database layer
│   ├── factory.py
│   └── interfaces/
├── models/                       # SQLAlchemy models
│   ├── analysis.py
│   └── document.py
├── repositories/                 # Data access
│   ├── analysis.py
│   └── document.py
├── routers/                      # API endpoints
│   ├── analyze.py                # Biomarker analysis
│   ├── ask.py                    # RAG Q&A (streaming + standard)
│   ├── health.py                 # Comprehensive health checks
│   └── search.py                 # Medical document search
├── schemas/                      # Pydantic request/response models
│   ├── api/
│   ├── medical/
│   └── embeddings/
├── services/                     # Business logic
│   ├── agents/                   # Your 6 medical agents (KEEP!)
│   │   ├── biomarker_analyzer.py
│   │   ├── disease_explainer.py
│   │   ├── biomarker_linker.py
│   │   ├── clinical_guidelines.py
│   │   ├── confidence_assessor.py
│   │   ├── response_synthesizer.py
│   │   ├── agentic_rag.py        # NEW: LangGraph agentic wrapper
│   │   ├── nodes/                # NEW: Guardrail, grading, rewriting
│   │   ├── state.py              # Enhanced state
│   │   ├── context.py            # Runtime dependency injection
│   │   └── prompts.py            # Medical-domain prompts
│   ├── opensearch/               # NEW: Search engine client
│   ├── embeddings/               # NEW: Production embeddings
│   ├── cache/                    # NEW: Redis caching
│   ├── langfuse/                 # NEW: Observability
│   ├── ollama/                   # NEW: Local LLM client
│   ├── indexing/                 # NEW: Chunking + indexing
│   ├── pdf_parser/               # Enhanced: Use Docling
│   ├── telegram/                 # NEW: Bot integration
│   └── biomarker/                # Extracted: validation + normalization
├── evaluation/                   # KEEP: 5D evaluation
└── evolution/                    # KEEP: SOP evolution
```
**Tasks:**
- [ ] Create the new directory structure
- [ ] Move API from `api/app/` into `src/` (single application)
- [ ] Create `exceptions.py` with medical-domain exception hierarchy
- [ ] Create `dependencies.py` with typed FastAPI dependency injection
- [ ] Create `main.py` with proper lifespan context manager
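
As a rough illustration of the `exceptions.py` task, a medical-domain hierarchy could look like the sketch below. Every class name and the HTTP status mapping are hypothetical; the course's own exceptions (such as `PDFParsingException`) may be organized differently.

```python
class MediGuardError(Exception):
    """Root of the domain hierarchy; catch this at the API boundary."""


class BiomarkerValidationError(MediGuardError):
    """A submitted biomarker value failed validation."""

    def __init__(self, biomarker: str, reason: str) -> None:
        super().__init__(f"{biomarker}: {reason}")
        self.biomarker = biomarker
        self.reason = reason


class RetrievalError(MediGuardError):
    """The search backend was unreachable or returned an error."""


class LLMProviderError(MediGuardError):
    """A call to an LLM provider failed."""


def to_http_status(exc: MediGuardError) -> int:
    """Map domain errors to HTTP status codes (the mapping is an assumption)."""
    if isinstance(exc, BiomarkerValidationError):
        return 422  # the client sent data that cannot be processed
    if isinstance(exc, (RetrievalError, LLMProviderError)):
        return 503  # a dependency is unavailable
    return 500
```

A single root class lets one FastAPI exception handler cover all domain errors while still branching on the subclass.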
### 1.5 Development Tooling
**Tasks:**
- [ ] Create `pyproject.toml` replacing `requirements.txt` (use UV)
- [ ] Create `Makefile` with start/stop/test/lint/format/health commands
- [ ] Add `ruff` for linting and formatting
- [ ] Add `mypy` for type checking
- [ ] Add `.pre-commit-config.yaml`
- [ ] Create `.env.example` and `.env.test`
---
## Phase 2: Medical Data Ingestion Pipeline (Week 2 Equivalent)
> **Goal**: Automated ingestion of medical PDFs, clinical guidelines, and reference documents with Airflow orchestration.
### 2.1 Medical PDF Parser Upgrade
Replace basic PyPDF with Docling for better medical document handling:
**Tasks:**
- [ ] Create `src/services/pdf_parser/` with Docling integration (copy course pattern)
- [ ] Add medical-specific section detection (Abstract, Methods, Results, Discussion, Clinical Guidelines)
- [ ] Add table extraction for lab reference ranges
- [ ] Add validation: file size limits, page limits, PDF header check
- [ ] Add metadata extraction: title, authors, publication date, journal
### 2.2 Medical Document Sources
Unlike arXiv (single API), medical literature comes from multiple sources:
**Tasks:**
- [ ] Create `src/services/medical_sources/` package
- [ ] Implement PubMed API client (free, rate-limited) for research papers
- [ ] Implement local PDF upload endpoint for clinical guidelines
- [ ] Implement reference document ingestion (WHO, CDC, ADA guidelines)
- [ ] Create document deduplication logic (by title hash + content fingerprint)
- [ ] Add `MedicalDocument` model tracking: source, parse status, indexing status
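
The deduplication task above can be sketched with `hashlib`. This version treats a document as a duplicate when either the title hash or the content fingerprint has been seen. That rule, the 2000-character sample, and all names are assumptions, not a specification from the course.

```python
import hashlib
import re


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def title_hash(title: str) -> str:
    return hashlib.sha256(_normalize(title).encode("utf-8")).hexdigest()


def content_fingerprint(text: str, sample_chars: int = 2000) -> str:
    """Hash a normalized prefix of the body (the sample size is an assumption)."""
    sample = _normalize(text)[:sample_chars]
    return hashlib.sha256(sample.encode("utf-8")).hexdigest()


class DocumentDeduplicator:
    """In-memory stand-in; a real version would query PostgreSQL instead."""

    def __init__(self) -> None:
        self._titles = set()
        self._contents = set()

    def is_duplicate(self, title: str, text: str) -> bool:
        """Return True if either key was seen before; otherwise record both."""
        t, c = title_hash(title), content_fingerprint(text)
        if t in self._titles or c in self._contents:
            return True
        self._titles.add(t)
        self._contents.add(c)
        return False
```

An exact hash only catches re-uploads and whitespace or case variants; near-duplicate detection (for example, revised editions of a guideline) would need a different technique.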
### 2.3 Airflow Pipeline for Medical Literature
**Tasks:**
- [ ] Create `airflow/` directory with Dockerfile and entrypoint
- [ ] Create `airflow/dags/medical_ingestion.py` DAG:
  - `setup_environment` → `fetch_new_documents` → `parse_pdfs` → `chunk_and_index` → `generate_report`
- [ ] Schedule: Daily at 6 AM for PubMed updates, on-demand for uploaded PDFs
- [ ] Add retry logic with exponential backoff
- [ ] Mount `src/` into Airflow container for shared code
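
The retry task can be illustrated independently of Airflow. The helper below is a minimal sketch of exponential backoff for a flaky call such as a PubMed fetch. The function name and default delays are assumptions, and an Airflow DAG would normally rely on the operator-level retry settings rather than hand-rolled code.

```python
import time


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, factor=2.0,
                       sleep=time.sleep):
    """Call func() until it succeeds, waiting base_delay * factor**n between tries.

    The last failure is re-raised once max_attempts is exhausted.
    `sleep` is injectable so tests do not have to wait.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * factor ** (attempt - 1))
```

With the defaults, the waits between attempts are 1 s, 2 s, and 4 s.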
### 2.4 PostgreSQL Storage for Documents
**Tasks:**
- [ ] Create `MedicalDocument` model: id, title, source, source_type, authors, abstract, raw_text, sections, parse_status, indexed_at
- [ ] Create `PaperRepository` with CRUD + upsert + status tracking
- [ ] Track processing pipeline: `uploaded → parsed → chunked → indexed`
- [ ] Store parsed sections as JSON for re-indexing without re-parsing
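
One way to keep the processing-status column honest is to allow only forward, single-step transitions. The enum and helper below are a hypothetical sketch of that idea; the status values come from the pipeline listed above, everything else is assumed.

```python
from enum import Enum


class DocumentStatus(str, Enum):
    UPLOADED = "uploaded"
    PARSED = "parsed"
    CHUNKED = "chunked"
    INDEXED = "indexed"


# Each status may only move to the next stage of the pipeline.
_NEXT_STATUS = {
    DocumentStatus.UPLOADED: DocumentStatus.PARSED,
    DocumentStatus.PARSED: DocumentStatus.CHUNKED,
    DocumentStatus.CHUNKED: DocumentStatus.INDEXED,
}


def advance(current: DocumentStatus) -> DocumentStatus:
    """Return the next status, or raise if the document is already indexed."""
    try:
        return _NEXT_STATUS[current]
    except KeyError:
        raise ValueError(f"{current.value!r} is the final status") from None
```

Because the enum subclasses `str`, its values can be stored directly in a text column and compared with plain strings.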
---
## Phase 3: Production Search Foundation (Week 3 Equivalent)
> **Goal**: Replace FAISS with OpenSearch for production BM25 keyword search with medical-specific optimizations.
### 3.1 OpenSearch Client
**Tasks:**
- [ ] Create `src/services/opensearch/` package (adapt course pattern)
- [ ] Implement `OpenSearchClient` with:
  - Health check, index management, BM25 search, bulk indexing
  - **Medical-specific**: Boost clinical term matches, support ICD-10 code filtering
- [ ] Create `QueryBuilder` with medical field boosting:
```
fields: ["chunk_text^3", "title^2", "section_title^1.5", "abstract^1"]
```
- [ ] Create `index_config_hybrid.py` with medical document mapping:
  - Fields: chunk_text, title, authors, abstract, document_type (guideline/research/reference), condition_tags, publication_year
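
As a sketch of what the `QueryBuilder` might emit, the function below builds a BM25 request body as a plain dictionary using the boosted fields listed above. The function name and parameters are hypothetical; the body uses the standard `bool`, `multi_match`, `term`, and `terms` query clauses.

```python
BOOSTED_FIELDS = ["chunk_text^3", "title^2", "section_title^1.5", "abstract^1"]


def build_bm25_query(query_text, document_type=None, condition_tags=None, size=10):
    """Return a search body: boosted full-text match plus optional keyword filters."""
    filters = []
    if document_type:
        filters.append({"term": {"document_type": document_type}})
    if condition_tags:
        filters.append({"terms": {"condition_tags": list(condition_tags)}})
    return {
        "size": size,
        "query": {
            "bool": {
                # "must" contributes to the BM25 score.
                "must": [
                    {
                        "multi_match": {
                            "query": query_text,
                            "fields": BOOSTED_FIELDS,
                            "type": "best_fields",
                        }
                    }
                ],
                # "filter" narrows results without affecting the score.
                "filter": filters,
            }
        },
    }
```

Keeping the builder free of any client dependency makes it easy to unit-test the generated body before sending it to the cluster.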
### 3.2 Medical Document Index Mapping
```python
MEDICAL_CHUNKS_MAPPING = {
    "settings": {
        "index.knn": True,
        "analysis": {
            "filter": {
                # Definition for the "medical_synonyms" filter used below.
                # It runs after "lowercase", so entries are lowercase.
                "medical_synonyms": {
                    "type": "synonym",
                    "synonyms": [
                        "diabetes mellitus, dm",
                        "hba1c, glycated hemoglobin",
                    ],
                }
            },
            "analyzer": {
                "medical_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "medical_synonyms", "stop", "snowball"],
                }
            },
        },
    },
    "mappings": {
        "properties": {
            "chunk_text": {"type": "text", "analyzer": "medical_analyzer"},
            "document_type": {"type": "keyword"},         # guideline, research, reference
            "condition_tags": {"type": "keyword"},        # diabetes, anemia, etc.
            "biomarkers_mentioned": {"type": "keyword"},  # Glucose, HbA1c, etc.
            "embedding": {"type": "knn_vector", "dimension": 1024},
            # ... more fields
        }
    },
}
```
**Tasks:**
- [ ] Design medical-optimized OpenSearch mapping
- [ ] Add medical synonym analyzer (e.g., "diabetes mellitus" ↔ "DM", "HbA1c" ↔ "glycated hemoglobin")
- [ ] Create search endpoint `POST /api/v1/search` with filtering by document_type, condition_tags
- [ ] Implement BM25 search with medical field boosting
- [ ] Create index verification in startup lifespan
---
## Phase 4: Hybrid Search & Intelligent Chunking (Week 4 Equivalent)
> **Goal**: Section-aware chunking for medical documents + hybrid search (BM25 + semantic) with RRF fusion.
### 4.1 Medical-Aware Text Chunking
**Tasks:**
- [ ] Create `src/services/indexing/text_chunker.py` adapting course's `TextChunker`:
  - Section-aware chunking (detect: Introduction, Methods, Results, Discussion, Guidelines, References)
  - Target: 600 words per chunk, 100 word overlap
  - Medical metadata: section_title, biomarkers_mentioned, condition_tags
- [ ] Create `MedicalTextChunker` subclass with:
  - Biomarker mention detection (scan for any of 24+ biomarker names)
  - Condition tag extraction (diabetes, anemia, heart disease, etc.)
  - Table-aware chunking (keep tables together)
  - Reference section filtering (skip bibliography chunks)
- [ ] Create `HybridIndexingService` for chunk → embed → index pipeline
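
The 600-word target with 100-word overlap can be sketched as a sliding window over whitespace-separated words. This is deliberately not section-aware and is not the course's `TextChunker`; the function names and the two-name biomarker list are placeholders.

```python
def chunk_words(text, chunk_size=600, overlap=100):
    """Split text into word windows of chunk_size that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the current window already reached the end of the text
    return chunks


def tag_biomarkers(chunk, names=("Glucose", "HbA1c")):
    """Return the biomarker names mentioned in a chunk (case-insensitive)."""
    lowered = chunk.lower()
    return [name for name in names if name.lower() in lowered]
```

A section-aware version would run this per detected section and attach `section_title` to each chunk, so no chunk straddles two sections.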
### 4.2 Production Embeddings
**Tasks:**
- [ ] Create `src/services/embeddings/` with Jina AI client (1024d, passage/query differentiation)
- [ ] Add fallback chain: Jina → Google → HuggingFace
- [ ] Implement batch embedding for efficient indexing
- [ ] Track embedding model in chunk metadata for versioning
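
A fallback chain can be expressed as an ordered list of providers tried in turn. The sketch below uses plain callables as stand-ins for the Jina, Google, and HuggingFace clients; the function name and signature are assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Embedder = Callable[[List[str]], List[List[float]]]


def embed_with_fallback(
    texts: List[str], providers: Sequence[Tuple[str, Embedder]]
) -> Tuple[str, List[List[float]]]:
    """Try each provider in order; return (provider_name, vectors) from the first success."""
    errors = []
    for name, embed in providers:
        try:
            return name, embed(texts)
        except Exception as exc:  # any provider failure triggers the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all embedding providers failed: " + "; ".join(errors))
```

Returning the provider name matters because vectors from different models are not comparable (1024d versus 384d in the plan above). The caller can record it in chunk metadata and should avoid mixing models within one index.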
### 4.3 Hybrid Search with RRF
**Tasks:**
- [ ] Implement `search_unified()` supporting: BM25-only, vector-only, hybrid modes
- [ ] Set up OpenSearch RRF (Reciprocal Rank Fusion) pipeline
- [ ] Create unified search endpoint `POST /api/v1/hybrid-search/`
- [ ] Add min_score filtering and result deduplication
- [ ] Benchmark: BM25 vs. vector vs. hybrid on medical queries
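
For intuition about what RRF computes, the sketch below fuses ranked lists client-side: each document scores the sum of `1 / (k + rank)` over the lists it appears in. In the plan above the fusion runs inside OpenSearch, so this is only an illustration of the formula; `k=60` is the commonly used constant, assumed here.

```python
def rrf_fuse(rankings, k=60, top_n=None):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings: iterable of lists, each ordered best-first.
    Returns (doc_id, score) pairs sorted by descending fused score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Ties are broken by document ID to keep the output deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused if top_n is None else fused[:top_n]
```

Because only ranks are used, BM25 scores and vector similarities never need to be normalized onto a common scale.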
---
## Phase 5: Complete RAG Pipeline with Streaming (Week 5 Equivalent)
> **Goal**: Replace synchronous analysis with streaming RAG, add Gradio UI, optimize prompts.
### 5.1 Ollama Client Upgrade
**Tasks:**
- [ ] Create `src/services/ollama/` package (adapt course pattern)
- [ ] Implement `OllamaClient` with:
  - Health check, model listing, generate, streaming generate
  - Usage metadata extraction (tokens, latency)
  - LangChain integration: `get_langchain_model()` for structured output
- [ ] Create medical-specific RAG prompt templates:
  - `rag_medical_system.txt`: optimized for medical explanation generation
  - Structured output format for clinical responses
- [ ] Create `OllamaFactory` with `@lru_cache`
### 5.2 Streaming RAG Endpoints
**Tasks:**
- [ ] Create `POST /api/v1/ask`: standard RAG with medical context retrieval
- [ ] Create `POST /api/v1/stream`: SSE streaming for real-time responses
- [ ] Create `POST /api/v1/analyze/stream`: streaming biomarker analysis
- [ ] Integrate with existing multi-agent pipeline:
```
Query → Hybrid Search → Medical Chunks → Agent Pipeline → Streaming Response
```
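
The wire format behind the SSE endpoints is simple enough to sketch without a web framework. The generator below frames each token as a `data:` line followed by a blank line. The payload keys (`token`, `done`) and the `end` event name are assumptions, not the course's schema.

```python
import json
from typing import Iterable, Iterator, Optional


def sse_event(data: dict, event: Optional[str] = None) -> str:
    """Format one Server-Sent Event: optional event name, one data line, blank line."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # json.dumps emits no raw newlines by default, so one data line is enough.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def stream_answer(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one event per generated token, then a final end-of-stream event."""
    for token in tokens:
        yield sse_event({"token": token})
    yield sse_event({"done": True}, event="end")
```

In FastAPI, a generator like `stream_answer` would typically be wrapped in a streaming response with the `text/event-stream` media type.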
### 5.3 Gradio Medical Interface
**Tasks:**
- [ ] Create `src/gradio_app.py` for interactive medical RAG:
  - Biomarker input form (structured entry)
  - Natural language input (free text)
  - Streaming response display
  - Search mode selector (BM25, hybrid, vector)
  - Model selector
  - Analysis history display
- [ ] Create `gradio_launcher.py` for easy startup
- [ ] Expose on port 7861
### 5.4 Prompt Optimization
**Tasks:**
- [ ] Reduce prompt size by 60-80% (course achieved 80% reduction)
- [ ] Create focused medical prompts (separate: biomarker analysis, disease explanation, guidelines)
- [ ] Test prompt variants using 5D evaluation framework
- [ ] Store best prompts as SOP parameters (tie into evolution engine)
---
## Phase 6: Monitoring, Caching & Observability (Week 6 Equivalent)
> **Goal**: Add Langfuse tracing for the entire pipeline, Redis caching, and production monitoring.
### 6.1 Langfuse Integration
**Tasks:**
- [ ] Create `src/services/langfuse/` package (adapt course pattern):
  - `client.py`: LangfuseTracer wrapper with v3 SDK
  - `factory.py`: cached tracer factory
  - `tracer.py`: medical-specific RAGTracer with named steps
- [ ] Add spans for every pipeline step:
  - `biomarker_validation` → `query_embedding` → `search_retrieval` → `agent_execution` → `response_synthesis`
- [ ] Track per-request metrics:
  - Total latency, LLM tokens used, search results count, cache hit/miss, agent execution time
- [ ] Add Langfuse Docker services to docker-compose.yml
- [ ] Create trace visualization for medical analysis pipeline
### 6.2 Redis Caching
**Tasks:**
- [ ] Create `src/services/cache/` package (adapt course pattern):
  - Exact-match cache: SHA256(query + model + top_k + biomarkers) → cached response
  - TTL: 6 hours for general queries, 1 hour for biomarker analysis (values may change)
- [ ] Add caching to:
  - `/api/v1/ask`: cache RAG responses
  - `/api/v1/analyze`: cache full analysis results
  - Embeddings: cache frequently queried embeddings
- [ ] Add graceful fallback: cache miss → normal pipeline
- [ ] Track cache hit rates in Langfuse
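
The exact-match key described above can be sketched with `hashlib` and a canonical JSON encoding, so that biomarker dictionaries hash the same regardless of key order. The `mediguard:` prefix, the whitespace normalization, and the in-memory cache class are assumptions; a production version would call Redis instead.

```python
import hashlib
import json
import time


def make_cache_key(query, model, top_k, biomarkers=None):
    """Build a deterministic SHA256 key from the request parameters."""
    payload = {
        "query": " ".join(query.split()),  # collapse whitespace only
        "model": model,
        "top_k": top_k,
        "biomarkers": biomarkers or {},
    }
    # sort_keys makes the encoding independent of dict insertion order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "mediguard:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class InMemoryTTLCache:
    """Stand-in for a Redis value stored with an expiry, for illustration only."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}

    def set(self, key, value, ttl_seconds):
        self._store[key] = (self._clock() + ttl_seconds, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: behave like a miss
            return None
        return value
```

Under the TTLs listed above, a biomarker analysis would be stored with `ttl_seconds=3600` and a general query with `ttl_seconds=6 * 3600`.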
### 6.3 Production Health Dashboard
**Tasks:**
- [ ] Enhance `/api/v1/health` to check all services:
  - PostgreSQL, OpenSearch, Redis, Ollama, Langfuse, Airflow
- [ ] Add `/api/v1/metrics` endpoint for operational metrics
- [ ] Create Langfuse dashboard for:
  - Average response time, cache hit rate, error rate, token costs
  - Per-agent execution times, search relevance scores
---
## Phase 7: Agentic RAG & Messaging Bot (Week 7 Equivalent)
> **Goal**: Wrap your multi-agent pipeline in a LangGraph agentic workflow with guardrails, document grading, and query rewriting. Add Telegram bot for mobile access.
### 7.1 Agentic RAG Wrapper
This is the most impactful upgrade because it adds **intelligence around your existing agents**:
```
User Query
↓
[GUARDRAIL] ──── Is this a medical/biomarker question? ────→ [OUT OF SCOPE]
↓ yes
[RETRIEVE] ──── Hybrid search for medical documents ────→ [TOOL: search]
↓
[GRADE DOCUMENTS] ──── Are results relevant? ────→ [REWRITE QUERY] ──→ loop
↓ yes
[CLINICAL ANALYSIS] ──── Your 6 medical agents ────→ structured analysis
↓
[GENERATE RESPONSE] ──── Synthesize with citations ────→ final answer
```
**Tasks:**
- [ ] Create `src/services/agents/agentic_rag.py` with the `AgenticRAGService` class
- [ ] Create `src/services/agents/nodes/`:
  - `guardrail_node.py`: Medical domain validation (score 0-100)
    - In-scope: biomarker questions, disease queries, clinical guidelines
    - Out-of-scope: non-medical, general knowledge, harmful content
  - `retrieve_node.py`: Creates tool call with `max_retrieval_attempts`
  - `grade_documents_node.py`: LLM evaluates medical relevance
  - `rewrite_query_node.py`: LLM rewrites for better medical retrieval
  - `generate_answer_node.py`: Uses your existing agent pipeline OR direct LLM
  - `out_of_scope_node.py`: Polite medical-domain rejection
- [ ] Create `src/services/agents/state.py`: Enhanced state with guardrail_result, routing_decision, grading_results
- [ ] Create `src/services/agents/context.py`: Runtime context for dependency injection
- [ ] Create `src/services/agents/prompts.py` with medical-specific prompts:
  - Guardrail: "Is this about health/biomarkers/medical conditions?"
  - Grading: "Does this medical document answer the clinical question?"
  - Rewriting: "Improve this medical query for better document retrieval"
  - Generation: "Synthesize medical findings with citations and safety caveats"
- [ ] Create `src/services/agents/tools.py`: Medical retriever tool wrapping OpenSearch
- [ ] Create `POST /api/v1/ask-agentic` endpoint
- [ ] Add Langfuse tracing to every node
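
The routing decisions in the flow above reduce to two small pure functions, sketched here without LangGraph so they can be tested in isolation. The threshold of 60, the state fields, and the choice to answer anyway once rewrites are exhausted are assumptions. In a LangGraph graph, functions of this shape would serve as conditional-edge routers.

```python
from dataclasses import dataclass, field
from typing import List

GUARDRAIL_THRESHOLD = 60  # assumed cut-off on the 0-100 guardrail score


@dataclass
class AgentState:
    query: str
    guardrail_score: int = 0
    retrieval_attempts: int = 0
    max_retrieval_attempts: int = 2
    relevant_docs: List[str] = field(default_factory=list)


def route_after_guardrail(state: AgentState) -> str:
    """Send in-scope medical questions to retrieval, everything else to rejection."""
    if state.guardrail_score >= GUARDRAIL_THRESHOLD:
        return "retrieve"
    return "out_of_scope"


def route_after_grading(state: AgentState) -> str:
    """Answer when documents are relevant; otherwise rewrite until attempts run out."""
    if state.relevant_docs:
        return "generate_answer"
    if state.retrieval_attempts < state.max_retrieval_attempts:
        return "rewrite_query"
    # No relevant documents and no attempts left: answer with explicit caveats.
    return "generate_answer"
```

Bounding the rewrite loop with `max_retrieval_attempts` is what prevents the grade/rewrite cycle from running indefinitely on unanswerable questions.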
### 7.2 Medical Guardrails (Critical for MedTech)
Beyond the course's simple domain check, add medical-specific safety:
**Tasks:**
- [ ] **Input guardrails**:
  - Detect harmful queries (self-harm, drug abuse guidance)
  - Detect attempts to get diagnosis without proper data
  - Validate biomarker values are physiologically plausible
- [ ] **Output guardrails**:
  - Always include "consult your healthcare provider" disclaimer
  - Never provide definitive diagnosis (always "suggests" / "may indicate")
  - Flag critical biomarker values with immediate action advice
  - Ensure safety_alerts are present for out-of-range values
- [ ] **Citation guardrails**:
  - Ensure all medical claims have document citations
  - Flag unsupported claims
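
Two of the output guardrails above (the disclaimer and the softened wording) can be sketched as a post-processing step. The phrase list and disclaimer text are illustrative assumptions. Regex rewriting is a crude last line of defense and does not replace prompt-level constraints or clinical review.

```python
import re

DISCLAIMER = (
    "Please consult your healthcare provider before acting on this information."
)

# Definitive phrasings and their hedged replacements (illustrative, not exhaustive).
_SOFTENINGS = [
    (re.compile(r"\bdefinitely indicates\b", re.IGNORECASE), "may indicate"),
    (re.compile(r"\bconfirms\b", re.IGNORECASE), "suggests"),
]


def apply_output_guardrails(answer: str) -> str:
    """Soften definitive diagnostic wording and ensure the disclaimer is present."""
    for pattern, softer in _SOFTENINGS:
        answer = pattern.sub(softer, answer)
    if "consult your healthcare provider" not in answer.lower():
        answer = answer.rstrip() + "\n\n" + DISCLAIMER
    return answer
```

The function is idempotent, so it is safe to apply at more than one layer (for example, in both the API and the Telegram bot).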
### 7.3 Telegram Bot Integration
**Tasks:**
- [ ] Create `src/services/telegram/` package (adapt course pattern)
- [ ] Implement bot commands:
  - `/start`: Welcome with medical assistant introduction
  - `/help`: Show capabilities and input format
  - `/analyze <biomarker values>`: Quick biomarker analysis
  - `/search <medical query>`: Search medical documents
  - `/report`: Get last analysis as formatted report
  - Free text: Full RAG Q&A about medical topics
- [ ] Add typing indicators and progress messages
- [ ] Integrate caching for repeated queries
- [ ] Add rate limiting (medical queries shouldn't be spammed)
- [ ] Create `TelegramFactory` gated by `TELEGRAM__ENABLED=true`
### 7.4 Feedback Loop
**Tasks:**
- [ ] Create `POST /api/v1/feedback` endpoint (adapt from course)
- [ ] Integrate with Langfuse scoring
- [ ] Use feedback data to identify weak prompts → feed into SOP evolution engine
---
## Phase 8: MedTech-Specific Additions (Beyond Course)
> **Goal**: Things the course doesn't cover but your medical domain demands.
### 8.1 HIPAA-Awareness Patterns
**Tasks:**
- [ ] Never log patient biomarker values in plain text
- [ ] Add request ID tracking without PII
- [ ] Create data retention policy (auto-delete analysis data after configurable period)
- [ ] Add audit logging for all analysis requests
- [ ] Document HIPAA compliance approach (even if not yet certified)
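
The first task above (no biomarker values in plain-text logs) can be approximated with a `logging.Filter` that masks numbers following known biomarker names. The class and helper names are hypothetical, and pattern-based redaction only catches the formats it anticipates. The safer default is to avoid passing values to the logger at all.

```python
import logging
import re


class BiomarkerRedactingFilter(logging.Filter):
    """Mask numeric values that directly follow a known biomarker name."""

    def __init__(self, biomarker_names):
        super().__init__()
        names = "|".join(re.escape(name) for name in biomarker_names)
        self._pattern = re.compile(
            rf"\b({names})\b(\s*[:=]?\s*)\d+(?:\.\d+)?", re.IGNORECASE
        )

    def filter(self, record):
        # Render the final message first so %-style arguments are covered too.
        record.msg = self._pattern.sub(r"\1\2[REDACTED]", record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted


def make_redacting_logger(name, biomarker_names, stream):
    """Create a logger whose handler redacts biomarker values before writing."""
    handler = logging.StreamHandler(stream)
    handler.addFilter(BiomarkerRedactingFilter(biomarker_names))
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid an unredacted copy via the root logger
    return logger
```

Request IDs and other non-numeric context pass through untouched, which keeps the logs useful for audit trails.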
### 8.2 Medical Safety Testing
**Tasks:**
- [ ] Create medical-specific test suite:
  - Critical value detection tests (every critical biomarker)
  - Guardrail rejection tests (non-medical queries)
  - Citation completeness tests
  - Safety disclaimer presence tests
  - Biomarker normalization tests (already have some)
- [ ] Integrate 5D evaluation into CI pipeline
- [ ] Create test fixtures with realistic medical scenarios
### 8.3 Evolution Engine Integration
**Tasks:**
- [ ] Wire SOP evolution engine to production metrics (Langfuse data)
- [ ] Create Airflow DAG for scheduled evolution cycles
- [ ] Store evolved SOPs in PostgreSQL with version tracking
- [ ] A/B test SOP variants using Langfuse trace comparison
### 8.4 Multi-condition Support
**Tasks:**
- [ ] Extend condition coverage beyond current 5 diseases
- [ ] Add condition-specific retrieval strategies
- [ ] Create condition-specific chunking filters
- [ ] Support multi-condition analysis (comorbidities)
---
## Implementation Priority Matrix
| Priority | Phase | Effort | Impact | Dependencies |
|----------|-------|--------|--------|--------------|
| 🔴 P0 | 1.1 Docker Compose | 2 days | Critical | None |
| 🔴 P0 | 1.2 Pydantic Settings | 1 day | Critical | None |
| 🔴 P0 | 1.4 Project Restructure | 2 days | Critical | None |
| 🔴 P0 | 1.5 Dev Tooling | 0.5 day | Critical | 1.4 |
| 🔴 P0 | 1.3 PostgreSQL + Models | 2 days | Critical | 1.1, 1.4 |
| 🟡 P1 | 3.1 OpenSearch Client | 2 days | High | 1.1, 1.4 |
| 🟡 P1 | 3.2 Medical Index Mapping | 1 day | High | 3.1 |
| 🟡 P1 | 4.1 Medical Text Chunker | 2 days | High | 3.1 |
| 🟡 P1 | 4.2 Production Embeddings | 1 day | High | 4.1 |
| 🟡 P1 | 4.3 Hybrid Search + RRF | 1 day | High | 3.1, 4.2 |
| 🟡 P1 | 5.1 Ollama Client | 1 day | High | 1.4 |
| 🟡 P1 | 5.2 Streaming Endpoints | 1 day | High | 5.1, 4.3 |
| 🟡 P1 | 2.1 PDF Parser (Docling) | 1 day | High | 1.4 |
| 🟡 P1 | 7.1 Agentic RAG Wrapper | 3 days | High | 5.2, 4.3 |
| 🟡 P1 | 7.2 Medical Guardrails | 2 days | High | 7.1 |
| 🟢 P2 | 2.3 Airflow Pipeline | 2 days | Medium | 1.1, 2.1, 4.1 |
| 🟢 P2 | 5.3 Gradio Interface | 1 day | Medium | 5.2 |
| 🟢 P2 | 6.1 Langfuse Tracing | 2 days | Medium | 1.1, 5.2 |
| 🟢 P2 | 6.2 Redis Caching | 1 day | Medium | 1.1, 5.2 |
| 🟢 P2 | 6.3 Health Dashboard | 0.5 day | Medium | 6.1 |
| 🟢 P2 | 7.3 Telegram Bot | 2 days | Medium | 7.1, 6.2 |
| 🟢 P2 | 7.4 Feedback Loop | 0.5 day | Medium | 6.1 |
| 🔵 P3 | 2.2 Medical Sources | 2 days | Low | 2.1 |
| 🔵 P3 | 8.1 HIPAA Patterns | 1 day | Low | 1.3 |
| 🔵 P3 | 8.2 Safety Testing | 2 days | Low | 7.2 |
| 🔵 P3 | 8.3 Evolution Integration | 2 days | Low | 6.1, 2.3 |
| 🔵 P3 | 8.4 Multi-condition | 3 days | Low | 4.1 |
**Estimated Total: ~40 days of focused work**
---
## Migration Strategy
### Step 1: Foundation (Week 1-2 of work)
1. Restructure project layout → Phase 1.4
2. Create Pydantic Settings → Phase 1.2
3. Set up Docker Compose → Phase 1.1
4. Add PostgreSQL with models → Phase 1.3
5. Add dev tooling → Phase 1.5
### Step 2: Search Engine (Week 2-3)
6. Create OpenSearch client + medical mapping → Phase 3.1, 3.2
7. Build medical text chunker → Phase 4.1
8. Add production embeddings (Jina) → Phase 4.2
9. Implement hybrid search + RRF → Phase 4.3
10. Upgrade PDF parser to Docling → Phase 2.1
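For orientation on the hybrid search step, Reciprocal Rank Fusion (RRF) merges several ranked result lists by summing `1 / (k + rank)` for each document across the lists. The sketch below is a client-side illustration of that formula only. It is not how the plan's OpenSearch integration is implemented; the function name, the document IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant; 60 is a common default, assumed here.
    Returns (doc_id, score) pairs sorted by descending fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Break score ties by doc_id so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a keyword (BM25) query and a vector query.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
fused_ids = [doc_id for doc_id, _ in fused]
print(fused_ids)
```

Documents that appear in both lists accumulate two terms, so they tend to outrank documents that only one retriever found, without any need to normalize BM25 and cosine scores onto a common scale.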
### Step 3: RAG Pipeline (Week 3-4)
11. Create Ollama client → Phase 5.1
12. Add streaming endpoints → Phase 5.2
13. Build agentic RAG wrapper → Phase 7.1
14. Add medical guardrails → Phase 7.2
15. Create Gradio interface → Phase 5.3
### Step 4: Production Hardening (Week 4-5)
16. Add Langfuse observability → Phase 6.1
17. Add Redis caching → Phase 6.2
18. Set up Airflow pipeline → Phase 2.3
19. Build Telegram bot → Phase 7.3
20. Add feedback loop → Phase 7.4
### Step 5: Polish (Week 5-6)
21. Health dashboard → Phase 6.3
22. Medical safety testing → Phase 8.2
23. HIPAA patterns → Phase 8.1
24. Evolution engine integration → Phase 8.3
### Key Migration Rules
- **Never break what works**: Keep all existing agents functional throughout
- **Test at every step**: Run existing tests after each phase
- **Incremental Docker**: Start with API + PostgreSQL, add services one at a time
- **Feature flags**: Gate new features (Telegram, Langfuse, Redis) behind settings
- **Backward compatibility**: Keep CLI chatbot working alongside new API
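The feature-flag and incremental-Docker rules can be sketched as follows. This is a standard-library stand-in for the Pydantic Settings class planned in Phase 1.2, kept dependency-free so it runs anywhere. The environment variable names (`MEDIGUARD_*_ENABLED`), the `FeatureFlags` class, and the service names are hypothetical.

```python
import os
from dataclasses import dataclass

_TRUTHY = {"1", "true", "yes", "on"}


def _flag(name, env, default=False):
    """Read a boolean flag from a mapping of environment variables."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


@dataclass(frozen=True)
class FeatureFlags:
    # All optional integrations default to off, so the existing
    # CLI chatbot and agents keep working with no extra services.
    telegram_enabled: bool = False
    langfuse_enabled: bool = False
    redis_enabled: bool = False

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        return cls(
            telegram_enabled=_flag("MEDIGUARD_TELEGRAM_ENABLED", env),
            langfuse_enabled=_flag("MEDIGUARD_LANGFUSE_ENABLED", env),
            redis_enabled=_flag("MEDIGUARD_REDIS_ENABLED", env),
        )


def enabled_services(flags):
    """Core services first, then whichever optional ones are switched on."""
    services = ["api", "postgres"]
    if flags.redis_enabled:
        services.append("redis")
    if flags.langfuse_enabled:
        services.append("langfuse")
    if flags.telegram_enabled:
        services.append("telegram")
    return services


flags = FeatureFlags.from_env({"MEDIGUARD_REDIS_ENABLED": "true"})
print(enabled_services(flags))
```

Passing an explicit mapping to `from_env` keeps the flags testable without touching the real process environment.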
---
## Architecture Target State
```
┌────────────────────────────────────────────────────────────────────────┐
│                      Docker Compose Orchestration                      │
│                                                                        │
│  ┌──────────┐ ┌───────────┐ ┌────────────┐ ┌────────┐ ┌─────────┐      │
│  │ FastAPI  │ │PostgreSQL │ │ OpenSearch │ │ Ollama │ │ Airflow │      │
│  │ + Gradio │ │ (reports, │ │ (hybrid    │ │ (local │ │ (daily  │      │
│  │ (8000,   │ │  docs,    │ │  medical   │ │  LLM)  │ │ ingest) │      │
│  │  7861)   │ │  history) │ │  search)   │ │        │ │         │      │
│  └────┬─────┘ └─────┬─────┘ └─────┬──────┘ └───┬────┘ └────┬────┘      │
│       │             │             │            │           │           │
│  ┌────┴─────┐ ┌─────┴─────┐ ┌─────┴────────────┴───────────┴────┐      │
│  │  Redis   │ │ Langfuse  │ │         mediguard-network         │      │
│  │ (cache)  │ │ (observe) │ └───────────────────────────────────┘      │
│  └──────────┘ └───────────┘                                            │
│                                                                        │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │                       Agentic RAG Pipeline                       │  │
│  │                                                                  │  │
│  │  Query → [Guardrail] → [Retrieve] → [Grade] → [6 Medical Agents] │  │
│  │               ↓            ↑           ↓              ↓          │  │
│  │         [Out of Scope]  [Rewrite]  [Generate] → Final Response   │  │
│  │                                                                  │  │
│  │  Agents: Biomarker Analyzer │ Disease Explainer │ Linker         │  │
│  │          Clinical Guidelines │ Confidence │ Synthesizer          │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│                                                                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────────┐  │
│  │ Telegram Bot │  │  Gradio UI   │  │   5D Eval + SOP Evolution    │  │
│  │   (mobile)   │  │  (desktop)   │  │   (self-improvement loop)    │  │
│  └──────────────┘  └──────────────┘  └──────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────────┘
```
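One possible reading of the pipeline's control flow is sketched below in plain Python, without LangGraph: a guardrail check, then retrieval, relevance grading, and a bounded query-rewrite loop before generation. The six medical agents are left out for brevity. Every function name, the `max_rewrites` limit, the `no_evidence` fallback, and the toy corpus are assumptions made for illustration, and the passage text is a placeholder rather than clinical content.

```python
from typing import Callable, Dict, List


def run_agentic_rag(
    query: str,
    in_scope: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    is_relevant: Callable[[str, str], bool],
    rewrite: Callable[[str], str],
    generate: Callable[[str, List[str]], str],
    max_rewrites: int = 2,  # assumed limit so the rewrite loop always ends
) -> Dict[str, object]:
    """Guardrail -> retrieve -> grade -> (rewrite and retry | generate)."""
    if not in_scope(query):
        return {"status": "out_of_scope", "answer": None, "rewrites": 0}

    current = query
    for rewrites in range(max_rewrites + 1):
        # Grade each retrieved passage against the ORIGINAL question.
        relevant = [doc for doc in retrieve(current) if is_relevant(query, doc)]
        if relevant:
            answer = generate(query, relevant)
            return {"status": "answered", "answer": answer, "rewrites": rewrites}
        if rewrites < max_rewrites:
            current = rewrite(current)

    # Nothing relevant was found: decline rather than answer without evidence.
    return {"status": "no_evidence", "answer": None, "rewrites": max_rewrites}


# Toy stand-ins for the real services.
CORPUS = {"hba1c diabetes": ["Placeholder passage about HbA1c."]}
steps = dict(
    in_scope=lambda q: any(t in q.lower() for t in ("hba1c", "glucose")),
    retrieve=lambda q: CORPUS.get(q, []),
    is_relevant=lambda q, doc: "hba1c" in doc.lower(),
    rewrite=lambda q: "hba1c diabetes",
    generate=lambda q, docs: f"{len(docs)} passage(s) used",
)

result = run_agentic_rag("what does high hba1c mean", **steps)
off_topic = run_agentic_rag("best pizza nearby", **steps)
print(result)
print(off_topic)
```

Injecting each step as a callable keeps the control flow testable on its own, so the existing agents can be wired in later without changing the loop.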
---
## Files to Create (Summary)
| New File | Source of Inspiration |
|----------|----------------------|
| `docker-compose.yml` | Course `compose.yml` (adapted) |
| `Dockerfile` | Course `Dockerfile` (multi-stage UV) |
| `Makefile` | Course `Makefile` |
| `pyproject.toml` | Course `pyproject.toml` |
| `.pre-commit-config.yaml` | Course `.pre-commit-config.yaml` |
| `.env.example` | Course `.env.example` |
| `src/main.py` | Course `src/main.py` (lifespan pattern) |
| `src/config.py` | Course `src/config.py` + existing SOP config |
| `src/dependencies.py` | Course `src/dependencies.py` |
| `src/exceptions.py` | Course `src/exceptions.py` (medical exceptions) |
| `src/database.py` | Course `src/database.py` |
| `src/db/*` | Course `src/db/*` |
| `src/models/analysis.py` | New (medical domain) |
| `src/models/document.py` | Course `src/models/paper.py` (adapted) |
| `src/repositories/*` | Course `src/repositories/*` (adapted) |
| `src/routers/ask.py` | Course `src/routers/ask.py` |
| `src/routers/search.py` | Course `src/routers/hybrid_search.py` |
| `src/routers/health.py` | Course `src/routers/ping.py` (enhanced) |
| `src/schemas/*` | Course `src/schemas/*` (medical schemas) |
| `src/services/opensearch/*` | Course `src/services/opensearch/*` |
| `src/services/embeddings/*` | Course `src/services/embeddings/*` |
| `src/services/ollama/*` | Course `src/services/ollama/*` |
| `src/services/cache/*` | Course `src/services/cache/*` |
| `src/services/langfuse/*` | Course `src/services/langfuse/*` |
| `src/services/indexing/*` | Course `src/services/indexing/*` (medical chunks) |
| `src/services/pdf_parser/*` | Course `src/services/pdf_parser/*` |
| `src/services/telegram/*` | Course `src/services/telegram/*` |
| `src/services/agents/agentic_rag.py` | Course (adapted for medical agents) |
| `src/services/agents/nodes/*` | Course (medical guardrails) |
| `src/services/agents/context.py` | Course |
| `src/services/agents/prompts.py` | Course (medical prompts) |
| `src/gradio_app.py` | Course `src/gradio_app.py` (medical UI) |
| `airflow/dags/medical_ingestion.py` | Course `airflow/dags/arxiv_paper_ingestion.py` |
## Files to Keep & Enhance
| Existing File | Action |
|---------------|--------|
| `src/agents/biomarker_analyzer.py` | Keep, move to `src/services/agents/medical/` |
| `src/agents/disease_explainer.py` | Keep, move, add OpenSearch retriever |
| `src/agents/biomarker_linker.py` | Keep, move, add OpenSearch retriever |
| `src/agents/clinical_guidelines.py` | Keep, move, add OpenSearch retriever |
| `src/agents/confidence_assessor.py` | Keep, move |
| `src/agents/response_synthesizer.py` | Keep, move |
| `src/biomarker_validator.py` | Keep, move to `src/services/biomarker/` |
| `src/biomarker_normalization.py` | Keep, move to `src/services/biomarker/` |
| `src/evaluation/` | Keep, enhance with Langfuse integration |
| `src/evolution/` | Keep, wire to production metrics |
| `config/biomarker_references.json` | Keep as seed data, migrate to DB |
| `scripts/chat.py` | Keep, update imports |
| `tests/*` | Keep, add production test fixtures |
---
*This plan transforms MediGuard AI from a working prototype into a production-grade medical RAG system, applying every infrastructure lesson from the arXiv Paper Curator course while preserving and enhancing your unique medical domain logic.*