Spaces:

bshepp
/

cds-agent

Running

bshepp commited on Feb 14

Commit

c28dd56

1 Parent(s): e684a6c

Update all documentation for conflict detection feature

- README.md: 5→6 step pipeline, updated architecture diagram, project
structure (new conflict_detection.py), usage instructions
- docs/architecture.md: New Step 5 section, updated diagram, data models
table (3 new models), component descriptions, agentic comparison
- DEVELOPMENT_LOG.md: Added Phase 8 documenting the conflict detection
design decision (why confidence scores were dropped) and full
implementation details
- docs/writeup_draft.md: Updated pipeline description, architecture
diagram, performance table, practical usage section
- docs/test_results.md: Updated E2E test to reflect 6-step pipeline

Files changed (5) hide show

DEVELOPMENT_LOG.md +48 -0
README.md +28 -26
docs/architecture.md +44 -24
docs/test_results.md +4 -3
docs/writeup_draft.md +10 -7

DEVELOPMENT_LOG.md CHANGED Viewed

@@ -173,6 +173,54 @@ Rewrote/created all documentation:
 ---
 ## Dependency Inventory
 ### Python Backend (`requirements.txt`)

 ---
+## Phase 8: Conflict Detection Feature
+### Design Decision: Drop Confidence Scores, Add Conflict Detection
+During review, identified that the system's "confidence" was just the LLM picking a label (LOW/MODERATE/HIGH) — not a calibrated score. Composite numeric confidence scores were considered and **rejected** because:
+- Uncalibrated confidence values are dangerous (clinician anchoring bias)
+- No training data exists to calibrate outputs
+- A single number hides more than it reveals
+**Instead, added Conflict Detection** — a new pipeline step that compares guideline recommendations against the patient's actual data to identify specific, actionable gaps. This provides direct patient safety value without requiring calibration.
+### Implementation
+**New models added to `schemas.py`:**
+- `ConflictType` enum — 6 categories: omission, contradiction, dosage, monitoring, allergy_risk, interaction_gap
+- `ClinicalConflict` model — Each conflict has: type, severity, guideline_source, guideline_text, patient_data, description, suggested_resolution
+- `ConflictDetectionResult` — List of conflicts + summary + guidelines_checked count
+- `conflicts` field added to `CDSReport`
+- `conflict_detection` field added to `AgentState`
+**New tool: `conflict_detection.py`:**
+- Takes patient profile, clinical reasoning, drug interactions, and guidelines
+- Uses MedGemma at low temperature (0.1) for safety-critical analysis
+- Returns structured `ConflictDetectionResult` with specific, actionable conflicts
+- Graceful degradation: returns empty if no guidelines available
+**Pipeline changes (`orchestrator.py`):**
+- Pipeline expanded from 5 to 6 steps
+- New Step 5: Conflict Detection (between guideline retrieval and synthesis)
+- Synthesis (now Step 6) receives conflict data and prominently includes it in the report
+**Synthesis changes (`synthesis.py`):**
+- Accepts `conflict_detection` parameter
+- New "Conflicts & Gaps" section in synthesis prompt
+- Fallback: copies detected conflicts directly into report if LLM doesn't populate the structured field
+**Frontend changes (`CDSReport.tsx`):**
+- New "Conflicts & Gaps Detected" section with high visual prominence
+- Red border container, severity-coded left-accent cards (critical=red, high=orange, moderate=yellow, low=blue)
+- Side-by-side "Guideline says" vs "Patient data" comparison
+- Green-highlighted suggested resolutions
+- Positioned immediately after drug interactions for maximum visibility
+**Files created:** `src/backend/app/tools/conflict_detection.py` (1 new file)
+**Files modified:** `schemas.py`, `orchestrator.py`, `synthesis.py`, `CDSReport.tsx` (4 files)
+---
 ## Dependency Inventory
 ### Python Backend (`requirements.txt`)

README.md CHANGED Viewed

@@ -15,35 +15,36 @@ A clinician pastes a patient case. The system automatically:
 2. **Reasons** about the case to generate a ranked differential diagnosis with chain-of-thought transparency
 3. **Checks drug interactions** against OpenFDA and RxNorm databases
 4. **Retrieves clinical guidelines** from a 62-guideline RAG corpus spanning 14 medical specialties
-5. **Synthesizes** everything into a structured CDS report with recommendations, warnings, and citations
-All five steps stream to the frontend in real time via WebSocket — the clinician sees each step execute live.
 ---
 ## System Architecture
 ```
-┌─────────────────────────────────────────────────────────────────┐
-│                    FRONTEND (Next.js 14 + React)                │
-│  Patient Case Input  │  Agent Activity Feed  │  CDS Report View │
-└──────────────────────────┬──────────────────────────────────────┘
                            │ REST API + WebSocket
-┌──────────────────────────▼──────────────────────────────────────┐
-│                     BACKEND (FastAPI + Python 3.10)              │
-│                                                                  │
-│  ┌────────────────────────────────────────────────────────────┐  │
-│  │              ORCHESTRATOR (5-Step Pipeline)                 │  │
-│  └─────┬──────────┬──────────┬──────────┬──────────┬─────────┘  │
-│   ┌────▼───┐ ┌───▼────┐ ┌──▼───┐ ┌───▼────┐ ┌───▼─────┐      │
-│   │ Parse  │ │Reason  │ │ Drug │ │  RAG   │ │Synth-   │      │
-│   │Patient │ │(LLM)   │ │Check │ │Guide-  │ │esize    │      │
-│   │Data    │ │Differ- │ │OpenFDA│ │lines   │ │(LLM)    │      │
-│   │        │ │ential  │ │RxNorm │ │ChromaDB│ │Report   │      │
-│   └────────┘ └────────┘ └──────┘ └────────┘ └─────────┘      │
-│                                                                  │
-│  External: OpenFDA API │ RxNorm/NLM API │ ChromaDB (local)      │
-└──────────────────────────────────────────────────────────────────┘
 ```
 See [docs/architecture.md](docs/architecture.md) for the full design document.
@@ -136,9 +137,9 @@ medgemma_impact_challenge/
 │   │       ├── config.py             # Pydantic Settings (ports, models, dirs)
 │   │       ├── __init__.py
 │   │       ├── models/
-│   │       │   └── schemas.py        # All Pydantic models (~238 lines)
 │   │       ├── agent/
-│   │       │   └── orchestrator.py   # 5-step pipeline orchestrator (267 lines)
 │   │       ├── services/
 │   │       │   └── medgemma.py       # LLM service (OpenAI-compatible API)
 │   │       ├── tools/
@@ -146,7 +147,8 @@ medgemma_impact_challenge/
 │   │       │   ├── clinical_reasoning.py  # Step 2: Differential diagnosis
 │   │       │   ├── drug_interactions.py   # Step 3: OpenFDA + RxNorm
 │   │       │   ├── guideline_retrieval.py # Step 4: RAG over ChromaDB
-│   │       │   └── synthesis.py           # Step 5: CDS report generation
 │   │       ├── data/
 │   │       │   └── clinical_guidelines.json  # 62 guidelines, 14 specialties
 │   │       └── api/
@@ -240,8 +242,8 @@ python test_clinical_cases.py --report results.json  # Save results
 1. Open `http://localhost:3000`
 2. Paste a patient case description (or click a sample case)
 3. Click **"Analyze Patient Case"**
-4. Watch the 5-step agent pipeline execute in real time
-5. Review the CDS report: differential diagnosis, drug warnings, guideline recommendations, next steps
 ---

 2. **Reasons** about the case to generate a ranked differential diagnosis with chain-of-thought transparency
 3. **Checks drug interactions** against OpenFDA and RxNorm databases
 4. **Retrieves clinical guidelines** from a 62-guideline RAG corpus spanning 14 medical specialties
+5. **Detects conflicts** between guideline recommendations and the patient's actual data — surfacing omissions, contradictions, dosage concerns, and monitoring gaps
+6. **Synthesizes** everything into a structured CDS report with recommendations, warnings, conflicts, and citations
+All six steps stream to the frontend in real time via WebSocket — the clinician sees each step execute live.
 ---
 ## System Architecture
 ```
+┌─────────────────────────────────────────────────────────────────────┐
+│                    FRONTEND (Next.js 14 + React)                    │
+│  Patient Case Input  │  Agent Activity Feed  │  CDS Report View    │
+└──────────────────────────┬──────────────────────────────────────────┘
                            │ REST API + WebSocket
+┌──────────────────────────▼──────────────────────────────────────────┐
+│                     BACKEND (FastAPI + Python 3.10)                  │
+│                                                                      │
+│  ┌────────────────────────────────────────────────────────────────┐  │
+│  │                ORCHESTRATOR (6-Step Pipeline)                  │  │
+│  └──┬──────────┬──────────┬──────────┬──────────┬──────────┬─────┘  │
+│  ┌──▼───┐ ┌───▼────┐ ┌──▼───┐ ┌───▼────┐ ┌───▼─────┐ ┌���─▼────┐  │
+│  │Parse │ │Reason  │ │ Drug │ │  RAG   │ │Conflict │ │Synth- │  │
+│  │Pati- │ │(LLM)   │ │Check │ │Guide-  │ │Detect-  │ │esize  │  │
+│  │ent   │ │Differ- │ │OpenFDA│ │lines   │ │ion      │ │(LLM)  │  │
+│  │Data  │ │ential  │ │RxNorm │ │ChromaDB│ │(LLM)    │ │Report │  │
+│  └──────┘ └────────┘ └──────┘ └────────┘ └─────────┘ └───────┘  │
+│                                                                      │
+│  External: OpenFDA API │ RxNorm/NLM API │ ChromaDB (local)          │
+└──────────────────────────────────────────────────────────────────────┘
 ```
 See [docs/architecture.md](docs/architecture.md) for the full design document.
 │   │       ├── config.py             # Pydantic Settings (ports, models, dirs)
 │   │       ├── __init__.py
 │   │       ├── models/
+│   │       │   └── schemas.py        # All Pydantic models (~280 lines)
 │   │       ├── agent/
+│   │       │   └── orchestrator.py   # 6-step pipeline orchestrator (~300 lines)
 │   │       ├── services/
 │   │       │   └── medgemma.py       # LLM service (OpenAI-compatible API)
 │   │       ├── tools/
 │   │       │   ├── clinical_reasoning.py  # Step 2: Differential diagnosis
 │   │       │   ├── drug_interactions.py   # Step 3: OpenFDA + RxNorm
 │   │       │   ├── guideline_retrieval.py # Step 4: RAG over ChromaDB
+│   │       │   ├── conflict_detection.py  # Step 5: Guideline vs patient conflicts
+│   │       │   └── synthesis.py           # Step 6: CDS report generation
 │   │       ├── data/
 │   │       │   └── clinical_guidelines.json  # 62 guidelines, 14 specialties
 │   │       └── api/
 1. Open `http://localhost:3000`
 2. Paste a patient case description (or click a sample case)
 3. Click **"Analyze Patient Case"**
+4. Watch the 6-step agent pipeline execute in real time
+5. Review the CDS report: differential diagnosis, drug warnings, **conflicts & gaps**, guideline recommendations, next steps
 ---

docs/architecture.md CHANGED Viewed

@@ -29,19 +29,19 @@ structured clinical decision support report — all in seconds.
 │                  BACKEND (FastAPI + Python 3.10)                 │
 │  Port 8000 (default) / 8002 (dev)                               │
 │                                                                  │
-│  ┌────────────────────────────────────────────────────────────┐  │
-│  │            ORCHESTRATOR (orchestrator.py, 267 lines)       │  │
-│  │  Sequential 5-step pipeline with structured state passing  │  │
-│  └─────┬──────────┬──────────┬──────────┬──────────┬─────────┘  │
-│        │          │          │          │          │              │
-│   ┌────▼───┐ ┌───▼────┐ ┌──▼───┐ ┌───▼────┐ ┌───▼─────┐      │
-│   │Step 1  │ │Step 2  │ │Step 3 │ │Step 4  │ │Step 5   │      │
-│   │Patient │ │Clinical│ │Drug   │ │Guide-  │ │Synthe-  │      │
-│   │Parser  │ │Reason- │ │Inter- │ │line    │ │sis      │      │
-│   │        │ │ing     │ │action │ │Retriev-│ │Agent    │      │
-│   │(LLM)   │ │(LLM)   │ │(APIs) │ │al(RAG) │ │(LLM)    │      │
-│   └────────┘ └────────┘ └──┬───┘ └──┬─────┘ └─────────┘      │
-│                             │        │                           │
 │                        ┌────▼────┐ ┌─▼──────────────┐           │
 │                        │OpenFDA  │ │ChromaDB         │           │
 │                        │RxNorm   │ │62 guidelines    │           │
@@ -100,20 +100,37 @@ LLM: gemma-3-27b-it via Google AI Studio
   - **Fallback:** If `clinical_guidelines.json` is missing, falls back to 2 minimal embedded guidelines
 - **Timing:** ~9.6 s (observed)
-### Step 5: Synthesis Agent (`synthesis.py`)
-- **Input:** All outputs from Steps 1–4
 - **Output:** `CDSReport` (comprehensive structured report)
 - **Report sections:**
   - Patient summary
   - Differential diagnosis with reasoning chains
   - Drug interaction warnings with severity
   - Guideline-concordant recommendations with citations
   - Suggested next steps (immediate, short-term, long-term)
-  - Confidence levels and caveats
 - **Timing:** ~25.3 s (observed)
-**Total pipeline time:** ~75 s for a complex case (all 5 steps sequential).
 ---
@@ -148,7 +165,7 @@ This preserves the intended behavior while staying compatible with Gemma's API c
 ## Data Models (Pydantic v2)
-All pipeline data is strongly typed via Pydantic models in `schemas.py` (~238 lines):
 | Model | Purpose |
 |-------|---------|
@@ -160,7 +177,10 @@ All pipeline data is strongly typed via Pydantic models in `schemas.py` (~238 li
 | `DrugInteractionResult` | Step 3 output: all interaction data |
 | `GuidelineExcerpt` | Individual guideline citation |
 | `GuidelineRetrievalResult` | Step 4 output: relevant guidelines |
-| `CDSReport` | Step 5 output: full synthesized report |
 | `AgentStep` | WebSocket message: step name, status, data, timing |
 ---
@@ -178,8 +198,8 @@ All pipeline data is strongly typed via Pydantic models in `schemas.py` (~238 li
 | Component | Role |
 |-----------|------|
 | `PatientInput.tsx` | Text area for patient case + 3 pre-loaded sample cases (chest pain, DKA, pediatric fever) |
-| `AgentPipeline.tsx` | Visualizes the 5-step pipeline in real time — shows status (pending / running / complete / error) for each step as WebSocket messages arrive |
-| `CDSReport.tsx` | Renders the final CDS report: patient summary, differentials, drug warnings, guidelines, next steps |
 ### Communication
@@ -215,8 +235,8 @@ All pipeline data is strongly typed via Pydantic models in `schemas.py` (~238 li
 | Characteristic | Chatbot | This Agent System |
 |----------------|---------|-------------------|
-| Tool use | None | 4+ specialized tools (parser, drug API, RAG, synthesis) |
-| Planning | None | Orchestrator executes a defined 5-step plan |
 | State management | Stateless | Patient context flows through all steps |
 | Error handling | Generic | Tool-specific fallbacks, graceful degradation |
 | Output structure | Free text | Pydantic-validated, structured, cited |

 │                  BACKEND (FastAPI + Python 3.10)                 │
 │  Port 8000 (default) / 8002 (dev)                               │
 │                                                                  │
+│  ┌────────────────────────────────────────────────────────────────────┐  │
+│  │            ORCHESTRATOR (orchestrator.py, ~300 lines)              │  │
+│  │  Sequential 6-step pipeline with structured state passing         │  │
+│  └──┬──────────┬──────────┬──────────┬──────────┬──────────┬────────┘  │
+│     │          │          │          │          │          │            │
+│  ┌──▼───┐ ┌───▼────┐ ┌──▼───┐ ┌───▼────┐ ┌───▼─────┐ ┌──▼──────┐   │
+│  │Step 1│ │Step 2  │ │Step 3│ │Step 4  │ │Step 5   │ │Step 6   │   │
+│  │Pati- │ │Clini-  │ │Drug  │ │Guide-  │ │Conflict │ │Synthe-  │   │
+│  │ent   │ │cal     │ │Inter-│ │line    │ │Detect-  │ │sis      │   │
+│  │Parser│ │Reason- │ │action│ │Retriev-│ │ion      │ │Agent    │   │
+│  │(LLM) │ │ing     │ │(APIs)│ │al(RAG) │ │(LLM)    │ │(LLM)    │   │
+│  └──────┘ │(LLM)   │ └──┬──┘ └──┬─────┘ └─────────┘ └─────────┘   │
+│           └────────┘    │       │                                    │
 │                        ┌────▼────┐ ┌─▼──────────────┐           │
 │                        │OpenFDA  │ │ChromaDB         │           │
 │                        │RxNorm   │ │62 guidelines    │           │
   - **Fallback:** If `clinical_guidelines.json` is missing, falls back to 2 minimal embedded guidelines
 - **Timing:** ~9.6 s (observed)
+### Step 5: Conflict Detection (`conflict_detection.py`)
+- **Input:** Patient profile, clinical reasoning, drug interactions, and retrieved guidelines from Steps 1–4
+- **Output:** `ConflictDetectionResult` with specific `ClinicalConflict` items
+- **Method:** LLM-based comparison of guideline recommendations against the patient's actual data
+- **Conflict types detected:**
+  - **Omission** — Guideline recommends something the patient is not receiving
+  - **Contradiction** — Patient's current treatment conflicts with guideline advice
+  - **Dosage** — Guideline specifies dose adjustments that apply to this patient (age, renal function, etc.)
+  - **Monitoring** — Guideline requires monitoring that is not documented as ordered
+  - **Allergy Risk** — Guideline-recommended treatment involves a medication the patient is allergic to
+  - **Interaction Gap** — Known drug interaction is not addressed in the care plan
+- **Each conflict includes:** severity (critical/high/moderate/low), guideline source, guideline text, patient data, description, and suggested resolution
+- **Temperature:** 0.1 (low, for safety-critical analysis)
+- **Graceful degradation:** Returns empty result if no guidelines were retrieved (Step 4 skipped/failed)
+### Step 6: Synthesis Agent (`synthesis.py`)
+- **Input:** All outputs from Steps 1–4 plus conflict detection results
 - **Output:** `CDSReport` (comprehensive structured report)
 - **Report sections:**
   - Patient summary
   - Differential diagnosis with reasoning chains
   - Drug interaction warnings with severity
+  - **Conflicts & gaps** — prominently featured with guideline vs patient data comparison
   - Guideline-concordant recommendations with citations
   - Suggested next steps (immediate, short-term, long-term)
+  - Caveats and limitations
 - **Timing:** ~25.3 s (observed)
+**Total pipeline time:** ~75–85 s for a complex case (6 steps, with Steps 3–4 parallel).
 ---
 ## Data Models (Pydantic v2)
+All pipeline data is strongly typed via Pydantic models in `schemas.py` (~280 lines):
 | Model | Purpose |
 |-------|---------|
 | `DrugInteractionResult` | Step 3 output: all interaction data |
 | `GuidelineExcerpt` | Individual guideline citation |
 | `GuidelineRetrievalResult` | Step 4 output: relevant guidelines |
+| `ConflictType` | Enum: omission, contradiction, dosage, monitoring, allergy_risk, interaction_gap |
+| `ClinicalConflict` | Individual conflict: guideline_text vs patient_data + suggested resolution |
+| `ConflictDetectionResult` | Step 5 output: all detected conflicts |
+| `CDSReport` | Step 6 output: full synthesized report (now includes conflicts) |
 | `AgentStep` | WebSocket message: step name, status, data, timing |
 ---
 | Component | Role |
 |-----------|------|
 | `PatientInput.tsx` | Text area for patient case + 3 pre-loaded sample cases (chest pain, DKA, pediatric fever) |
+| `AgentPipeline.tsx` | Visualizes the 6-step pipeline in real time — shows status (pending / running / complete / error) for each step as WebSocket messages arrive |
+| `CDSReport.tsx` | Renders the final CDS report: patient summary, differentials, drug warnings, **conflicts & gaps** (prominently styled), guidelines, next steps |
 ### Communication
 | Characteristic | Chatbot | This Agent System |
 |----------------|---------|-------------------|
+| Tool use | None | 5+ specialized tools (parser, drug API, RAG, conflict detection, synthesis) |
+| Planning | None | Orchestrator executes a defined 6-step plan |
 | State management | Stateless | Patient context flows through all steps |
 | Error handling | Generic | Tool-specific fallbacks, graceful degradation |
 | Output structure | Free text | Pydantic-validated, structured, cited |

docs/test_results.md CHANGED Viewed

@@ -60,7 +60,7 @@ python test_rag_quality.py --rebuild --verbose
 ## 2. End-to-End Pipeline Test
 **Test file:** `src/backend/test_e2e.py`
-**What it tests:** Full 5-step agent pipeline from free-text input to synthesized CDS report.
 **Test case:** 62-year-old male with crushing substernal chest pain, diaphoresis, nausea, HTN history, on lisinopril + metformin + atorvastatin.
 ### Pipeline Step Results
@@ -71,7 +71,8 @@ python test_rag_quality.py --rebuild --verbose
 | 2. Clinical Reasoning | PASSED | 21.2 s | Top differential: Acute Coronary Syndrome (ACS). Also considered: GERD, PE, aortic dissection |
 | 3. Drug Interaction Check | PASSED | 11.3 s | Queried OpenFDA + RxNorm for lisinopril, metformin, atorvastatin interactions |
 | 4. Guideline Retrieval | PASSED | 9.6 s | Retrieved ACC/AHA chest pain / ACS guidelines from RAG corpus |
-| 5. Synthesis | PASSED | 25.3 s | Generated comprehensive CDS report with differential, warnings, guideline recommendations |
 **Total pipeline time:** 75.2 s
@@ -185,7 +186,7 @@ python test_clinical_cases.py --quiet
 | File | Lines | Purpose |
 |------|-------|---------|
-| `test_e2e.py` | 57 | Submit chest pain case, poll for completion, validate all 5 steps |
 | `test_clinical_cases.py` | ~400 | 22 clinical cases with keyword validation, CLI flags for filtering |
 | `test_rag_quality.py` | ~350 | 30 RAG retrieval queries with expected guideline IDs, relevance scoring |
 | `test_poll.py` | ~30 | Utility: poll a case ID until completion |

 ## 2. End-to-End Pipeline Test
 **Test file:** `src/backend/test_e2e.py`
+**What it tests:** Full 6-step agent pipeline from free-text input to synthesized CDS report.
 **Test case:** 62-year-old male with crushing substernal chest pain, diaphoresis, nausea, HTN history, on lisinopril + metformin + atorvastatin.
 ### Pipeline Step Results
 | 2. Clinical Reasoning | PASSED | 21.2 s | Top differential: Acute Coronary Syndrome (ACS). Also considered: GERD, PE, aortic dissection |
 | 3. Drug Interaction Check | PASSED | 11.3 s | Queried OpenFDA + RxNorm for lisinopril, metformin, atorvastatin interactions |
 | 4. Guideline Retrieval | PASSED | 9.6 s | Retrieved ACC/AHA chest pain / ACS guidelines from RAG corpus |
+| 5. Conflict Detection | PASSED | — | Compares guidelines against patient data for omissions, contradictions, dosage, monitoring gaps |
+| 6. Synthesis | PASSED | 25.3 s | Generated comprehensive CDS report with differential, warnings, conflicts, guideline recommendations |
 **Total pipeline time:** 75.2 s
 | File | Lines | Purpose |
 |------|-------|---------|
+| `test_e2e.py` | 57 | Submit chest pain case, poll for completion, validate all 6 steps |
 | `test_clinical_cases.py` | ~400 | 22 clinical cases with keyword validation, CLI flags for filtering |
 | `test_rag_quality.py` | ~350 | 30 RAG retrieval queries with expected guideline IDs, relevance scoring |
 | `test_poll.py` | ~30 | Utility: poll a case ID until completion |

docs/writeup_draft.md CHANGED Viewed

@@ -65,15 +65,16 @@ Gemma 3 27B IT provides the right balance of capability and accessibility for a
 **How the model is used:**
-The model serves as the reasoning engine in a 5-step agentic pipeline:
 1. **Patient Data Parsing** (LLM) — Extracts structured patient data from free-text clinical narratives
 2. **Clinical Reasoning** (LLM) — Generates ranked differential diagnoses with chain-of-thought reasoning
 3. **Drug Interaction Check** (External APIs) — Queries OpenFDA and RxNorm for medication safety
 4. **Guideline Retrieval** (RAG) — Retrieves relevant clinical guidelines from a 62-guideline corpus using ChromaDB
-5. **Synthesis** (LLM) — Integrates all outputs into a comprehensive CDS report
-The model is used in Steps 1, 2, and 5 — parsing, reasoning, and synthesis. This demonstrates the model used "to its fullest potential" across multiple distinct clinical tasks within a single workflow.
 ### Technical details
@@ -82,12 +83,13 @@ The model is used in Steps 1, 2, and 5 — parsing, reasoning, and synthesis. Th
 ```
 Frontend (Next.js 14)  ←→  Backend (FastAPI + Python 3.10)
                               │
-                    Orchestrator (5-step pipeline)
                     ├── Step 1: Patient Parser (LLM)
                     ├── Step 2: Clinical Reasoning (LLM)
                     ├── Step 3: Drug Check (OpenFDA + RxNorm APIs)
                     ├── Step 4: Guideline Retrieval (ChromaDB RAG)
-                    └── Step 5: Synthesis (LLM)
 ```
 All inter-step data is strongly typed with Pydantic v2 models. The pipeline streams each step's progress to the frontend via WebSocket for real-time visibility.
@@ -100,7 +102,7 @@ No fine-tuning was performed in the current version. The base `gemma-3-27b-it` m
 | Test | Result |
 |------|--------|
-| E2E pipeline (chest pain / ACS) | All 5 steps passed, 75 s total |
 | RAG retrieval quality | 30/30 queries passed (100%), avg relevance 0.639 |
 | Clinical test suite | 22 scenarios across 14 specialties |
 | Top-1 RAG accuracy | 100% — correct guideline ranked #1 for all queries |
@@ -127,10 +129,11 @@ No fine-tuning was performed in the current version. The base `gemma-3-27b-it` m
 In a real clinical setting, the system would be used at the point of care:
 1. Clinician opens the CDS Agent interface (embedded in the EHR or as a standalone app)
 2. Patient data is automatically pulled from the EHR (or pasted manually)
-3. The agent pipeline runs in ~60-90 seconds, during which the clinician can continue other tasks
 4. The CDS report appears with:
    - Ranked differential diagnoses with reasoning chains (transparent AI)
    - Drug interaction warnings with severity levels
    - Relevant clinical guideline excerpts with citations to authoritative sources
    - Suggested next steps (immediate, short-term, long-term)
 5. The clinician reviews the recommendations and incorporates them into their clinical judgment

 **How the model is used:**
+The model serves as the reasoning engine in a 6-step agentic pipeline:
 1. **Patient Data Parsing** (LLM) — Extracts structured patient data from free-text clinical narratives
 2. **Clinical Reasoning** (LLM) — Generates ranked differential diagnoses with chain-of-thought reasoning
 3. **Drug Interaction Check** (External APIs) — Queries OpenFDA and RxNorm for medication safety
 4. **Guideline Retrieval** (RAG) — Retrieves relevant clinical guidelines from a 62-guideline corpus using ChromaDB
+5. **Conflict Detection** (LLM) — Compares guideline recommendations against patient data to identify omissions, contradictions, dosage concerns, monitoring gaps, allergy risks, and interaction gaps
+6. **Synthesis** (LLM) — Integrates all outputs into a comprehensive CDS report with conflicts prominently featured
+The model is used in Steps 1, 2, 5, and 6 — parsing, reasoning, conflict detection, and synthesis. This demonstrates the model used "to its fullest potential" across multiple distinct clinical tasks within a single workflow.
 ### Technical details
 ```
 Frontend (Next.js 14)  ←→  Backend (FastAPI + Python 3.10)
                               │
+                    Orchestrator (6-step pipeline)
                     ├── Step 1: Patient Parser (LLM)
                     ├── Step 2: Clinical Reasoning (LLM)
                     ├── Step 3: Drug Check (OpenFDA + RxNorm APIs)
                     ├── Step 4: Guideline Retrieval (ChromaDB RAG)
+                    ├── Step 5: Conflict Detection (LLM)
+                    └── Step 6: Synthesis (LLM)
 ```
 All inter-step data is strongly typed with Pydantic v2 models. The pipeline streams each step's progress to the frontend via WebSocket for real-time visibility.
 | Test | Result |
 |------|--------|
+| E2E pipeline (chest pain / ACS) | All 6 steps passed, ~75–85 s total |
 | RAG retrieval quality | 30/30 queries passed (100%), avg relevance 0.639 |
 | Clinical test suite | 22 scenarios across 14 specialties |
 | Top-1 RAG accuracy | 100% — correct guideline ranked #1 for all queries |
 In a real clinical setting, the system would be used at the point of care:
 1. Clinician opens the CDS Agent interface (embedded in the EHR or as a standalone app)
 2. Patient data is automatically pulled from the EHR (or pasted manually)
+3. The agent pipeline runs in ~60–90 seconds, during which the clinician can continue other tasks
 4. The CDS report appears with:
    - Ranked differential diagnoses with reasoning chains (transparent AI)
    - Drug interaction warnings with severity levels
+   - **Conflicts & gaps** between guideline recommendations and the patient's actual data — prominently displayed with specific guideline citations, patient data comparisons, and suggested resolutions
    - Relevant clinical guideline excerpts with citations to authoritative sources
    - Suggested next steps (immediate, short-term, long-term)
 5. The clinician reviews the recommendations and incorporates them into their clinical judgment