AMR-Guard / README.md
ghitaben's picture
hardware: zero-gpu
6efd8e4
---
title: AMR-Guard
emoji: ⚕️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "5.25.0"
python_version: "3.12"
hardware: zero-gpu
pinned: true
license: apache-2.0
tags:
- healthcare
- medical
- antimicrobial-resistance
- clinical-decision-support
- streamlit
- llm
- medgemma
short_description: Multi-agent clinical support for antimicrobial stewardship
---
# AMR-Guard: Infection Lifecycle Orchestrator
A multi-agent clinical decision-support system for antimicrobial stewardship, submitted to the **[MedGemma Impact Challenge](https://www.kaggle.com/competitions/med-gemma-impact-challenge)**.
Powered by **MedGemma** (4B multimodal + 27B text) and **TxGemma** — HAI-DEF models from Google.
---
## What it does
AMR-Guard guides clinicians through two stages of infection management with a **dynamic, site-aware patient form** that adapts its fields based on the selected infection site.
**Stage 1 — Empirical** (no lab results yet)
Patient history → risk factor analysis → empirical antibiotic recommendation
**Stage 2 — Targeted** (lab results available)
Lab report upload (PDF / image, any language) → pathogen & MIC extraction → resistance trend analysis → targeted prescription with drug interaction screening
A unique capability is **MIC creep detection**: the system flags when a pathogen's Minimum Inhibitory Concentration has risen ≥4-fold across admissions — even while the lab still reports "Susceptible" — giving clinicians a 6–18 month early warning before formal treatment failure.
---
## Key Features
| Feature | Details |
|---------|---------|
| **Dynamic form** | Fields adapt to infection site (urinary, respiratory, bloodstream, skin, intra-abdominal, CNS) |
| **Contextual suspected source** | Dropdown options change based on infection site (e.g. CAP / HAP / VAP for respiratory) |
| **Conditional creatinine** | Shown prominently for systemic infections; optional toggle for skin / intra-abdominal |
| **Lab file upload** | Upload PDF or image (PNG/JPG/TIFF) — PDF text extracted via pypdf; images sent to MedGemma vision |
| **MIC creep detection** | ≥4-fold MIC rise flagged before clinical resistance develops |
| **WHO AWaRe stewardship** | ACCESS → WATCH → RESERVE prescribing hierarchy enforced |
| **Drug interaction screening** | 191 K+ interactions from DDInter 2.0 |
| **Renal dose adjustment** | Cockcroft-Gault CrCl → 5-tier dose adjustment |
---
## Agent Pipeline
```
Patient form ──► Agent 1: Intake Historian ──► (no lab) ──────────────────────────────► Agent 4: Clinical Pharmacologist ──► Prescription
│ ▲
└──► (lab uploaded) ──► Agent 2: Vision Specialist ──► Agent 3: Trend Analyst ──┘
```
| # | Agent | Model | Role |
|---|-------|-------|------|
| 1 | Intake Historian | MedGemma 4B IT | Parse EHR notes, calculate CrCl (Cockcroft-Gault), identify MDR risk factors |
| 2 | Vision Specialist | MedGemma 4B IT (multimodal) | Extract pathogen names + MIC values from lab images / PDFs in **any language** |
| 3 | Trend Analyst | MedGemma 27B Text IT | Detect MIC creep, compute resistance velocity against EUCAST v16.0 breakpoints |
| 4 | Clinical Pharmacologist | MedGemma 4B IT + TxGemma 9B | Select antibiotic + dose, apply WHO AWaRe stewardship, screen drug interactions |
**Orchestration:** LangGraph state machine with conditional routing
**Knowledge base:** SQLite (EUCAST breakpoints, WHO AWaRe, ATLAS surveillance, DDInter interactions) + ChromaDB (IDSA guidelines, WHO GLASS — semantic RAG)
---
## Hugging Face Spaces Deployment
> **Recommended deployment target.** Provides a persistent URL, native Streamlit support, GPU access, and multi-user access out of the box.
### Requirements
- A HF Space with **GPU hardware** (T4 for MedGemma 4B; A10G or better for MedGemma 27B)
- HF access granted to [MedGemma](https://huggingface.co/google/medgemma-4b-it) and [TxGemma](https://huggingface.co/google/txgemma-2b-predict)
### Steps
**1. Create a new Space**
Go to [huggingface.co/new-space](https://huggingface.co/new-space) and select:
- SDK: **Streamlit**
- Hardware: **T4 (GPU)** (free tier, limited quota) or **A10G**
**2. Push this repository**
```bash
git remote set-url space https://huggingface.co/spaces/<your-username>/amr-guard 2>/dev/null || git remote add space https://huggingface.co/spaces/<your-username>/amr-guard
git push space master:main
```
**3. Add Space Secrets**
In your Space → Settings → Variables and Secrets, add:
| Secret name | Value | Notes |
|-------------|-------|-------|
| `MEDIC_LOCAL_MEDGEMMA_4B_MODEL` | `google/medgemma-4b-it` | Required |
| `MEDIC_LOCAL_MEDGEMMA_27B_MODEL` | `google/medgemma-4b-it` | Use 4B fallback on T4 |
| `MEDIC_LOCAL_TXGEMMA_9B_MODEL` | `google/txgemma-2b-predict` | Required |
| `MEDIC_LOCAL_TXGEMMA_2B_MODEL` | `google/txgemma-2b-predict` | Required |
| `MEDIC_QUANTIZATION` | `4bit` | Required |
| `MEDIC_ENV` | `production` | Required |
| `HF_TOKEN` | Your HF access token | Required (gated models) |
**4. First boot — knowledge base initialisation**
`app.py` detects the HF Spaces environment (`SPACE_ID` env var) and automatically runs `setup_demo.py` on first boot to build the SQLite + ChromaDB knowledge base. This takes ~2–5 minutes once and requires no manual steps.
> Enable **Persistent Storage** (Space Settings → Persistent Storage) so the knowledge base survives restarts. Without it, setup runs on every cold boot (~2 min overhead).
**5. Drug interaction dataset (optional)**
To enable full drug interaction screening, place `db_drug_interactions.csv` in `docs/drug_safety/` before pushing, or after deployment open the Space terminal and run:
```bash
kaggle datasets download -d mghobashy/drug-drug-interactions --unzip -p docs/drug_safety/
python setup_demo.py
```
---
## Local Setup
### Requirements
- Python 3.11+
- [`uv`](https://docs.astral.sh/uv/) for dependency management
- HuggingFace account with access to MedGemma and TxGemma
### 1. Install dependencies
```bash
uv sync
```
### 2. Configure environment
```bash
cp .env.example .env
```
Minimum required settings in `.env`:
```bash
MEDIC_LOCAL_MEDGEMMA_4B_MODEL=google/medgemma-4b-it
MEDIC_LOCAL_MEDGEMMA_27B_MODEL=google/medgemma-4b-it # 4B fallback if < 24 GB VRAM
MEDIC_LOCAL_TXGEMMA_9B_MODEL=google/txgemma-2b-predict
MEDIC_LOCAL_TXGEMMA_2B_MODEL=google/txgemma-2b-predict
MEDIC_QUANTIZATION=4bit
```
### 3. Authenticate with HuggingFace
```bash
uv run huggingface-cli login
```
### 4. Build the knowledge base
```bash
uv run python setup_demo.py
```
Ingests EUCAST breakpoints, WHO AWaRe classification, IDSA guidelines, ATLAS surveillance data, and DDInter drug interactions into SQLite + ChromaDB. Source files are in `docs/` — generated database is written to `data/` (gitignored).
### 5. Run the app
```bash
uv run streamlit run app.py
```
Open `http://localhost:8501` in your browser.
---
## Kaggle Reproduction
The full pipeline can also be reproduced on a free Kaggle T4 GPU (16 GB VRAM):
1. Open [`notebooks/kaggle_medic_demo.ipynb`](notebooks/kaggle_medic_demo.ipynb) in Kaggle
2. Add the `mghobashy/drug-drug-interactions` dataset to the notebook
3. Add your HuggingFace token as a Kaggle secret named `HF_TOKEN`
4. Run all cells — the notebook clones this repo, installs dependencies, builds the knowledge base, and launches the app via a public tunnel
Models run with 4-bit quantization on T4 (MedGemma 4B + TxGemma 2B).
---
## Dynamic Form — Field Reference
The Patient Analysis form adapts based on the selected **Primary infection site**.
| Infection site | Site-specific fields | Creatinine |
|---|---|---|
| **Urinary** | Catheter status, urinary symptoms, urine appearance | Always shown |
| **Respiratory** | O₂ saturation, ventilation status, cough type, sputum character | Always shown |
| **Bloodstream** | Central line, temperature, heart rate, respiratory rate, WBC, lactate, shock status | Always shown |
| **Skin** | Wound type, cellulitis extent, abscess, foreign body | Optional (renal flag) |
| **Intra-abdominal** | Pain location, peritonitis signs, perforation suspected, ascites | Optional (renal flag) |
| **CNS** | CSF obtained, neurological symptoms, recent neurosurgery, GCS score | Always shown |
| **Other** | No site-specific fields | Optional (renal flag) |
The **Suspected source** dropdown adapts contextually (e.g., respiratory → CAP / HAP / VAP / Aspiration / ...).
**Lab / Culture Results** accepts three input modes:
- **None** — empirical pathway only
- **Upload file** — PDF (text extracted via pypdf) or image (PNG/JPG/TIFF sent to MedGemma vision)
- **Paste text** — manual copy-paste from a lab system
---
## Knowledge Base Sources
All data is open-access — no registration required except where noted.
| Source | Contents | Used for |
|--------|----------|---------|
| [EUCAST v16.0](https://www.eucast.org/bacteria/clinical-breakpoints-and-interpretation/) | Clinical breakpoint tables | MIC interpretation, creep detection |
| [WHO AWaRe 2024](https://aware.essentialmeds.org) | Access / Watch / Reserve classification | Antibiotic stewardship |
| [IDSA AMR Guidance 2024](https://www.idsociety.org/practice-guideline/amr-guidance/) | Treatment guidelines PDF | Empirical therapy RAG |
| [Pfizer ATLAS](https://atlas-surveillance.com) *(free registration)* | 6.5M MIC surveillance measurements | Resistance patterns RAG |
| [WHO GLASS](https://worldhealthorg.shinyapps.io/glass-dashboard/) | 23M+ AMR episodes, 141 countries | Global resistance context |
| [DDInter 2.0](https://ddinter2.scbdd.com) | 191,000+ drug-drug interactions | Interaction screening |
| [OpenFDA](https://api.fda.gov/drug/label.json) | Drug labeling / contraindications | Safety RAG |
---
## Project Structure
```
amr-guard/
├── app.py # Streamlit UI — all four pages
├── setup_demo.py # One-command knowledge base setup
├── requirements.txt # pip requirements (HF Spaces / CI)
├── packages.txt # apt system packages (HF Spaces)
├── pyproject.toml # Full dependency spec (managed by uv)
├── .env.example # Environment variable template
├── src/
│ ├── agents.py # Four agent implementations
│ ├── form_config.py # Dynamic form field definitions per infection site
│ ├── graph.py # LangGraph orchestrator + conditional routing
│ ├── loader.py # Model loading: multimodal + causal LM + vision inference
│ ├── prompts.py # System and user prompts for all agents
│ ├── rag.py # ChromaDB ingestion and retrieval helpers
│ ├── state.py # InfectionState TypedDict schema
│ ├── utils.py # CrCl calculator, MIC creep detection
│ ├── config.py # Pydantic settings (reads from .env / Space Secrets)
│ ├── tools/
│ │ ├── antibiotic_tools.py # WHO AWaRe lookups, MIC interpretation
│ │ ├── resistance_tools.py # Pathogen resistance pattern queries
│ │ ├── safety_tools.py # Drug interaction screening
│ │ └── rag_tools.py # Guideline retrieval wrappers
│ └── db/
│ ├── schema.sql # SQLite table definitions
│ ├── database.py # Connection and query helpers
│ ├── import_data.py # ETL: Excel/CSV/PDF → SQLite
│ └── vector_store.py # ChromaDB ingestion
├── docs/ # Source data files (committed — used by setup_demo.py)
│ ├── antibiotic_guidelines/ # WHO AWaRe Excel exports, IDSA PDF
│ ├── mic_breakpoints/ # EUCAST v16.0 breakpoint tables
│ ├── pathogen_resistance/ # ATLAS susceptibility data
│ └── drug_safety/ # DDInter drug interaction CSV
├── notebooks/
│ └── kaggle_medic_demo.ipynb # Full reproducible Kaggle notebook
└── tests/
└── test_pipeline.py # Agent and pipeline unit tests
```
---
> **Research demo only.** Not validated for clinical use. All recommendations must be reviewed by a licensed clinician before any patient-care decision.