Spaces: Running
Nikhil Pravin Pise committed on
Commit · 9699bea
Parent(s): 1e732dd
Deploy to HuggingFace Spaces - Medical RAG with vector store
Browse files
- .env.example +2 -2
- .gitattributes +2 -0
- .gitignore +5 -2
- DEPLOY_HUGGINGFACE.md +203 -0
- Dockerfile +39 -39
- README.md +19 -0
- alembic.ini +149 -0
- alembic/README +1 -0
- alembic/env.py +95 -0
- alembic/script.py.mako +28 -0
- data/vector_stores/medical_knowledge.faiss +3 -0
- data/vector_stores/medical_knowledge.pkl +3 -0
- docker-compose.yml +17 -15
- huggingface/Dockerfile +66 -0
- huggingface/README.md +111 -0
- huggingface/app.py +532 -0
- huggingface/requirements.txt +38 -0
- scripts/deploy_huggingface.ps1 +139 -0
.env.example
CHANGED
@@ -32,9 +32,9 @@ OLLAMA__MODEL=llama3.2
 
 # --- LLM (Groq / Gemini — existing providers) ---
 LLM__PRIMARY_PROVIDER=groq
-LLM__GROQ_API_KEY=
+LLM__GROQ_API_KEY=
 LLM__GROQ_MODEL=llama-3.3-70b-versatile
-LLM__GEMINI_API_KEY=
+LLM__GEMINI_API_KEY=
 LLM__GEMINI_MODEL=gemini-2.0-flash
 
 # --- Embeddings ---
.gitattributes
ADDED
*.faiss filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
.gitignore
CHANGED
@@ -221,10 +221,13 @@ $RECYCLE.BIN/
 # Project Specific
 # ==============================================================================
 # Vector stores (large files, regenerate locally)
+# BUT allow medical_knowledge for HuggingFace deployment
 data/vector_stores/*.faiss
 data/vector_stores/*.pkl
-
-
+!data/vector_stores/medical_knowledge.faiss
+!data/vector_stores/medical_knowledge.pkl
+# *.faiss # Commented out to allow medical_knowledge
+# *.pkl # Commented out to allow medical_knowledge
 
 # Medical PDFs (proprietary/large)
 data/medical_pdfs/*.pdf
DEPLOY_HUGGINGFACE.md
ADDED
# 🚀 Deploy MediGuard AI to Hugging Face Spaces

This guide walks you through deploying MediGuard AI to Hugging Face Spaces using Docker.

## Prerequisites

1. **Hugging Face Account** — [Sign up free](https://huggingface.co/join)
2. **Git** — Installed on your machine
3. **API Key** — Either:
   - **Groq** (recommended) — [Get free key](https://console.groq.com/keys)
   - **Google Gemini** — [Get free key](https://aistudio.google.com/app/apikey)
## Step 1: Create a New Space

1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Fill in:
   - **Space name**: `mediguard-ai` (or your choice)
   - **License**: MIT
   - **SDK**: Select **Docker**
   - **Hardware**: **CPU Basic** (free tier works!)
3. Click **Create Space**

## Step 2: Clone Your Space

```bash
# Clone the empty space
git clone https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai
cd mediguard-ai
```
## Step 3: Copy Project Files

Copy all files from this repository to your space folder:

```bash
# Option A: If you have the RagBot repo locally
cp -r /path/to/RagBot/* .

# Option B: Clone fresh
git clone https://github.com/yourusername/ragbot temp
cp -r temp/* .
rm -rf temp
```

## Step 4: Set Up Dockerfile for Spaces

Hugging Face Spaces expects the Dockerfile in the root. Copy the HF-optimized Dockerfile:

```bash
# Copy the HF Spaces Dockerfile to root
cp huggingface/Dockerfile ./Dockerfile
```

**Or** update your root `Dockerfile` to match the HF Spaces version.

## Step 5: Set Up README (Important!)

The README.md must have the HF Spaces metadata header. Copy the HF README:

```bash
# Backup original README
mv README.md README_original.md

# Use HF Spaces README
cp huggingface/README.md ./README.md
```
## Step 6: Add Your API Key (Secret)

1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai`
2. Click **Settings** tab
3. Scroll to **Repository Secrets**
4. Add a new secret:
   - **Name**: `GROQ_API_KEY` (or `GOOGLE_API_KEY`)
   - **Value**: Your API key
5. Click **Add**
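Inside the running Space, a repository secret surfaces as an ordinary environment variable, so the app can read it at startup. A minimal sketch of that check, assuming the secret names configured above; the `pick_provider` helper is illustrative and not part of the repo:

```python
import os

def pick_provider() -> str:
    """Return which LLM provider the Space can use, based on env secrets.

    Illustrative helper: HF Spaces injects repository secrets as
    environment variables with the exact names set in Settings -> Secrets.
    """
    if os.environ.get("GROQ_API_KEY"):
        return "groq"
    if os.environ.get("GOOGLE_API_KEY"):
        return "gemini"
    raise RuntimeError("No LLM API key configured")

# Example: simulate a Space where only the Groq secret is set
os.environ.pop("GOOGLE_API_KEY", None)
os.environ["GROQ_API_KEY"] = "gsk_example"
print(pick_provider())  # groq
```

Because the names are case-sensitive (see Troubleshooting below), a typo in the secret name fails only at this lookup, which is why failing fast with a clear error message helps.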
## Step 7: Push to Deploy

```bash
# Add all files
git add .

# Commit
git commit -m "Deploy MediGuard AI"

# Push to Hugging Face
git push
```

## Step 8: Monitor Deployment

1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai`
2. Click the **Logs** tab to watch the build
3. Build takes ~5-10 minutes (first time)
4. Once "Running", your app is live! 🎉

## 🔧 Troubleshooting

### "No LLM API key configured"

- Make sure you added `GROQ_API_KEY` or `GOOGLE_API_KEY` in Space Settings → Secrets
- Secret names are case-sensitive

### Build fails with "No space disk"

- Hugging Face free tier has limited disk space
- The FAISS vector store might be too large
- Solution: Upgrade to a paid tier or reduce vector store size

### "ModuleNotFoundError"

- Check that all dependencies are in `huggingface/requirements.txt`
- The Dockerfile should install from this file

### App crashes on startup

- Check Logs for the actual error
- Common issue: missing environment variables
- Increase Space hardware if you hit an OOM error
+
## 📁 File Structure for Deployment
|
| 123 |
+
|
| 124 |
+
Your Space should have this structure:
|
| 125 |
+
|
| 126 |
+
```
|
| 127 |
+
your-space/
|
| 128 |
+
├── Dockerfile # HF Spaces Dockerfile (from huggingface/)
|
| 129 |
+
├── README.md # HF Spaces README with metadata
|
| 130 |
+
├── huggingface/
|
| 131 |
+
│ ├── app.py # Standalone Gradio app
|
| 132 |
+
│ ├── requirements.txt # Minimal deps for HF
|
| 133 |
+
│ └── README.md # Original HF README
|
| 134 |
+
├── src/ # Core application code
|
| 135 |
+
│ ├── workflow.py
|
| 136 |
+
│ ├── state.py
|
| 137 |
+
│ ├── llm_config.py
|
| 138 |
+
│ ├── pdf_processor.py
|
| 139 |
+
│ ├── agents/
|
| 140 |
+
│ └── ...
|
| 141 |
+
├── data/
|
| 142 |
+
│ └── vector_stores/
|
| 143 |
+
│ ├── medical_knowledge.faiss
|
| 144 |
+
│ └── medical_knowledge.pkl
|
| 145 |
+
└── config/
|
| 146 |
+
└── biomarker_references.json
|
| 147 |
+
```
|
| 148 |
+
|
| 149 |
+
## 🔄 Updating Your Space
|
| 150 |
+
|
| 151 |
+
To update after making changes:
|
| 152 |
+
|
| 153 |
+
```bash
|
| 154 |
+
git add .
|
| 155 |
+
git commit -m "Update: description of changes"
|
| 156 |
+
git push
|
| 157 |
+
```
|
| 158 |
+
|
| 159 |
+
Hugging Face will automatically rebuild and redeploy.
|
| 160 |
+
|
| 161 |
+
## 💰 Hardware Options
|
| 162 |
+
|
| 163 |
+
| Tier | RAM | vCPU | Cost | Best For |
|
| 164 |
+
|------|-----|------|------|----------|
|
| 165 |
+
| CPU Basic | 2GB | 2 | Free | Demo/Testing |
|
| 166 |
+
| CPU Upgrade | 8GB | 4 | ~$0.03/hr | Production |
|
| 167 |
+
| T4 Small | 16GB | 4 | ~$0.06/hr | Heavy usage |
|
| 168 |
+
|
| 169 |
+
The free tier works for demos. Upgrade if you experience timeouts.
|
| 170 |
+
|
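For budgeting, the hourly rates in the table above translate to a monthly cost for an always-on Space. A quick back-of-envelope calculation (assuming the approximate prices listed and a 30-day month):

```python
# Rough monthly cost for an always-on Space, from the hourly rates above.
HOURS_PER_MONTH = 24 * 30  # 720

for tier, rate in [("CPU Upgrade", 0.03), ("T4 Small", 0.06)]:
    print(f"{tier}: ~${rate * HOURS_PER_MONTH:.2f}/month")
# CPU Upgrade: ~$21.60/month
# T4 Small: ~$43.20/month
```

Spaces that sleep when idle bill fewer hours, so real costs are usually lower than this always-on upper bound.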
## 🎉 Your Space is Live!

Once deployed, share your Space URL:

```
https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai
```

Anyone can now use MediGuard AI without any setup!

---

## Quick Commands Reference

```bash
# Clone your space
git clone https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai

# Set up remote (if needed)
git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai

# Push changes
git push origin main

# Force rebuild (if stuck)
# Go to Settings → Factory Reset
```

## Need Help?

- [Hugging Face Spaces Docs](https://huggingface.co/docs/hub/spaces)
- [Docker on Spaces](https://huggingface.co/docs/hub/spaces-sdks-docker)
- [Spaces Secrets](https://huggingface.co/docs/hub/spaces-secrets)
Dockerfile
CHANGED

# ===========================================================================
# MediGuard AI — Hugging Face Spaces Dockerfile
# ===========================================================================
# Optimized single-container deployment for Hugging Face Spaces.
# Uses FAISS vector store + Cloud LLMs (Groq/Gemini) - no external services.
# ===========================================================================

FROM python:3.11-slim

# Non-interactive apt
ENV DEBIAN_FRONTEND=noninteractive

# Python settings
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

# HuggingFace Spaces runs on port 7860
ENV GRADIO_SERVER_NAME="0.0.0.0" \
    GRADIO_SERVER_PORT=7860

# Default to HuggingFace embeddings (local, no API key needed)
ENV EMBEDDING_PROVIDER=huggingface

WORKDIR /app

# System dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    build-essential \
    curl \
    git \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first (cache layer)
COPY huggingface/requirements.txt ./requirements.txt
RUN pip install --upgrade pip && \
    pip install -r requirements.txt

# Copy the entire project
COPY . .

# Create necessary directories and ensure vector store exists
RUN mkdir -p data/medical_pdfs data/vector_stores data/chat_reports

# Create non-root user (HF Spaces requirement)
RUN useradd -m -u 1000 user

# Make app writable by user
RUN chown -R user:user /app

USER user
ENV HOME=/home/user \
    PATH=/home/user/.local/bin:$PATH

WORKDIR /app

EXPOSE 7860

# Health check
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
    CMD curl -sf http://localhost:7860/ || exit 1

# Launch Gradio app
CMD ["python", "huggingface/app.py"]
README.md
CHANGED
@@ -1,3 +1,22 @@
+---
+title: Agentic RagBot
+emoji: 🏥
+colorFrom: blue
+colorTo: indigo
+sdk: docker
+pinned: true
+license: mit
+app_port: 7860
+tags:
+  - medical
+  - biomarker
+  - rag
+  - healthcare
+  - langgraph
+  - agents
+short_description: Multi-Agent RAG System for Medical Biomarker Analysis
+---
+
 # RagBot: Multi-Agent RAG System for Medical Biomarker Analysis
 
 A production-ready biomarker analysis system combining 6 specialized AI agents with medical knowledge retrieval to provide evidence-based insights on blood test results in **15-25 seconds**.
alembic.ini
ADDED
# A generic, single database configuration.

[alembic]
# path to migration scripts.
# this is typically a path given in POSIX (e.g. forward slashes)
# format, relative to the token %(here)s which refers to the location of this
# ini file
script_location = %(here)s/alembic

# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
# Uncomment the line below if you want the files to be prepended with date and time
# see https://alembic.sqlalchemy.org/en/latest/tutorial.html#editing-the-ini-file
# for all available tokens
# file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s
# Or organize into date-based subdirectories (requires recursive_version_locations = true)
# file_template = %%(year)d/%%(month).2d/%%(day).2d_%%(hour).2d%%(minute).2d_%%(second).2d_%%(rev)s_%%(slug)s

# sys.path path, will be prepended to sys.path if present.
# defaults to the current working directory. for multiple paths, the path separator
# is defined by "path_separator" below.
prepend_sys_path = .


# timezone to use when rendering the date within the migration file
# as well as the filename.
# If specified, requires the tzdata library which can be installed by adding
# `alembic[tz]` to the pip requirements.
# string value is passed to ZoneInfo()
# leave blank for localtime
# timezone =

# max length of characters to apply to the "slug" field
# truncate_slug_length = 40

# set to 'true' to run the environment during
# the 'revision' command, regardless of autogenerate
# revision_environment = false

# set to 'true' to allow .pyc and .pyo files without
# a source .py file to be detected as revisions in the
# versions/ directory
# sourceless = false

# version location specification; This defaults
# to <script_location>/versions. When using multiple version
# directories, initial revisions must be specified with --version-path.
# The path separator used here should be the separator specified by "path_separator"
# below.
# version_locations = %(here)s/bar:%(here)s/bat:%(here)s/alembic/versions

# path_separator; This indicates what character is used to split lists of file
# paths, including version_locations and prepend_sys_path within configparser
# files such as alembic.ini.
# The default rendered in new alembic.ini files is "os", which uses os.pathsep
# to provide os-dependent path splitting.
#
# Note that in order to support legacy alembic.ini files, this default does NOT
# take place if path_separator is not present in alembic.ini. If this
# option is omitted entirely, fallback logic is as follows:
#
# 1. Parsing of the version_locations option falls back to using the legacy
#    "version_path_separator" key, which if absent then falls back to the legacy
#    behavior of splitting on spaces and/or commas.
# 2. Parsing of the prepend_sys_path option falls back to the legacy
#    behavior of splitting on spaces, commas, or colons.
#
# Valid values for path_separator are:
#
# path_separator = :
# path_separator = ;
# path_separator = space
# path_separator = newline
#
# Use os.pathsep. Default configuration used for new projects.
path_separator = os

# set to 'true' to search source files recursively
# in each "version_locations" directory
# new in Alembic version 1.10
# recursive_version_locations = false

# the output encoding used when revision files
# are written from script.py.mako
# output_encoding = utf-8

# database URL. This is consumed by the user-maintained env.py script only.
# other means of configuring database URLs may be customized within the env.py
# file.
sqlalchemy.url = driver://user:pass@localhost/dbname


[post_write_hooks]
# post_write_hooks defines scripts or Python functions that are run
# on newly generated revision scripts. See the documentation for further
# detail and examples

# format using "black" - use the console_scripts runner, against the "black" entrypoint
# hooks = black
# black.type = console_scripts
# black.entrypoint = black
# black.options = -l 79 REVISION_SCRIPT_FILENAME

# lint with attempts to fix using "ruff" - use the module runner, against the "ruff" module
# hooks = ruff
# ruff.type = module
# ruff.module = ruff
# ruff.options = check --fix REVISION_SCRIPT_FILENAME

# Alternatively, use the exec runner to execute a binary found on your PATH
# hooks = ruff
# ruff.type = exec
# ruff.executable = ruff
# ruff.options = check --fix REVISION_SCRIPT_FILENAME

# Logging configuration. This is also consumed by the user-maintained
# env.py script only.
[loggers]
keys = root,sqlalchemy,alembic

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = WARNING
handlers = console
qualname =

[logger_sqlalchemy]
level = WARNING
handlers =
qualname = sqlalchemy.engine

[logger_alembic]
level = INFO
handlers =
qualname = alembic

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic

[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S
alembic/README
ADDED
Generic single-database configuration.
alembic/env.py
ADDED
from logging.config import fileConfig

from sqlalchemy import engine_from_config
from sqlalchemy import pool, create_engine

from alembic import context

# ---------------------------------------------------------------------------
# MediGuard AI — Alembic env.py
# Pull DB URL from settings so we never hard-code credentials.
# ---------------------------------------------------------------------------
import sys
import os

# Make sure the project root is on sys.path
sys.path.insert(0, os.path.dirname(os.path.dirname(__file__)))

from src.settings import get_settings  # noqa: E402
from src.database import Base  # noqa: E402

# Import all models so Alembic's autogenerate can see them
import src.models.analysis  # noqa: F401, E402

# this is the Alembic Config object, which provides
# access to the values within the .ini file in use.
config = context.config

# Interpret the config file for Python logging.
# This line sets up loggers basically.
if config.config_file_name is not None:
    fileConfig(config.config_file_name)

# Override sqlalchemy.url from our Pydantic Settings
_settings = get_settings()
config.set_main_option("sqlalchemy.url", _settings.postgres.database_url)

# Metadata used for autogenerate
target_metadata = Base.metadata

# other values from the config, defined by the needs of env.py,
# can be acquired:
# my_important_option = config.get_main_option("my_important_option")
# ... etc.


def run_migrations_offline() -> None:
    """Run migrations in 'offline' mode.

    This configures the context with just a URL
    and not an Engine, though an Engine is acceptable
    here as well. By skipping the Engine creation
    we don't even need a DBAPI to be available.

    Calls to context.execute() here emit the given string to the
    script output.

    """
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
    )

    with context.begin_transaction():
        context.run_migrations()


def run_migrations_online() -> None:
    """Run migrations in 'online' mode.

    In this scenario we need to create an Engine
    and associate a connection with the context.

    """
    connectable = engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )

    with connectable.connect() as connection:
        context.configure(
            connection=connection, target_metadata=target_metadata
        )

        with context.begin_transaction():
            context.run_migrations()


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
alembic/script.py.mako
ADDED
"""${message}

Revision ID: ${up_revision}
Revises: ${down_revision | comma,n}
Create Date: ${create_date}

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa
${imports if imports else ""}

# revision identifiers, used by Alembic.
revision: str = ${repr(up_revision)}
down_revision: Union[str, Sequence[str], None] = ${repr(down_revision)}
branch_labels: Union[str, Sequence[str], None] = ${repr(branch_labels)}
depends_on: Union[str, Sequence[str], None] = ${repr(depends_on)}


def upgrade() -> None:
    """Upgrade schema."""
    ${upgrades if upgrades else "pass"}


def downgrade() -> None:
    """Downgrade schema."""
    ${downgrades if downgrades else "pass"}
data/vector_stores/medical_knowledge.faiss
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:e9dee84846c00eda0f0a5487b61c2dd9cc85588ee0cbbcb576df24e8881969e1
size 4007469
data/vector_stores/medical_knowledge.pkl
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:690fa693a48c3eb5e0a1fc11b7008a9037630928d9c8a634a31e7f90d8e2f7fb
size 2727206
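The two vector-store entries above are git-lfs pointer files, not the binary data itself: the repository stores only a version line, a sha256 object id, and a byte size, while the actual `.faiss`/`.pkl` blobs live in LFS storage. A small sketch that parses such a pointer into its fields (the pointer text is copied from the `.faiss` entry above):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a git-lfs pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # each line is "key value"
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is the blob's byte count
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:e9dee84846c00eda0f0a5487b61c2dd9cc85588ee0cbbcb576df24e8881969e1
size 4007469
"""
info = parse_lfs_pointer(pointer)
print(info["size"])  # 4007469
```

This is why the `.gitattributes` change above matters: without the `filter=lfs` rules, the multi-megabyte binaries would be committed directly instead of as these small pointers.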
docker-compose.yml
CHANGED
@@ -76,12 +76,13 @@ services:
     restart: unless-stopped
 
   opensearch:
-    image: opensearchproject/opensearch:2.
+    image: opensearchproject/opensearch:2.11.1
     container_name: mediguard-opensearch
     environment:
       - discovery.type=single-node
       - DISABLE_SECURITY_PLUGIN=true
-
+      - plugins.security.disabled=true
+      - "OPENSEARCH_JAVA_OPTS=-Xms256m -Xmx256m"
       - bootstrap.memory_lock=true
     ulimits:
       memlock: { soft: -1, hard: -1 }
@@ -94,21 +95,22 @@ services:
       test: ["CMD-SHELL", "curl -sf http://localhost:9200/_cluster/health || exit 1"]
       interval: 10s
       timeout: 5s
-      retries:
+      retries: 24
     restart: unless-stopped
 
-  opensearch-dashboards:
-
-
-
-
-
-
-
-
-
-
+  # opensearch-dashboards: disabled by default — uncomment if you need the UI
+  # opensearch-dashboards:
+  #   image: opensearchproject/opensearch-dashboards:2.11.1
+  #   container_name: mediguard-os-dashboards
+  #   environment:
+  #     - OPENSEARCH_HOSTS=["http://opensearch:9200"]
+  #     - DISABLE_SECURITY_DASHBOARDS_PLUGIN=true
+  #   ports:
+  #     - "${OS_DASHBOARDS_PORT:-5601}:5601"
+  #   depends_on:
+  #     opensearch:
+  #       condition: service_healthy
+  #   restart: unless-stopped
 
   redis:
     image: redis:7-alpine
huggingface/Dockerfile
ADDED
|
@@ -0,0 +1,66 @@
|
| 1 |
+
# ===========================================================================
|
| 2 |
+
# MediGuard AI — Hugging Face Spaces Dockerfile
|
| 3 |
+
# ===========================================================================
|
| 4 |
+
# Optimized single-container deployment for Hugging Face Spaces.
|
| 5 |
+
# Uses FAISS vector store + Cloud LLMs (Groq/Gemini) - no external services.
|
| 6 |
+
# ===========================================================================
|
| 7 |
+
|
| 8 |
+
FROM python:3.11-slim
|
| 9 |
+
|
| 10 |
+
# Non-interactive apt
|
| 11 |
+
ENV DEBIAN_FRONTEND=noninteractive
|
| 12 |
+
|
| 13 |
+
# Python settings
|
| 14 |
+
ENV PYTHONDONTWRITEBYTECODE=1 \
|
| 15 |
+
PYTHONUNBUFFERED=1 \
|
| 16 |
+
PIP_NO_CACHE_DIR=1 \
|
| 17 |
+
PIP_DISABLE_PIP_VERSION_CHECK=1
|
| 18 |
+
|
| 19 |
+
# HuggingFace Spaces runs on port 7860
|
| 20 |
+
ENV GRADIO_SERVER_NAME="0.0.0.0" \
|
| 21 |
+
GRADIO_SERVER_PORT=7860
|
| 22 |
+
|
| 23 |
+
# Default to HuggingFace embeddings (local, no API key needed)
|
| 24 |
+
ENV EMBEDDING_PROVIDER=huggingface
|
| 25 |
+
|
| 26 |
+
WORKDIR /app
|
| 27 |
+
|
| 28 |
+
# System dependencies
|
| 29 |
+
RUN apt-get update && \
|
| 30 |
+
apt-get install -y --no-install-recommends \
|
| 31 |
+
build-essential \
|
| 32 |
+
curl \
|
| 33 |
+
git \
|
| 34 |
+
&& rm -rf /var/lib/apt/lists/*
|
| 35 |
+
|
| 36 |
+
# Copy requirements first (cache layer)
|
| 37 |
+
COPY huggingface/requirements.txt ./requirements.txt
|
| 38 |
+
RUN pip install --upgrade pip && \
|
| 39 |
+
pip install -r requirements.txt
|
| 40 |
+
|
| 41 |
+
# Copy the entire project
|
| 42 |
+
COPY . .
|
| 43 |
+
|
| 44 |
+
# Create necessary directories and ensure vector store exists
|
| 45 |
+
RUN mkdir -p data/medical_pdfs data/vector_stores data/chat_reports
|
| 46 |
+
|
| 47 |
+
# Create non-root user (HF Spaces requirement)
|
| 48 |
+
RUN useradd -m -u 1000 user
|
| 49 |
+
|
| 50 |
+
# Make app writable by user
|
| 51 |
+
RUN chown -R user:user /app
|
| 52 |
+
|
| 53 |
+
USER user
|
| 54 |
+
ENV HOME=/home/user \
|
| 55 |
+
PATH=/home/user/.local/bin:$PATH
|
| 56 |
+
|
| 57 |
+
WORKDIR /app
|
| 58 |
+
|
| 59 |
+
EXPOSE 7860
|
| 60 |
+
|
| 61 |
+
# Health check
|
| 62 |
+
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
|
| 63 |
+
CMD curl -sf http://localhost:7860/ || exit 1
|
| 64 |
+
|
| 65 |
+
# Launch Gradio app
|
| 66 |
+
CMD ["python", "huggingface/app.py"]
|
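The `GRADIO_SERVER_NAME`/`GRADIO_SERVER_PORT` variables above follow the Spaces convention of serving on 0.0.0.0:7860. A small illustrative sketch of resolving the bind address from such variables — the shipped `app.py` simply hardcodes these values, so the helper below is an assumption, not code from the repo:

```python
import os


def resolve_bind(env: dict) -> tuple[str, int]:
    """Resolve the Gradio bind address from an env mapping, with Spaces defaults."""
    host = env.get("GRADIO_SERVER_NAME", "0.0.0.0")
    port = int(env.get("GRADIO_SERVER_PORT", "7860"))
    return host, port


print(resolve_bind({}))                                  # defaults
print(resolve_bind({"GRADIO_SERVER_PORT": "8080"}))      # override via env
```

With no overrides this yields `("0.0.0.0", 7860)`, matching the `EXPOSE 7860` and `app_port: 7860` settings elsewhere in this commit.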
huggingface/README.md
ADDED
|
@@ -0,0 +1,111 @@
|
| 1 |
+
---
|
| 2 |
+
title: MediGuard AI
|
| 3 |
+
emoji: 🏥
|
| 4 |
+
colorFrom: blue
|
| 5 |
+
colorTo: cyan
|
| 6 |
+
sdk: docker
|
| 7 |
+
pinned: true
|
| 8 |
+
license: mit
|
| 9 |
+
app_port: 7860
|
| 10 |
+
models:
|
| 11 |
+
- meta-llama/Llama-3.3-70B-Instruct
|
| 12 |
+
tags:
|
| 13 |
+
- medical
|
| 14 |
+
- biomarker
|
| 15 |
+
- rag
|
| 16 |
+
- healthcare
|
| 17 |
+
- langgraph
|
| 18 |
+
- agents
|
| 19 |
+
short_description: Multi-Agent RAG System for Medical Biomarker Analysis
|
| 20 |
+
---
|
| 21 |
+
|
| 22 |
+
# 🏥 MediGuard AI — Medical Biomarker Analysis
|
| 23 |
+
|
| 24 |
+
A production-ready **Multi-Agent RAG System** that analyzes blood test biomarkers using 6 specialized AI agents with medical knowledge retrieval.
|
| 25 |
+
|
| 26 |
+
## ✨ Features
|
| 27 |
+
|
| 28 |
+
- **6 Specialist AI Agents** — Biomarker validation, disease prediction, RAG-powered analysis, confidence assessment
|
| 29 |
+
- **Medical Knowledge Base** — 750+ pages of clinical guidelines (FAISS vector store)
|
| 30 |
+
- **Evidence-Based** — All recommendations backed by retrieved medical literature
|
| 31 |
+
- **Free Cloud LLMs** — Uses Groq (LLaMA 3.3-70B) or Google Gemini
|
| 32 |
+
|
| 33 |
+
## 🚀 Quick Start
|
| 34 |
+
|
| 35 |
+
1. **Enter your biomarkers** in any format:
|
| 36 |
+
- `Glucose: 140, HbA1c: 7.5`
|
| 37 |
+
- `My glucose is 140 and HbA1c is 7.5`
|
| 38 |
+
- `{"Glucose": 140, "HbA1c": 7.5}`
|
| 39 |
+
|
| 40 |
+
2. **Click Analyze** and get:
|
| 41 |
+
- Primary diagnosis with confidence score
|
| 42 |
+
- Critical alerts and safety flags
|
| 43 |
+
- Biomarker analysis with normal ranges
|
| 44 |
+
- Evidence-based recommendations
|
| 45 |
+
- Disease pathophysiology explanation
|
| 46 |
+
|
| 47 |
+
## 🔧 Configuration
|
| 48 |
+
|
| 49 |
+
This Space requires an LLM API key. Add one of these secrets in Space Settings:
|
| 50 |
+
|
| 51 |
+
| Secret | Provider | Get Free Key |
|
| 52 |
+
|--------|----------|--------------|
|
| 53 |
+
| `GROQ_API_KEY` | Groq (recommended) | [console.groq.com/keys](https://console.groq.com/keys) |
|
| 54 |
+
| `GOOGLE_API_KEY` | Google Gemini | [aistudio.google.com](https://aistudio.google.com/app/apikey) |
|
| 55 |
+
|
| 56 |
+
## 🏗️ Architecture
|
| 57 |
+
|
| 58 |
+
```
|
| 59 |
+
┌─────────────────────────────────────────────────────────┐
|
| 60 |
+
│ Clinical Insight Guild │
|
| 61 |
+
├─────────────────────────────────────────────────────────┤
|
| 62 |
+
│ ┌───────────────────────────────────────────────────┐ │
|
| 63 |
+
│ │ 1. Biomarker Analyzer │ │
|
| 64 |
+
│ │ Validates values, flags abnormalities │ │
|
| 65 |
+
│ └───────────────────┬───────────────────────────────┘ │
|
| 66 |
+
│ │ │
|
| 67 |
+
│ ┌────────────┼────────────┐ │
|
| 68 |
+
│ ▼ ▼ ▼ │
|
| 69 |
+
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
|
| 70 |
+
│ │ Disease │ │Biomarker │ │ Clinical │ │
|
| 71 |
+
│ │Explainer │ │ Linker │ │Guidelines│ │
|
| 72 |
+
│ │ (RAG) │ │ │ │ (RAG) │ │
|
| 73 |
+
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
|
| 74 |
+
│ │ │ │ │
|
| 75 |
+
│ └────────────┼────────────┘ │
|
| 76 |
+
│ ▼ │
|
| 77 |
+
│ ┌───────────────────────────────────────────────────┐ │
|
| 78 |
+
│  │ 5. Confidence Assessor                            │  │
|
| 79 |
+
│ │ Evaluates reliability, assigns scores │ │
|
| 80 |
+
│ └───────────────────┬───────────────────────────────┘ │
|
| 81 |
+
│ ▼ │
|
| 82 |
+
│ ┌───────────────────────────────────────────────────┐ │
|
| 83 |
+
│  │ 6. Response Synthesizer                           │  │
|
| 84 |
+
│ │ Compiles patient-friendly summary │ │
|
| 85 |
+
│ └───────────────────────────────────────────────────┘ │
|
| 86 |
+
└─────────────────────────────────────────────────────────┘
|
| 87 |
+
```
|
| 88 |
+
|
| 89 |
+
## 📊 Supported Biomarkers
|
| 90 |
+
|
| 91 |
+
| Category | Biomarkers |
|
| 92 |
+
|----------|------------|
|
| 93 |
+
| **Diabetes** | Glucose, HbA1c, Fasting Glucose, Insulin |
|
| 94 |
+
| **Lipids** | Cholesterol, LDL, HDL, Triglycerides |
|
| 95 |
+
| **Kidney** | Creatinine, BUN, eGFR |
|
| 96 |
+
| **Liver** | ALT, AST, Bilirubin, Albumin |
|
| 97 |
+
| **Thyroid** | TSH, T3, T4, Free T4 |
|
| 98 |
+
| **Blood** | Hemoglobin, WBC, RBC, Platelets |
|
| 99 |
+
| **Cardiac** | Troponin, BNP, CRP |
|
| 100 |
+
|
| 101 |
+
## ⚠️ Medical Disclaimer
|
| 102 |
+
|
| 103 |
+
This tool is for **informational purposes only** and does not replace professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider with questions regarding a medical condition.
|
| 104 |
+
|
| 105 |
+
## 📄 License
|
| 106 |
+
|
| 107 |
+
MIT License — See [GitHub Repository](https://github.com/yourusername/ragbot) for details.
|
| 108 |
+
|
| 109 |
+
## 🙏 Acknowledgments
|
| 110 |
+
|
| 111 |
+
Built with [LangGraph](https://langchain-ai.github.io/langgraph/), [FAISS](https://faiss.ai/), [Gradio](https://gradio.app/), and [Groq](https://groq.com/).
|
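The input formats shown in the Quick Start all collapse to the same `{name: value}` dict. A minimal, self-contained sketch of that normalization — it mirrors the `parse_biomarkers` helper in `huggingface/app.py`, trimmed to the two main cases:

```python
import json
import re


def parse_biomarkers(text: str) -> dict[str, float]:
    """Normalize free-form biomarker input into a {name: value} dict."""
    text = text.strip()
    # JSON input: {"Glucose": 140, "HbA1c": 7.5}
    if text.startswith("{"):
        try:
            return {k: float(v) for k, v in json.loads(text).items()}
        except (json.JSONDecodeError, TypeError, ValueError):
            pass
    # "Glucose: 140" / "Glucose = 140" pairs anywhere in the text
    pairs = re.findall(r"([A-Za-z0-9_]+)\s*[:=]\s*([\d.]+)", text)
    return {name: float(value) for name, value in pairs}


print(parse_biomarkers("Glucose: 140, HbA1c: 7.5"))
# → {'Glucose': 140.0, 'HbA1c': 7.5}
```

Both the colon-separated and JSON examples from the Quick Start produce the same dict, which is what the guild receives as `PatientInput.biomarkers`.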
huggingface/app.py
ADDED
|
@@ -0,0 +1,532 @@
|
| 1 |
+
"""
|
| 2 |
+
MediGuard AI — Hugging Face Spaces Gradio App
|
| 3 |
+
|
| 4 |
+
Standalone deployment that uses:
|
| 5 |
+
- FAISS vector store (local)
|
| 6 |
+
- Cloud LLMs (Groq or Gemini - FREE tiers)
|
| 7 |
+
- No external services required
|
| 8 |
+
"""
|
| 9 |
+
|
| 10 |
+
from __future__ import annotations
|
| 11 |
+
|
| 12 |
+
import json
|
| 13 |
+
import logging
|
| 14 |
+
import os
|
| 15 |
+
import sys
|
| 16 |
+
import time
|
| 17 |
+
import traceback
|
| 18 |
+
from pathlib import Path
|
| 19 |
+
from typing import Any, Optional
|
| 20 |
+
|
| 21 |
+
# Ensure project root is in path
|
| 22 |
+
_project_root = str(Path(__file__).parent.parent)
|
| 23 |
+
if _project_root not in sys.path:
|
| 24 |
+
sys.path.insert(0, _project_root)
|
| 25 |
+
os.chdir(_project_root)
|
| 26 |
+
|
| 27 |
+
import gradio as gr
|
| 28 |
+
|
| 29 |
+
logging.basicConfig(
|
| 30 |
+
level=logging.INFO,
|
| 31 |
+
format="%(asctime)s | %(name)-20s | %(levelname)-7s | %(message)s",
|
| 32 |
+
)
|
| 33 |
+
logger = logging.getLogger("mediguard.huggingface")
|
| 34 |
+
|
| 35 |
+
# ---------------------------------------------------------------------------
|
| 36 |
+
# Configuration
|
| 37 |
+
# ---------------------------------------------------------------------------
|
| 38 |
+
|
| 39 |
+
# Check for required API keys
|
| 40 |
+
GROQ_API_KEY = os.getenv("GROQ_API_KEY", "")
|
| 41 |
+
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY", "")
|
| 42 |
+
|
| 43 |
+
if not GROQ_API_KEY and not GOOGLE_API_KEY:
|
| 44 |
+
logger.warning(
|
| 45 |
+
"No LLM API key found. Set GROQ_API_KEY or GOOGLE_API_KEY environment variable."
|
| 46 |
+
)
|
| 47 |
+
|
| 48 |
+
# Set default provider based on available keys
|
| 49 |
+
if GROQ_API_KEY:
|
| 50 |
+
os.environ.setdefault("LLM_PROVIDER", "groq")
|
| 51 |
+
elif GOOGLE_API_KEY:
|
| 52 |
+
os.environ.setdefault("LLM_PROVIDER", "gemini")
|
| 53 |
+
|
| 54 |
+
|
| 55 |
+
# ---------------------------------------------------------------------------
|
| 56 |
+
# Guild Initialization (lazy)
|
| 57 |
+
# ---------------------------------------------------------------------------
|
| 58 |
+
|
| 59 |
+
_guild = None
|
| 60 |
+
_guild_error = None
|
| 61 |
+
|
| 62 |
+
|
| 63 |
+
def get_guild():
|
| 64 |
+
"""Lazy initialization of the Clinical Insight Guild."""
|
| 65 |
+
global _guild, _guild_error
|
| 66 |
+
|
| 67 |
+
if _guild is not None:
|
| 68 |
+
return _guild
|
| 69 |
+
|
| 70 |
+
if _guild_error is not None:
|
| 71 |
+
raise _guild_error
|
| 72 |
+
|
| 73 |
+
try:
|
| 74 |
+
logger.info("Initializing Clinical Insight Guild...")
|
| 75 |
+
start = time.time()
|
| 76 |
+
|
| 77 |
+
from src.workflow import create_guild
|
| 78 |
+
_guild = create_guild()
|
| 79 |
+
|
| 80 |
+
elapsed = time.time() - start
|
| 81 |
+
logger.info(f"Guild initialized in {elapsed:.1f}s")
|
| 82 |
+
return _guild
|
| 83 |
+
|
| 84 |
+
except Exception as exc:
|
| 85 |
+
logger.error(f"Failed to initialize guild: {exc}")
|
| 86 |
+
_guild_error = exc
|
| 87 |
+
raise
|
| 88 |
+
|
| 89 |
+
|
| 90 |
+
# ---------------------------------------------------------------------------
|
| 91 |
+
# Analysis Functions
|
| 92 |
+
# ---------------------------------------------------------------------------
|
| 93 |
+
|
| 94 |
+
def parse_biomarkers(text: str) -> dict[str, float]:
|
| 95 |
+
"""
|
| 96 |
+
Parse biomarkers from natural language text.
|
| 97 |
+
|
| 98 |
+
Supports formats like:
|
| 99 |
+
- "Glucose: 140, HbA1c: 7.5"
|
| 100 |
+
- "glucose 140 hba1c 7.5"
|
| 101 |
+
- {"Glucose": 140, "HbA1c": 7.5}
|
| 102 |
+
"""
|
| 103 |
+
text = text.strip()
|
| 104 |
+
|
| 105 |
+
# Try JSON first
|
| 106 |
+
if text.startswith("{"):
|
| 107 |
+
try:
|
| 108 |
+
return json.loads(text)
|
| 109 |
+
except json.JSONDecodeError:
|
| 110 |
+
pass
|
| 111 |
+
|
| 112 |
+
# Parse natural language
|
| 113 |
+
import re
|
| 114 |
+
|
| 115 |
+
# Common biomarker patterns
|
| 116 |
+
patterns = [
|
| 117 |
+
# "Glucose: 140" or "Glucose = 140"
|
| 118 |
+
r"([A-Za-z0-9_]+)\s*[:=]\s*([\d.]+)",
|
| 119 |
+
# "Glucose 140 mg/dL"
|
| 120 |
+
r"([A-Za-z0-9_]+)\s+([\d.]+)\s*(?:mg/dL|mmol/L|%|g/dL|U/L|mIU/L)?",
|
| 121 |
+
]
|
| 122 |
+
|
| 123 |
+
biomarkers = {}
|
| 124 |
+
|
| 125 |
+
for pattern in patterns:
|
| 126 |
+
matches = re.findall(pattern, text, re.IGNORECASE)
|
| 127 |
+
for name, value in matches:
|
| 128 |
+
try:
|
| 129 |
+
biomarkers[name.strip()] = float(value)
|
| 130 |
+
except ValueError:
|
| 131 |
+
continue
|
| 132 |
+
|
| 133 |
+
return biomarkers
|
| 134 |
+
|
| 135 |
+
|
| 136 |
+
def analyze_biomarkers(input_text: str, progress=gr.Progress()) -> tuple[str, str, str]:
|
| 137 |
+
"""
|
| 138 |
+
Analyze biomarkers using the Clinical Insight Guild.
|
| 139 |
+
|
| 140 |
+
Returns: (summary, details_json, status)
|
| 141 |
+
"""
|
| 142 |
+
if not input_text.strip():
|
| 143 |
+
return "", "", "⚠️ Please enter biomarkers to analyze."
|
| 144 |
+
|
| 145 |
+
# Check API key
|
| 146 |
+
if not GROQ_API_KEY and not GOOGLE_API_KEY:
|
| 147 |
+
return "", "", (
|
| 148 |
+
"❌ **Error**: No LLM API key configured.\n\n"
|
| 149 |
+
"Please add your API key in Hugging Face Space Settings → Secrets:\n"
|
| 150 |
+
"- `GROQ_API_KEY` (get free at https://console.groq.com/keys)\n"
|
| 151 |
+
"- or `GOOGLE_API_KEY` (get free at https://aistudio.google.com/app/apikey)"
|
| 152 |
+
)
|
| 153 |
+
|
| 154 |
+
try:
|
| 155 |
+
progress(0.1, desc="Parsing biomarkers...")
|
| 156 |
+
biomarkers = parse_biomarkers(input_text)
|
| 157 |
+
|
| 158 |
+
if not biomarkers:
|
| 159 |
+
return "", "", (
|
| 160 |
+
"⚠️ Could not parse biomarkers. Try formats like:\n"
|
| 161 |
+
"• `Glucose: 140, HbA1c: 7.5`\n"
|
| 162 |
+
"• `{\"Glucose\": 140, \"HbA1c\": 7.5}`"
|
| 163 |
+
)
|
| 164 |
+
|
| 165 |
+
progress(0.2, desc="Initializing analysis...")
|
| 166 |
+
|
| 167 |
+
# Initialize guild
|
| 168 |
+
guild = get_guild()
|
| 169 |
+
|
| 170 |
+
# Prepare input
|
| 171 |
+
from src.state import PatientInput
|
| 172 |
+
|
| 173 |
+
# Auto-generate prediction based on common patterns
|
| 174 |
+
prediction = auto_predict(biomarkers)
|
| 175 |
+
|
| 176 |
+
patient_input = PatientInput(
|
| 177 |
+
biomarkers=biomarkers,
|
| 178 |
+
model_prediction=prediction,
|
| 179 |
+
patient_context={"patient_id": "HF_User", "source": "huggingface_spaces"}
|
| 180 |
+
)
|
| 181 |
+
|
| 182 |
+
progress(0.4, desc="Running Clinical Insight Guild...")
|
| 183 |
+
|
| 184 |
+
# Run analysis
|
| 185 |
+
start = time.time()
|
| 186 |
+
result = guild.run(patient_input)
|
| 187 |
+
elapsed = time.time() - start
|
| 188 |
+
|
| 189 |
+
progress(0.9, desc="Formatting results...")
|
| 190 |
+
|
| 191 |
+
# Extract response
|
| 192 |
+
final_response = result.get("final_response", {})
|
| 193 |
+
|
| 194 |
+
# Format summary
|
| 195 |
+
summary = format_summary(final_response, elapsed)
|
| 196 |
+
|
| 197 |
+
# Format details
|
| 198 |
+
details = json.dumps(final_response, indent=2, default=str)
|
| 199 |
+
|
| 200 |
+
status = f"✅ Analysis completed in {elapsed:.1f}s"
|
| 201 |
+
|
| 202 |
+
return summary, details, status
|
| 203 |
+
|
| 204 |
+
except Exception as exc:
|
| 205 |
+
logger.error(f"Analysis error: {exc}", exc_info=True)
|
| 206 |
+
return "", "", f"❌ **Error**: {exc}\n\n```\n{traceback.format_exc()}\n```"
|
| 207 |
+
|
| 208 |
+
|
| 209 |
+
def auto_predict(biomarkers: dict[str, float]) -> dict[str, Any]:
|
| 210 |
+
"""
|
| 211 |
+
Auto-generate a disease prediction based on biomarkers.
|
| 212 |
+
This simulates what an ML model would provide.
|
| 213 |
+
"""
|
| 214 |
+
# Normalize biomarker names for matching
|
| 215 |
+
normalized = {k.lower().replace(" ", ""): v for k, v in biomarkers.items()}
|
| 216 |
+
|
| 217 |
+
# Check for diabetes indicators
|
| 218 |
+
glucose = normalized.get("glucose", normalized.get("fastingglucose", 0))
|
| 219 |
+
hba1c = normalized.get("hba1c", normalized.get("hemoglobina1c", 0))
|
| 220 |
+
|
| 221 |
+
if hba1c >= 6.5 or glucose >= 126:
|
| 222 |
+
return {
|
| 223 |
+
"disease": "Diabetes",
|
| 224 |
+
"confidence": min(0.95, 0.7 + (hba1c - 6.5) * 0.1) if hba1c else 0.85,
|
| 225 |
+
"severity": "high" if hba1c >= 8 or glucose >= 200 else "moderate"
|
| 226 |
+
}
|
| 227 |
+
|
| 228 |
+
# Check for lipid disorders
|
| 229 |
+
cholesterol = normalized.get("cholesterol", normalized.get("totalcholesterol", 0))
|
| 230 |
+
ldl = normalized.get("ldl", normalized.get("ldlcholesterol", 0))
|
| 231 |
+
triglycerides = normalized.get("triglycerides", 0)
|
| 232 |
+
|
| 233 |
+
if cholesterol >= 240 or ldl >= 160 or triglycerides >= 200:
|
| 234 |
+
return {
|
| 235 |
+
"disease": "Dyslipidemia",
|
| 236 |
+
"confidence": 0.85,
|
| 237 |
+
"severity": "moderate"
|
| 238 |
+
}
|
| 239 |
+
|
| 240 |
+
# Check for anemia
|
| 241 |
+
hemoglobin = normalized.get("hemoglobin", normalized.get("hgb", normalized.get("hb", 0)))
|
| 242 |
+
|
| 243 |
+
if hemoglobin and hemoglobin < 12:
|
| 244 |
+
return {
|
| 245 |
+
"disease": "Anemia",
|
| 246 |
+
"confidence": 0.80,
|
| 247 |
+
"severity": "moderate"
|
| 248 |
+
}
|
| 249 |
+
|
| 250 |
+
# Check for thyroid issues
|
| 251 |
+
tsh = normalized.get("tsh", 0)
|
| 252 |
+
|
| 253 |
+
if tsh > 4.5:
|
| 254 |
+
return {
|
| 255 |
+
"disease": "Hypothyroidism",
|
| 256 |
+
"confidence": 0.75,
|
| 257 |
+
"severity": "moderate"
|
| 258 |
+
}
|
| 259 |
+
elif tsh and tsh < 0.4:
|
| 260 |
+
return {
|
| 261 |
+
"disease": "Hyperthyroidism",
|
| 262 |
+
"confidence": 0.75,
|
| 263 |
+
"severity": "moderate"
|
| 264 |
+
}
|
| 265 |
+
|
| 266 |
+
# Default - general health screening
|
| 267 |
+
return {
|
| 268 |
+
"disease": "General Health Screening",
|
| 269 |
+
"confidence": 0.70,
|
| 270 |
+
"severity": "low"
|
| 271 |
+
}
|
| 272 |
+
|
| 273 |
+
|
| 274 |
+
def format_summary(response: dict, elapsed: float) -> str:
|
| 275 |
+
"""Format the analysis response as readable markdown."""
|
| 276 |
+
if not response:
|
| 277 |
+
return "No analysis results available."
|
| 278 |
+
|
| 279 |
+
parts = []
|
| 280 |
+
|
| 281 |
+
# Header
|
| 282 |
+
primary = response.get("primary_finding", "Analysis")
|
| 283 |
+
confidence = response.get("confidence", {})
|
| 284 |
+
conf_score = confidence.get("overall_score", 0) if isinstance(confidence, dict) else 0
|
| 285 |
+
|
| 286 |
+
parts.append(f"## 🏥 {primary}")
|
| 287 |
+
if conf_score:
|
| 288 |
+
parts.append(f"**Confidence**: {conf_score:.0%}")
|
| 289 |
+
parts.append("")
|
| 290 |
+
|
| 291 |
+
# Critical Alerts
|
| 292 |
+
alerts = response.get("safety_alerts", [])
|
| 293 |
+
if alerts:
|
| 294 |
+
parts.append("### ⚠️ Critical Alerts")
|
| 295 |
+
for alert in alerts[:5]:
|
| 296 |
+
if isinstance(alert, dict):
|
| 297 |
+
parts.append(f"- **{alert.get('alert_type', 'Alert')}**: {alert.get('message', '')}")
|
| 298 |
+
else:
|
| 299 |
+
parts.append(f"- {alert}")
|
| 300 |
+
parts.append("")
|
| 301 |
+
|
| 302 |
+
# Key Findings
|
| 303 |
+
findings = response.get("key_findings", [])
|
| 304 |
+
if findings:
|
| 305 |
+
parts.append("### 🔍 Key Findings")
|
| 306 |
+
for finding in findings[:5]:
|
| 307 |
+
parts.append(f"- {finding}")
|
| 308 |
+
parts.append("")
|
| 309 |
+
|
| 310 |
+
# Biomarker Flags
|
| 311 |
+
flags = response.get("biomarker_flags", [])
|
| 312 |
+
if flags:
|
| 313 |
+
parts.append("### 📊 Biomarker Analysis")
|
| 314 |
+
for flag in flags[:8]:
|
| 315 |
+
if isinstance(flag, dict):
|
| 316 |
+
name = flag.get("biomarker", "Unknown")
|
| 317 |
+
status = flag.get("status", "normal")
|
| 318 |
+
value = flag.get("value", "N/A")
|
| 319 |
+
emoji = "🔴" if status == "critical" else "🟡" if status == "abnormal" else "🟢"
|
| 320 |
+
parts.append(f"- {emoji} **{name}**: {value} ({status})")
|
| 321 |
+
else:
|
| 322 |
+
parts.append(f"- {flag}")
|
| 323 |
+
parts.append("")
|
| 324 |
+
|
| 325 |
+
# Recommendations
|
| 326 |
+
recs = response.get("recommendations", {})
|
| 327 |
+
if recs:
|
| 328 |
+
parts.append("### 💡 Recommendations")
|
| 329 |
+
|
| 330 |
+
immediate = recs.get("immediate_actions", [])
|
| 331 |
+
if immediate:
|
| 332 |
+
parts.append("**Immediate Actions:**")
|
| 333 |
+
for action in immediate[:3]:
|
| 334 |
+
parts.append(f"- {action}")
|
| 335 |
+
|
| 336 |
+
lifestyle = recs.get("lifestyle_modifications", [])
|
| 337 |
+
if lifestyle:
|
| 338 |
+
parts.append("\n**Lifestyle Modifications:**")
|
| 339 |
+
for mod in lifestyle[:3]:
|
| 340 |
+
parts.append(f"- {mod}")
|
| 341 |
+
|
| 342 |
+
followup = recs.get("follow_up", [])
|
| 343 |
+
if followup:
|
| 344 |
+
parts.append("\n**Follow-up:**")
|
| 345 |
+
for item in followup[:3]:
|
| 346 |
+
parts.append(f"- {item}")
|
| 347 |
+
parts.append("")
|
| 348 |
+
|
| 349 |
+
# Disease Explanation
|
| 350 |
+
explanation = response.get("disease_explanation", {})
|
| 351 |
+
if explanation and isinstance(explanation, dict):
|
| 352 |
+
parts.append("### 📖 Understanding Your Results")
|
| 353 |
+
|
| 354 |
+
pathophys = explanation.get("pathophysiology", "")
|
| 355 |
+
if pathophys:
|
| 356 |
+
parts.append(f"{pathophys[:500]}...")
|
| 357 |
+
parts.append("")
|
| 358 |
+
|
| 359 |
+
# Conversational Summary
|
| 360 |
+
conv_summary = response.get("conversational_summary", "")
|
| 361 |
+
if conv_summary:
|
| 362 |
+
parts.append("### 📝 Summary")
|
| 363 |
+
parts.append(conv_summary[:1000])
|
| 364 |
+
parts.append("")
|
| 365 |
+
|
| 366 |
+
# Footer
|
| 367 |
+
parts.append("---")
|
| 368 |
+
parts.append(f"*Analysis completed in {elapsed:.1f}s using MediGuard AI*")
|
| 369 |
+
parts.append("")
|
| 370 |
+
parts.append("**⚠️ Disclaimer**: This is for informational purposes only. "
|
| 371 |
+
"Consult a healthcare professional for medical advice.")
|
| 372 |
+
|
| 373 |
+
return "\n".join(parts)
|
| 374 |
+
|
| 375 |
+
|
| 376 |
+
# ---------------------------------------------------------------------------
|
| 377 |
+
# Gradio Interface
|
| 378 |
+
# ---------------------------------------------------------------------------
|
| 379 |
+
|
| 380 |
+
def create_demo() -> gr.Blocks:
|
| 381 |
+
"""Create the Gradio Blocks interface."""
|
| 382 |
+
|
| 383 |
+
with gr.Blocks(
|
| 384 |
+
title="MediGuard AI - Medical Biomarker Analysis",
|
| 385 |
+
theme=gr.themes.Soft(primary_hue="blue", secondary_hue="cyan"),
|
| 386 |
+
css="""
|
| 387 |
+
.gradio-container { max-width: 1200px !important; }
|
| 388 |
+
.status-box { font-size: 14px; }
|
| 389 |
+
footer { display: none !important; }
|
| 390 |
+
"""
|
| 391 |
+
) as demo:
|
| 392 |
+
|
| 393 |
+
# Header
|
| 394 |
+
gr.Markdown("""
|
| 395 |
+
# 🏥 MediGuard AI — Medical Biomarker Analysis
|
| 396 |
+
|
| 397 |
+
**Multi-Agent RAG System** powered by 6 specialized AI agents with medical knowledge retrieval.
|
| 398 |
+
|
| 399 |
+
Enter your biomarkers below and get evidence-based insights in seconds.
|
| 400 |
+
""")
|
| 401 |
+
|
| 402 |
+
# API Key warning (if needed)
|
| 403 |
+
if not GROQ_API_KEY and not GOOGLE_API_KEY:
|
| 404 |
+
gr.Markdown("""
|
| 405 |
+
<div style="background: #ffeeba; padding: 10px; border-radius: 5px; margin: 10px 0;">
|
| 406 |
+
⚠️ <b>API Key Required</b>: Add <code>GROQ_API_KEY</code> or <code>GOOGLE_API_KEY</code>
|
| 407 |
+
in Space Settings → Secrets to enable analysis.
|
| 408 |
+
</div>
|
| 409 |
+
""")
|
| 410 |
+
|
| 411 |
+
with gr.Row():
|
| 412 |
+
# Input column
|
| 413 |
+
with gr.Column(scale=1):
|
| 414 |
+
gr.Markdown("### 📝 Enter Biomarkers")
|
| 415 |
+
|
| 416 |
+
input_text = gr.Textbox(
|
| 417 |
+
label="Biomarkers",
|
| 418 |
+
placeholder=(
|
| 419 |
+
"Enter biomarkers in any format:\n"
|
| 420 |
+
"• Glucose: 140, HbA1c: 7.5, Cholesterol: 210\n"
|
| 421 |
+
"• My glucose is 140 and HbA1c is 7.5\n"
|
| 422 |
+
'• {"Glucose": 140, "HbA1c": 7.5}'
|
| 423 |
+
),
|
| 424 |
+
lines=5,
|
| 425 |
+
max_lines=10,
|
| 426 |
+
)
|
| 427 |
+
|
| 428 |
+
with gr.Row():
|
| 429 |
+
analyze_btn = gr.Button("🔬 Analyze", variant="primary", size="lg")
|
| 430 |
+
clear_btn = gr.Button("🗑️ Clear", size="lg")
|
| 431 |
+
|
| 432 |
+
status_output = gr.Markdown(
|
| 433 |
+
label="Status",
|
| 434 |
+
elem_classes="status-box"
|
| 435 |
+
)
|
| 436 |
+
|
| 437 |
+
# Example inputs
|
| 438 |
+
gr.Markdown("### 📋 Example Inputs")
|
| 439 |
+
|
| 440 |
+
examples = gr.Examples(
|
| 441 |
+
examples=[
|
| 442 |
+
["Glucose: 185, HbA1c: 8.2, Cholesterol: 245, LDL: 165"],
|
| 443 |
+
["Glucose: 95, HbA1c: 5.4, Cholesterol: 180, HDL: 55, LDL: 100"],
|
| 444 |
+
["Hemoglobin: 9.5, Iron: 40, Ferritin: 15"],
|
| 445 |
+
["TSH: 8.5, T4: 4.0, T3: 80"],
|
| 446 |
+
['{"Glucose": 140, "HbA1c": 7.0, "Triglycerides": 250}'],
|
| 447 |
+
],
|
| 448 |
+
inputs=input_text,
|
| 449 |
+
label="Click an example to load it",
|
| 450 |
+
)
|
| 451 |
+
|
| 452 |
+
# Output column
|
| 453 |
+
with gr.Column(scale=2):
|
| 454 |
+
gr.Markdown("### 📊 Analysis Results")
|
| 455 |
+
|
| 456 |
+
with gr.Tabs():
|
| 457 |
+
with gr.Tab("Summary"):
|
| 458 |
+
summary_output = gr.Markdown(
|
| 459 |
+
label="Analysis Summary",
|
| 460 |
+
value="*Enter biomarkers and click Analyze to see results*"
|
| 461 |
+
)
|
| 462 |
+
|
| 463 |
+
with gr.Tab("Detailed JSON"):
|
| 464 |
+
details_output = gr.Code(
|
| 465 |
+
label="Full Response",
|
| 466 |
+
language="json",
|
| 467 |
+
lines=25,
|
| 468 |
+
)
|
| 469 |
+
|
| 470 |
+
# Event handlers
|
| 471 |
+
analyze_btn.click(
|
| 472 |
+
fn=analyze_biomarkers,
|
| 473 |
+
inputs=[input_text],
|
| 474 |
+
outputs=[summary_output, details_output, status_output],
|
| 475 |
+
show_progress="full",
|
| 476 |
+
)
|
| 477 |
+
|
| 478 |
+
clear_btn.click(
|
| 479 |
+
fn=lambda: ("", "", "", ""),
|
| 480 |
+
outputs=[input_text, summary_output, details_output, status_output],
|
| 481 |
+
)
|
| 482 |
+
|
| 483 |
+
# Footer
|
| 484 |
+
gr.Markdown("""
|
| 485 |
+
---
|
| 486 |
+
|
| 487 |
+
### ℹ️ About MediGuard AI
|
| 488 |
+
|
| 489 |
+
MediGuard AI uses a **Clinical Insight Guild** of 6 specialized AI agents:
|
| 490 |
+
|
| 491 |
+
| Agent | Role |
|
| 492 |
+
|-------|------|
|
| 493 |
+
| 🔬 Biomarker Analyzer | Validates and flags abnormal values |
|
| 494 |
+
| 📚 Disease Explainer | RAG-powered pathophysiology explanations |
|
| 495 |
+
| 🔗 Biomarker Linker | Connects biomarkers to disease predictions |
|
| 496 |
+
| 📋 Clinical Guidelines | Evidence-based recommendations from medical literature |
|
| 497 |
+
| ✅ Confidence Assessor | Evaluates reliability of findings |
|
| 498 |
+
| 📝 Response Synthesizer | Compiles comprehensive patient-friendly output |
|
| 499 |
+
|
| 500 |
+
**Data Sources**: 750+ pages of clinical guidelines (FAISS vector store)
|
| 501 |
+
|
| 502 |
+
---
|
| 503 |
+
|
| 504 |
+
⚠️ **Medical Disclaimer**: This tool is for **informational purposes only** and does not
|
| 505 |
+
replace professional medical advice, diagnosis, or treatment. Always consult a qualified
|
| 506 |
+
healthcare provider with questions regarding a medical condition.
|
| 507 |
+
|
| 508 |
+
---
|
| 509 |
+
|
| 510 |
+
Built with ❤️ using [LangGraph](https://langchain-ai.github.io/langgraph/),
|
| 511 |
+
[FAISS](https://faiss.ai/), and [Gradio](https://gradio.app/)
|
| 512 |
+
""")
|
| 513 |
+
|
| 514 |
+
return demo
|
| 515 |
+
|
| 516 |
+
|
| 517 |
+
# ---------------------------------------------------------------------------
|
| 518 |
+
# Main Entry Point
|
| 519 |
+
# ---------------------------------------------------------------------------
|
| 520 |
+
|
| 521 |
+
if __name__ == "__main__":
|
| 522 |
+
logger.info("Starting MediGuard AI Gradio App...")
|
| 523 |
+
|
| 524 |
+
demo = create_demo()
|
| 525 |
+
|
| 526 |
+
# Launch with HF Spaces compatible settings
|
| 527 |
+
demo.launch(
|
| 528 |
+
server_name="0.0.0.0",
|
| 529 |
+
server_port=7860,
|
| 530 |
+
show_error=True,
|
| 531 |
+
# share=False on HF Spaces
|
| 532 |
+
)
|
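The rule-based `auto_predict` fallback above reduces to simple threshold checks. A condensed, standalone restatement of its diabetes branch — thresholds copied from the code for illustration only, not medical guidance:

```python
def auto_predict(biomarkers: dict[str, float]) -> dict:
    """Rule-based disease guess from biomarker thresholds (illustrative)."""
    normalized = {k.lower().replace(" ", ""): v for k, v in biomarkers.items()}
    glucose = normalized.get("glucose", 0)
    hba1c = normalized.get("hba1c", 0)

    # ADA-style cutoffs used by the app: HbA1c >= 6.5% or fasting glucose >= 126 mg/dL
    if hba1c >= 6.5 or glucose >= 126:
        return {
            "disease": "Diabetes",
            "confidence": min(0.95, 0.7 + (hba1c - 6.5) * 0.1) if hba1c else 0.85,
            "severity": "high" if hba1c >= 8 or glucose >= 200 else "moderate",
        }
    return {"disease": "General Health Screening", "confidence": 0.70, "severity": "low"}


result = auto_predict({"Glucose": 185, "HbA1c": 8.2})
# disease "Diabetes", severity "high"
```

The confidence scales linearly with how far HbA1c exceeds 6.5, capped at 0.95; the full version adds analogous branches for lipids, anemia, and thyroid markers.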
huggingface/requirements.txt
ADDED
|
@@ -0,0 +1,38 @@
|
| 1 |
+
# ===========================================================================
|
| 2 |
+
# MediGuard AI — Hugging Face Spaces Dependencies
|
| 3 |
+
# ===========================================================================
|
| 4 |
+
# Minimal dependencies for standalone Gradio deployment.
|
| 5 |
+
# No postgres, redis, opensearch, ollama required.
|
| 6 |
+
# ===========================================================================
|
| 7 |
+
|
| 8 |
+
# --- Gradio UI ---
|
| 9 |
+
gradio>=5.0.0
|
| 10 |
+
|
| 11 |
+
# --- LangChain Core ---
|
| 12 |
+
langchain>=0.3.0
|
| 13 |
+
langchain-community>=0.3.0
|
| 14 |
+
langgraph>=0.2.0
|
| 15 |
+
|
| 16 |
+
# --- Cloud LLM Providers (FREE tiers) ---
|
| 17 |
+
langchain-groq>=0.2.0
|
| 18 |
+
langchain-google-genai>=2.0.0
|
| 19 |
+
|
| 20 |
+
# --- Vector Store ---
|
| 21 |
+
faiss-cpu>=1.8.0
|
| 22 |
+
|
| 23 |
+
# --- Embeddings ---
|
| 24 |
+
sentence-transformers>=3.0.0
|
| 25 |
+
|
| 26 |
+
# --- Document Processing ---
|
| 27 |
+
pypdf>=4.0.0
|
| 28 |
+
|
| 29 |
+
# --- Pydantic ---
|
| 30 |
+
pydantic>=2.9.0
|
| 31 |
+
pydantic-settings>=2.5.0
|
| 32 |
+
|
| 33 |
+
# --- HTTP Client ---
|
| 34 |
+
httpx>=0.27.0
|
| 35 |
+
|
| 36 |
+
# --- Utilities ---
|
| 37 |
+
python-dotenv>=1.0.0
|
| 38 |
+
numpy<2.0.0
|
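After `pip install -r requirements.txt`, a quick availability check can catch a broken environment before the Space build fails at import time. This sketch uses only the standard library; note that some import names differ from the package names pinned above (`faiss-cpu` imports as `faiss`, `python-dotenv` as `dotenv`):

```python
from importlib.util import find_spec

# Import names corresponding to the pinned packages (faiss-cpu -> faiss,
# python-dotenv -> dotenv; the rest import under their package name).
REQUIRED_MODULES = [
    "gradio", "langchain", "langgraph", "faiss",
    "sentence_transformers", "pypdf", "pydantic", "httpx", "dotenv", "numpy",
]


def missing_modules(modules=REQUIRED_MODULES) -> list[str]:
    """Return the subset of modules that cannot be found on sys.path.

    find_spec() locates a module without importing it, so this is cheap
    and has no side effects.
    """
    return [m for m in modules if find_spec(m) is None]
```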
scripts/deploy_huggingface.ps1
ADDED
@@ -0,0 +1,139 @@
<#
.SYNOPSIS
    Deploy MediGuard AI to Hugging Face Spaces
.DESCRIPTION
    This script automates the deployment of MediGuard AI to Hugging Face Spaces.
    It handles copying files, setting up the Dockerfile, and pushing to the Space.
.PARAMETER SpaceName
    Name of your Hugging Face Space (e.g., "mediguard-ai")
.PARAMETER Username
    Your Hugging Face username
.PARAMETER SkipClone
    Skip cloning if you've already cloned the Space
.EXAMPLE
    .\deploy_huggingface.ps1 -Username "your-username" -SpaceName "mediguard-ai"
#>

param(
    [Parameter(Mandatory=$true)]
    [string]$Username,

    [Parameter(Mandatory=$false)]
    [string]$SpaceName = "mediguard-ai",

    [switch]$SkipClone
)

$ErrorActionPreference = "Stop"

Write-Host "========================================" -ForegroundColor Cyan
Write-Host " MediGuard AI - Hugging Face Deployment" -ForegroundColor Cyan
Write-Host "========================================" -ForegroundColor Cyan
Write-Host ""

# Configuration
$ProjectRoot = Split-Path -Parent $PSScriptRoot
$DeployDir = Join-Path $ProjectRoot "hf-deploy"
$SpaceUrl = "https://huggingface.co/spaces/$Username/$SpaceName"

Write-Host "Project Root: $ProjectRoot" -ForegroundColor Gray
Write-Host "Deploy Dir:   $DeployDir" -ForegroundColor Gray
Write-Host "Space URL:    $SpaceUrl" -ForegroundColor Gray
Write-Host ""

# Step 1: Clone or use existing Space
if (-not $SkipClone) {
    Write-Host "[1/6] Cloning Hugging Face Space..." -ForegroundColor Yellow

    if (Test-Path $DeployDir) {
        Write-Host "  Removing existing deploy directory..." -ForegroundColor Gray
        Remove-Item -Recurse -Force $DeployDir
    }

    git clone "https://huggingface.co/spaces/$Username/$SpaceName" $DeployDir

    if ($LASTEXITCODE -ne 0) {
        Write-Host "ERROR: Failed to clone Space. Make sure it exists!" -ForegroundColor Red
        Write-Host "Create it at: https://huggingface.co/new-space" -ForegroundColor Yellow
        exit 1
    }
} else {
    Write-Host "[1/6] Using existing deploy directory..." -ForegroundColor Yellow
}

# Step 2: Copy project files
Write-Host "[2/6] Copying project files..." -ForegroundColor Yellow

# Core directories
$CoreDirs = @("src", "config", "data", "huggingface")
foreach ($dir in $CoreDirs) {
    $source = Join-Path $ProjectRoot $dir
    $dest = Join-Path $DeployDir $dir
    if (Test-Path $source) {
        Write-Host "  Copying $dir..." -ForegroundColor Gray
        Copy-Item -Path $source -Destination $dest -Recurse -Force
    }
}

# Copy specific files
$CoreFiles = @("pyproject.toml", ".dockerignore")
foreach ($file in $CoreFiles) {
    $source = Join-Path $ProjectRoot $file
    if (Test-Path $source) {
        Write-Host "  Copying $file..." -ForegroundColor Gray
        Copy-Item -Path $source -Destination (Join-Path $DeployDir $file) -Force
    }
}

# Step 3: Set up Dockerfile (HF Spaces expects it in root)
Write-Host "[3/6] Setting up Dockerfile..." -ForegroundColor Yellow
$HfDockerfile = Join-Path $DeployDir "huggingface/Dockerfile"
$RootDockerfile = Join-Path $DeployDir "Dockerfile"
Copy-Item -Path $HfDockerfile -Destination $RootDockerfile -Force
Write-Host "  Copied huggingface/Dockerfile to Dockerfile" -ForegroundColor Gray

# Step 4: Set up README with HF metadata
Write-Host "[4/6] Setting up README.md..." -ForegroundColor Yellow
$HfReadme = Join-Path $DeployDir "huggingface/README.md"
$RootReadme = Join-Path $DeployDir "README.md"
Copy-Item -Path $HfReadme -Destination $RootReadme -Force
Write-Host "  Copied huggingface/README.md to README.md" -ForegroundColor Gray

# Step 5: Verify vector store exists
Write-Host "[5/6] Verifying vector store..." -ForegroundColor Yellow
$VectorStore = Join-Path $DeployDir "data/vector_stores/medical_knowledge.faiss"
if (Test-Path $VectorStore) {
    $size = (Get-Item $VectorStore).Length / 1MB
    Write-Host "  Vector store found: $([math]::Round($size, 2)) MB" -ForegroundColor Green
} else {
    Write-Host "  WARNING: Vector store not found!" -ForegroundColor Red
    Write-Host "  Run 'python scripts/setup_embeddings.py' first to create it." -ForegroundColor Yellow
}

# Step 6: Commit and push
Write-Host "[6/6] Committing and pushing to Hugging Face..." -ForegroundColor Yellow

Push-Location $DeployDir

git add .
git commit -m "Deploy MediGuard AI - $(Get-Date -Format 'yyyy-MM-dd HH:mm')"

Write-Host ""
Write-Host "Ready to push! Run the following command:" -ForegroundColor Green
Write-Host ""
Write-Host "  cd $DeployDir" -ForegroundColor Cyan
Write-Host "  git push" -ForegroundColor Cyan
Write-Host ""
Write-Host "After pushing, add your API key as a Secret in Space Settings:" -ForegroundColor Yellow
Write-Host "  Name:  GROQ_API_KEY (or GOOGLE_API_KEY)" -ForegroundColor Gray
Write-Host "  Value: your-api-key" -ForegroundColor Gray
Write-Host ""
Write-Host "Your Space will be live at:" -ForegroundColor Green
Write-Host "  $SpaceUrl" -ForegroundColor Cyan

Pop-Location

Write-Host ""
Write-Host "========================================" -ForegroundColor Cyan
Write-Host " Deployment prepared successfully!" -ForegroundColor Green
Write-Host "========================================" -ForegroundColor Cyan
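The deploy script's final instructions say to add `GROQ_API_KEY` (or `GOOGLE_API_KEY`) as a Secret in the Space settings; Spaces exposes such secrets to the container as environment variables. A hedged sketch of how the app might pick a provider from whichever key is present — the `pick_llm_provider` function and its return labels are illustrative, not the app's actual code, though the Groq-first preference matches `LLM__PRIMARY_PROVIDER=groq` in `.env.example`:

```python
import os


def pick_llm_provider(env=os.environ):
    """Return which LLM provider to use based on available API keys.

    Illustrative only: mirrors the deploy script's hint that either
    GROQ_API_KEY or GOOGLE_API_KEY can be set as a Space Secret, with
    Groq preferred as the primary provider.
    """
    if env.get("GROQ_API_KEY"):
        return "groq"
    if env.get("GOOGLE_API_KEY"):
        return "gemini"
    return None  # no key configured; the UI should surface a clear error
```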