Nikhil Pravin Pise committed
Commit 9699bea · 1 parent: 1e732dd

Deploy to HuggingFace Spaces - Medical RAG with vector store
.env.example CHANGED
@@ -32,9 +32,9 @@ OLLAMA__MODEL=llama3.2
 
 # --- LLM (Groq / Gemini — existing providers) ---
 LLM__PRIMARY_PROVIDER=groq
-LLM__GROQ_API_KEY=gsk_[REDACTED]
+LLM__GROQ_API_KEY=
 LLM__GROQ_MODEL=llama-3.3-70b-versatile
-LLM__GEMINI_API_KEY=AIza[REDACTED]
+LLM__GEMINI_API_KEY=
 LLM__GEMINI_MODEL=gemini-2.0-flash
 
 # --- Embeddings ---
.gitattributes ADDED
@@ -0,0 +1,2 @@
+*.faiss filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
.gitignore CHANGED
@@ -221,10 +221,13 @@ $RECYCLE.BIN/
 # Project Specific
 # ==============================================================================
 # Vector stores (large files, regenerate locally)
+# BUT allow medical_knowledge for HuggingFace deployment
 data/vector_stores/*.faiss
 data/vector_stores/*.pkl
-*.faiss
-*.pkl
+!data/vector_stores/medical_knowledge.faiss
+!data/vector_stores/medical_knowledge.pkl
+# *.faiss  # Commented out to allow medical_knowledge
+# *.pkl    # Commented out to allow medical_knowledge
 
 # Medical PDFs (proprietary/large)
 data/medical_pdfs/*.pdf
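The negation rules in this hunk can be sanity-checked in a throwaway repository. A quick sketch (assumes `git` is on `PATH`; the file names mirror the patterns above, `other.faiss` is illustrative):

```shell
# Recreate the relevant .gitignore rules in a scratch repo and confirm
# that generic *.faiss files stay ignored while the whitelisted store
# remains trackable.
tmp=$(mktemp -d)
cd "$tmp" && git init -q
mkdir -p data/vector_stores
printf '%s\n' \
  'data/vector_stores/*.faiss' \
  '!data/vector_stores/medical_knowledge.faiss' > .gitignore
touch data/vector_stores/other.faiss data/vector_stores/medical_knowledge.faiss
# check-ignore exits 0 (and prints the path) only for ignored files
git check-ignore -q data/vector_stores/other.faiss && echo "other.faiss: ignored"
git check-ignore -q data/vector_stores/medical_knowledge.faiss || echo "medical_knowledge.faiss: trackable"
```

Note that the negation only works because the broad `*.faiss` / `*.pkl` patterns were commented out; a `!` rule cannot re-include a file whose parent directory is excluded.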
DEPLOY_HUGGINGFACE.md ADDED
@@ -0,0 +1,203 @@
+# 🚀 Deploy MediGuard AI to Hugging Face Spaces
+
+This guide walks you through deploying MediGuard AI to Hugging Face Spaces using Docker.
+
+## Prerequisites
+
+1. **Hugging Face Account** — [Sign up free](https://huggingface.co/join)
+2. **Git** — Installed on your machine
+3. **API Key** — Either:
+   - **Groq** (recommended) — [Get free key](https://console.groq.com/keys)
+   - **Google Gemini** — [Get free key](https://aistudio.google.com/app/apikey)
+
+## Step 1: Create a New Space
+
+1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
+2. Fill in:
+   - **Space name**: `mediguard-ai` (or your choice)
+   - **License**: MIT
+   - **SDK**: Select **Docker**
+   - **Hardware**: **CPU Basic** (the free tier works!)
+3. Click **Create Space**
+
+## Step 2: Clone Your Space
+
+```bash
+# Clone the empty space
+git clone https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai
+cd mediguard-ai
+```
+
+## Step 3: Copy Project Files
+
+Copy all files from this repository to your space folder:
+
+```bash
+# Option A: If you have the RagBot repo locally
+cp -r /path/to/RagBot/* .
+
+# Option B: Clone fresh
+git clone https://github.com/yourusername/ragbot temp
+cp -r temp/* .
+rm -rf temp
+```
+
+## Step 4: Set Up Dockerfile for Spaces
+
+Hugging Face Spaces expects the Dockerfile in the root. Copy the HF-optimized Dockerfile:
+
+```bash
+# Copy the HF Spaces Dockerfile to root
+cp huggingface/Dockerfile ./Dockerfile
+```
+
+**Or** update your root `Dockerfile` to match the HF Spaces version.
+
+## Step 5: Set Up README (Important!)
+
+The README.md must have the HF Spaces metadata header. Copy the HF README:
+
+```bash
+# Back up the original README
+mv README.md README_original.md
+
+# Use the HF Spaces README
+cp huggingface/README.md ./README.md
+```
+
+## Step 6: Add Your API Key (Secret)
+
+1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai`
+2. Click the **Settings** tab
+3. Scroll to **Repository Secrets**
+4. Add a new secret:
+   - **Name**: `GROQ_API_KEY` (or `GOOGLE_API_KEY`)
+   - **Value**: Your API key
+5. Click **Add**
+
+## Step 7: Push to Deploy
+
+```bash
+# Add all files
+git add .
+
+# Commit
+git commit -m "Deploy MediGuard AI"
+
+# Push to Hugging Face
+git push
+```
+
+## Step 8: Monitor Deployment
+
+1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai`
+2. Click the **Logs** tab to watch the build
+3. The build takes ~5-10 minutes the first time
+4. Once the status shows "Running", your app is live! 🎉
+
+## 🔧 Troubleshooting
+
+### "No LLM API key configured"
+
+- Make sure you added `GROQ_API_KEY` or `GOOGLE_API_KEY` in Space Settings → Secrets
+- Secret names are case-sensitive
+
+### Build fails with "No space disk"
+
+- The Hugging Face free tier has limited disk space
+- The FAISS vector store might be too large
+- Solution: Upgrade to a paid tier or reduce the vector store size
+
+### "ModuleNotFoundError"
+
+- Check that all dependencies are in `huggingface/requirements.txt`
+- The Dockerfile should install from this file
+
+### App crashes on startup
+
+- Check the Logs for the actual error
+- Common issue: missing environment variables
+- Increase the Space hardware if you hit an OOM error
+
+## 📁 File Structure for Deployment
+
+Your Space should have this structure:
+
+```
+your-space/
+├── Dockerfile              # HF Spaces Dockerfile (from huggingface/)
+├── README.md               # HF Spaces README with metadata
+├── huggingface/
+│   ├── app.py              # Standalone Gradio app
+│   ├── requirements.txt    # Minimal deps for HF
+│   └── README.md           # Original HF README
+├── src/                    # Core application code
+│   ├── workflow.py
+│   ├── state.py
+│   ├── llm_config.py
+│   ├── pdf_processor.py
+│   ├── agents/
+│   └── ...
+├── data/
+│   └── vector_stores/
+│       ├── medical_knowledge.faiss
+│       └── medical_knowledge.pkl
+└── config/
+    └── biomarker_references.json
+```
+
+## 🔄 Updating Your Space
+
+To update after making changes:
+
+```bash
+git add .
+git commit -m "Update: description of changes"
+git push
+```
+
+Hugging Face will automatically rebuild and redeploy.
+
+## 💰 Hardware Options
+
+| Tier | RAM | vCPU | Cost | Best For |
+|------|-----|------|------|----------|
+| CPU Basic | 2GB | 2 | Free | Demo/Testing |
+| CPU Upgrade | 8GB | 4 | ~$0.03/hr | Production |
+| T4 Small | 16GB | 4 | ~$0.06/hr | Heavy usage |
+
+The free tier works for demos. Upgrade if you experience timeouts.
+
+## 🎉 Your Space is Live!
+
+Once deployed, share your Space URL:
+
+```
+https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai
+```
+
+Anyone can now use MediGuard AI without any setup!
+
+---
+
+## Quick Commands Reference
+
+```bash
+# Clone your space
+git clone https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai
+
+# Set up the remote (if needed)
+git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai
+
+# Push changes
+git push origin main
+
+# Force a rebuild (if stuck)
+# Go to Settings → Factory Reset
+```
+
+## Need Help?
+
+- [Hugging Face Spaces Docs](https://huggingface.co/docs/hub/spaces)
+- [Docker on Spaces](https://huggingface.co/docs/hub/spaces-sdks-docker)
+- [Spaces Secrets](https://huggingface.co/docs/hub/spaces-secrets)
Dockerfile CHANGED
@@ -1,19 +1,27 @@
 # ===========================================================================
-# MediGuard AI — Multi-stage Dockerfile
+# MediGuard AI — Hugging Face Spaces Dockerfile
 # ===========================================================================
-# Build stages:
-#   base       — Python + system deps
-#   production — slim runtime image
+# Optimized single-container deployment for Hugging Face Spaces.
+# Uses FAISS vector store + Cloud LLMs (Groq/Gemini) - no external services.
 # ===========================================================================
 
-# ---------------------------------------------------------------------------
-# Stage 1: base
-# ---------------------------------------------------------------------------
-FROM python:3.11-slim AS base
+FROM python:3.11-slim
+
+# Non-interactive apt
+ENV DEBIAN_FRONTEND=noninteractive
 
+# Python settings
 ENV PYTHONDONTWRITEBYTECODE=1 \
     PYTHONUNBUFFERED=1 \
-    PIP_NO_CACHE_DIR=1
+    PIP_NO_CACHE_DIR=1 \
+    PIP_DISABLE_PIP_VERSION_CHECK=1
+
+# HuggingFace Spaces runs on port 7860
+ENV GRADIO_SERVER_NAME="0.0.0.0" \
+    GRADIO_SERVER_PORT=7860
+
+# Default to HuggingFace embeddings (local, no API key needed)
+ENV EMBEDDING_PROVIDER=huggingface
 
 WORKDIR /app
 
@@ -22,45 +30,37 @@ RUN apt-get update && \
     apt-get install -y --no-install-recommends \
     build-essential \
     curl \
+    git \
     && rm -rf /var/lib/apt/lists/*
 
-# Install Python dependencies
-COPY pyproject.toml ./
+# Copy requirements first (cache layer)
+COPY huggingface/requirements.txt ./requirements.txt
 RUN pip install --upgrade pip && \
-    pip install ".[all]"
-
-# ---------------------------------------------------------------------------
-# Stage 2: production
-# ---------------------------------------------------------------------------
-FROM python:3.11-slim AS production
-
-ENV PYTHONDONTWRITEBYTECODE=1 \
-    PYTHONUNBUFFERED=1
+    pip install -r requirements.txt
 
-WORKDIR /app
+# Copy the entire project
+COPY . .
 
-# Copy installed packages from base
-COPY --from=base /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
-COPY --from=base /usr/local/bin /usr/local/bin
+# Create necessary directories and ensure vector store exists
+RUN mkdir -p data/medical_pdfs data/vector_stores data/chat_reports
 
-# Copy application code
-COPY . .
+# Create non-root user (HF Spaces requirement)
+RUN useradd -m -u 1000 user
 
-# Runtime dependencies only
-RUN apt-get update && \
-    apt-get install -y --no-install-recommends curl && \
-    rm -rf /var/lib/apt/lists/*
+# Make app writable by user
+RUN chown -R user:user /app
 
-# Create non-root user
-RUN groupadd -r mediguard && \
-    useradd -r -g mediguard -d /app -s /sbin/nologin mediguard && \
-    chown -R mediguard:mediguard /app
+USER user
+ENV HOME=/home/user \
+    PATH=/home/user/.local/bin:$PATH
 
-USER mediguard
+WORKDIR /app
 
-EXPOSE 8000
+EXPOSE 7860
 
-HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
-    CMD curl -sf http://localhost:8000/health || exit 1
+# Health check
+HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
+    CMD curl -sf http://localhost:7860/ || exit 1
 
-CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "2"]
+# Launch Gradio app
+CMD ["python", "huggingface/app.py"]
README.md CHANGED
@@ -1,3 +1,22 @@
+---
+title: Agentic RagBot
+emoji: 🏥
+colorFrom: blue
+colorTo: indigo
+sdk: docker
+pinned: true
+license: mit
+app_port: 7860
+tags:
+  - medical
+  - biomarker
+  - rag
+  - healthcare
+  - langgraph
+  - agents
+short_description: Multi-Agent RAG System for Medical Biomarker Analysis
+---
+
 # RagBot: Multi-Agent RAG System for Medical Biomarker Analysis
 
 A production-ready biomarker analysis system combining 6 specialized AI agents with medical knowledge retrieval to provide evidence-based insights on blood test results in **15-25 seconds**.
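The metadata block prepended in this diff is plain YAML front matter, which Spaces reads to configure the deployment (e.g. `sdk: docker`, `app_port: 7860`). A field can be pulled out with standard text tools; an illustrative sketch using a trimmed-down copy of the header:

```shell
# Write a minimal front-matter sample, then print everything between the
# first pair of '---' markers and grep out one key.
cat > /tmp/readme_demo.md <<'EOF'
---
title: Agentic RagBot
sdk: docker
app_port: 7860
---
# RagBot
EOF
awk '/^---$/{n++; next} n==1' /tmp/readme_demo.md | grep '^app_port:'
# -> app_port: 7860
```

If the front matter is missing or malformed, Spaces cannot tell this is a Docker Space, which is why the guide treats copying the HF README as a required step.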
alembic.ini ADDED
@@ -0,0 +1,149 @@
+# A generic, single database configuration.
+
+[alembic]
+# path to migration scripts.
+# this is typically a path given in POSIX (e.g. forward slashes)
+# format, relative to the token %(here)s which refers to the location of this
+# ini file
+script_location = %(here)s/alembic
+
+# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
+# Uncomment the line below if you want the files to be prepended with date and time
+# see https://alembic.sqlalchemy.org/en/latest/tutorial.html#editing-the-ini-file
+# for all available tokens
+# file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(hour).2d%%(minute).2d-%%(rev)s_%%(slug)s
+# Or organize into date-based subdirectories (requires recursive_version_locations = true)
+# file_template = %%(year)d/%%(month).2d/%%(day).2d_%%(hour).2d%%(minute).2d_%%(second).2d_%%(rev)s_%%(slug)s
+
+# sys.path path, will be prepended to sys.path if present.
+# defaults to the current working directory. for multiple paths, the path separator
+# is defined by "path_separator" below.
+prepend_sys_path = .
+
+
+# timezone to use when rendering the date within the migration file
+# as well as the filename.
+# If specified, requires the tzdata library which can be installed by adding
+# `alembic[tz]` to the pip requirements.
+# string value is passed to ZoneInfo()
+# leave blank for localtime
+# timezone =
+
+# max length of characters to apply to the "slug" field
+# truncate_slug_length = 40
+
+# set to 'true' to run the environment during
+# the 'revision' command, regardless of autogenerate
+# revision_environment = false
+
+# set to 'true' to allow .pyc and .pyo files without
+# a source .py file to be detected as revisions in the
+# versions/ directory
+# sourceless = false
+
+# version location specification; This defaults
+# to <script_location>/versions. When using multiple version
+# directories, initial revisions must be specified with --version-path.
+# The path separator used here should be the separator specified by "path_separator"
+# below.
+# version_locations = %(here)s/bar:%(here)s/bat:%(here)s/alembic/versions
+
+# path_separator; This indicates what character is used to split lists of file
+# paths, including version_locations and prepend_sys_path within configparser
+# files such as alembic.ini.
+# The default rendered in new alembic.ini files is "os", which uses os.pathsep
+# to provide os-dependent path splitting.
+#
+# Note that in order to support legacy alembic.ini files, this default does NOT
+# take place if path_separator is not present in alembic.ini. If this
+# option is omitted entirely, fallback logic is as follows:
+#
+# 1. Parsing of the version_locations option falls back to using the legacy
+#    "version_path_separator" key, which if absent then falls back to the legacy
+#    behavior of splitting on spaces and/or commas.
+# 2. Parsing of the prepend_sys_path option falls back to the legacy
+#    behavior of splitting on spaces, commas, or colons.
+#
+# Valid values for path_separator are:
+#
+# path_separator = :
+# path_separator = ;
+# path_separator = space
+# path_separator = newline
+#
+# Use os.pathsep. Default configuration used for new projects.
+path_separator = os
+
+# set to 'true' to search source files recursively
+# in each "version_locations" directory
+# new in Alembic version 1.10
+# recursive_version_locations = false
+
+# the output encoding used when revision files
+# are written from script.py.mako
+# output_encoding = utf-8
+
+# database URL. This is consumed by the user-maintained env.py script only.
+# other means of configuring database URLs may be customized within the env.py
+# file.
+sqlalchemy.url = driver://user:pass@localhost/dbname
+
+
+[post_write_hooks]
+# post_write_hooks defines scripts or Python functions that are run
+# on newly generated revision scripts. See the documentation for further
+# detail and examples
+
+# format using "black" - use the console_scripts runner, against the "black" entrypoint
+# hooks = black
+# black.type = console_scripts
+# black.entrypoint = black
+# black.options = -l 79 REVISION_SCRIPT_FILENAME
+
+# lint with attempts to fix using "ruff" - use the module runner, against the "ruff" module
+# hooks = ruff
+# ruff.type = module
+# ruff.module = ruff
+# ruff.options = check --fix REVISION_SCRIPT_FILENAME
+
+# Alternatively, use the exec runner to execute a binary found on your PATH
+# hooks = ruff
+# ruff.type = exec
+# ruff.executable = ruff
+# ruff.options = check --fix REVISION_SCRIPT_FILENAME
+
+# Logging configuration. This is also consumed by the user-maintained
+# env.py script only.
+[loggers]
+keys = root,sqlalchemy,alembic
+
+[handlers]
+keys = console
+
+[formatters]
+keys = generic
+
+[logger_root]
+level = WARNING
+handlers = console
+qualname =
+
+[logger_sqlalchemy]
+level = WARNING
+handlers =
+qualname = sqlalchemy.engine
+
+[logger_alembic]
+level = INFO
+handlers =
+qualname = alembic
+
+[handler_console]
+class = StreamHandler
+args = (sys.stderr,)
+level = NOTSET
+formatter = generic
+
+[formatter_generic]
+format = %(levelname)-5.5s [%(name)s] %(message)s
+datefmt = %H:%M:%S
alembic/README ADDED
@@ -0,0 +1 @@
+Generic single-database configuration.
alembic/env.py ADDED
@@ -0,0 +1,95 @@
+from logging.config import fileConfig
+
+from sqlalchemy import engine_from_config
+from sqlalchemy import pool, create_engine
+
+from alembic import context
+
+# ---------------------------------------------------------------------------
+# MediGuard AI — Alembic env.py
+# Pull DB URL from settings so we never hard-code credentials.
+# ---------------------------------------------------------------------------
+import sys
+import os
+
+# Make sure the project root is on sys.path
+sys.path.insert(0, os.path.dirname(os.path.dirname(__file__)))
+
+from src.settings import get_settings  # noqa: E402
+from src.database import Base  # noqa: E402
+
+# Import all models so Alembic's autogenerate can see them
+import src.models.analysis  # noqa: F401, E402
+
+# this is the Alembic Config object, which provides
+# access to the values within the .ini file in use.
+config = context.config
+
+# Interpret the config file for Python logging.
+# This line sets up loggers basically.
+if config.config_file_name is not None:
+    fileConfig(config.config_file_name)
+
+# Override sqlalchemy.url from our Pydantic Settings
+_settings = get_settings()
+config.set_main_option("sqlalchemy.url", _settings.postgres.database_url)
+
+# Metadata used for autogenerate
+target_metadata = Base.metadata
+
+# other values from the config, defined by the needs of env.py,
+# can be acquired:
+# my_important_option = config.get_main_option("my_important_option")
+# ... etc.
+
+
+def run_migrations_offline() -> None:
+    """Run migrations in 'offline' mode.
+
+    This configures the context with just a URL
+    and not an Engine, though an Engine is acceptable
+    here as well.  By skipping the Engine creation
+    we don't even need a DBAPI to be available.
+
+    Calls to context.execute() here emit the given string to the
+    script output.
+
+    """
+    url = config.get_main_option("sqlalchemy.url")
+    context.configure(
+        url=url,
+        target_metadata=target_metadata,
+        literal_binds=True,
+        dialect_opts={"paramstyle": "named"},
+    )
+
+    with context.begin_transaction():
+        context.run_migrations()
+
+
+def run_migrations_online() -> None:
+    """Run migrations in 'online' mode.
+
+    In this scenario we need to create an Engine
+    and associate a connection with the context.
+
+    """
+    connectable = engine_from_config(
+        config.get_section(config.config_ini_section, {}),
+        prefix="sqlalchemy.",
+        poolclass=pool.NullPool,
+    )
+
+    with connectable.connect() as connection:
+        context.configure(
+            connection=connection, target_metadata=target_metadata
+        )
+
+        with context.begin_transaction():
+            context.run_migrations()
+
+
+if context.is_offline_mode():
+    run_migrations_offline()
+else:
+    run_migrations_online()
alembic/script.py.mako ADDED
@@ -0,0 +1,28 @@
+"""${message}
+
+Revision ID: ${up_revision}
+Revises: ${down_revision | comma,n}
+Create Date: ${create_date}
+
+"""
+from typing import Sequence, Union
+
+from alembic import op
+import sqlalchemy as sa
+${imports if imports else ""}
+
+# revision identifiers, used by Alembic.
+revision: str = ${repr(up_revision)}
+down_revision: Union[str, Sequence[str], None] = ${repr(down_revision)}
+branch_labels: Union[str, Sequence[str], None] = ${repr(branch_labels)}
+depends_on: Union[str, Sequence[str], None] = ${repr(depends_on)}
+
+
+def upgrade() -> None:
+    """Upgrade schema."""
+    ${upgrades if upgrades else "pass"}
+
+
+def downgrade() -> None:
+    """Downgrade schema."""
+    ${downgrades if downgrades else "pass"}
data/vector_stores/medical_knowledge.faiss ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e9dee84846c00eda0f0a5487b61c2dd9cc85588ee0cbbcb576df24e8881969e1
+size 4007469
data/vector_stores/medical_knowledge.pkl ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:690fa693a48c3eb5e0a1fc11b7008a9037630928d9c8a634a31e7f90d8e2f7fb
+size 2727206
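The two files above are Git LFS pointer stubs, not the binary stores themselves; the real bytes live in LFS storage and are fetched on checkout. A pointer is a three-line text file whose `oid`/`size` fields can be read back with ordinary tools (sketch; the content mirrors the `.faiss` pointer in this diff):

```shell
# Write a pointer file identical in shape to the one committed above,
# then pull out the content hash and a human-readable size.
cat > /tmp/pointer.faiss <<'EOF'
version https://git-lfs.github.com/spec/v1
oid sha256:e9dee84846c00eda0f0a5487b61c2dd9cc85588ee0cbbcb576df24e8881969e1
size 4007469
EOF
awk '$1 == "oid"  { print "hash:", substr($2, 8) }
     $1 == "size" { printf "size: %.1f MB\n", $2 / 1048576 }' /tmp/pointer.faiss
# -> hash: e9dee84846c00eda0f0a5487b61c2dd9cc85588ee0cbbcb576df24e8881969e1
# -> size: 3.8 MB
```

This is why the `.gitattributes` entries earlier in the commit matter: without the `filter=lfs` rules, the ~4 MB FAISS index would be committed directly into git history.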
docker-compose.yml CHANGED
@@ -76,12 +76,13 @@ services:
     restart: unless-stopped
 
   opensearch:
-    image: opensearchproject/opensearch:2.19.0
+    image: opensearchproject/opensearch:2.11.1
    container_name: mediguard-opensearch
     environment:
       - discovery.type=single-node
       - DISABLE_SECURITY_PLUGIN=true
+      - plugins.security.disabled=true
-      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
+      - "OPENSEARCH_JAVA_OPTS=-Xms256m -Xmx256m"
       - bootstrap.memory_lock=true
     ulimits:
       memlock: { soft: -1, hard: -1 }
@@ -94,21 +95,22 @@ services:
       test: ["CMD-SHELL", "curl -sf http://localhost:9200/_cluster/health || exit 1"]
       interval: 10s
       timeout: 5s
-      retries: 20
+      retries: 24
     restart: unless-stopped
 
-  opensearch-dashboards:
-    image: opensearchproject/opensearch-dashboards:2.19.0
-    container_name: mediguard-os-dashboards
-    environment:
-      - OPENSEARCH_HOSTS=["http://opensearch:9200"]
-      - DISABLE_SECURITY_DASHBOARDS_PLUGIN=true
-    ports:
-      - "${OS_DASHBOARDS_PORT:-5601}:5601"
-    depends_on:
-      opensearch:
-        condition: service_healthy
-    restart: unless-stopped
+  # opensearch-dashboards: disabled by default — uncomment if you need the UI
+  # opensearch-dashboards:
+  #   image: opensearchproject/opensearch-dashboards:2.11.1
+  #   container_name: mediguard-os-dashboards
+  #   environment:
+  #     - OPENSEARCH_HOSTS=["http://opensearch:9200"]
+  #     - DISABLE_SECURITY_DASHBOARDS_PLUGIN=true
+  #   ports:
+  #     - "${OS_DASHBOARDS_PORT:-5601}:5601"
+  #   depends_on:
+  #     opensearch:
+  #       condition: service_healthy
+  #   restart: unless-stopped
 
   redis:
     image: redis:7-alpine
huggingface/Dockerfile ADDED
@@ -0,0 +1,66 @@
+# ===========================================================================
+# MediGuard AI — Hugging Face Spaces Dockerfile
+# ===========================================================================
+# Optimized single-container deployment for Hugging Face Spaces.
+# Uses FAISS vector store + Cloud LLMs (Groq/Gemini) - no external services.
+# ===========================================================================
+
+FROM python:3.11-slim
+
+# Non-interactive apt
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Python settings
+ENV PYTHONDONTWRITEBYTECODE=1 \
+    PYTHONUNBUFFERED=1 \
+    PIP_NO_CACHE_DIR=1 \
+    PIP_DISABLE_PIP_VERSION_CHECK=1
+
+# HuggingFace Spaces runs on port 7860
+ENV GRADIO_SERVER_NAME="0.0.0.0" \
+    GRADIO_SERVER_PORT=7860
+
+# Default to HuggingFace embeddings (local, no API key needed)
+ENV EMBEDDING_PROVIDER=huggingface
+
+WORKDIR /app
+
+# System dependencies
+RUN apt-get update && \
+    apt-get install -y --no-install-recommends \
+    build-essential \
+    curl \
+    git \
+    && rm -rf /var/lib/apt/lists/*
+
+# Copy requirements first (cache layer)
+COPY huggingface/requirements.txt ./requirements.txt
+RUN pip install --upgrade pip && \
+    pip install -r requirements.txt
+
+# Copy the entire project
+COPY . .
+
+# Create necessary directories and ensure vector store exists
+RUN mkdir -p data/medical_pdfs data/vector_stores data/chat_reports
+
+# Create non-root user (HF Spaces requirement)
+RUN useradd -m -u 1000 user
+
+# Make app writable by user
+RUN chown -R user:user /app
+
+USER user
+ENV HOME=/home/user \
+    PATH=/home/user/.local/bin:$PATH
+
+WORKDIR /app
+
+EXPOSE 7860
+
+# Health check
+HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
+    CMD curl -sf http://localhost:7860/ || exit 1
+
+# Launch Gradio app
+CMD ["python", "huggingface/app.py"]
huggingface/README.md ADDED
@@ -0,0 +1,111 @@
+---
+title: MediGuard AI
+emoji: 🏥
+colorFrom: blue
+colorTo: cyan
+sdk: docker
+pinned: true
+license: mit
+app_port: 7860
+models:
+  - meta-llama/Llama-3.3-70B-Versatile
+tags:
+  - medical
+  - biomarker
+  - rag
+  - healthcare
+  - langgraph
+  - agents
+short_description: Multi-Agent RAG System for Medical Biomarker Analysis
+---
+
+# 🏥 MediGuard AI — Medical Biomarker Analysis
+
+A production-ready **Multi-Agent RAG System** that analyzes blood test biomarkers using 6 specialized AI agents with medical knowledge retrieval.
+
+## ✨ Features
+
+- **6 Specialist AI Agents** — Biomarker validation, disease prediction, RAG-powered analysis, confidence assessment
+- **Medical Knowledge Base** — 750+ pages of clinical guidelines (FAISS vector store)
+- **Evidence-Based** — All recommendations backed by retrieved medical literature
+- **Free Cloud LLMs** — Uses Groq (LLaMA 3.3-70B) or Google Gemini
+
+## 🚀 Quick Start
+
+1. **Enter your biomarkers** in any format:
+   - `Glucose: 140, HbA1c: 7.5`
+   - `My glucose is 140 and HbA1c is 7.5`
+   - `{"Glucose": 140, "HbA1c": 7.5}`
+
+2. **Click Analyze** and get:
+   - Primary diagnosis with confidence score
+   - Critical alerts and safety flags
+   - Biomarker analysis with normal ranges
+   - Evidence-based recommendations
+   - Disease pathophysiology explanation
+
+## 🔧 Configuration
+
+This Space requires an LLM API key. Add one of these secrets in Space Settings:
+
+| Secret | Provider | Get Free Key |
+|--------|----------|--------------|
+| `GROQ_API_KEY` | Groq (recommended) | [console.groq.com/keys](https://console.groq.com/keys) |
+| `GOOGLE_API_KEY` | Google Gemini | [aistudio.google.com](https://aistudio.google.com/app/apikey) |
+
+## 🏗️ Architecture
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                 Clinical Insight Guild                  │
+├─────────────────────────────────────────────────────────┤
+│  ┌───────────────────────────────────────────────────┐  │
+│  │            1. Biomarker Analyzer                  │  │
+│  │      Validates values, flags abnormalities        │  │
+│  └───────────────────┬───────────────────────────────┘  │
+│                      │                                  │
+│         ┌────────────┼────────────┐                     │
+│         ▼            ▼            ▼                     │
+│   ┌──────────┐ ┌──────────┐ ┌──────────┐                │
+│   │ Disease  │ │Biomarker │ │ Clinical │                │
+│   │Explainer │ │  Linker  │ │Guidelines│                │
+│   │  (RAG)   │ │          │ │  (RAG)   │                │
+│   └────┬─────┘ └────┬─────┘ └────┬─────┘                │
+│        │            │            │                      │
+│        └────────────┼────────────┘                      │
+│                     ▼                                   │
+│  ┌───────────────────────────────────────────────────┐  │
+│  │            4. Confidence Assessor                 │  │
+│  │      Evaluates reliability, assigns scores        │  │
+│  └───────────────────┬───────────────────────────────┘  │
+│                      ▼                                  │
+│  ┌───────────────────────────────────────────────────┐  │
+│  │            5. Response Synthesizer                │  │
+│  │      Compiles patient-friendly summary            │  │
+│  └───────────────────────────────────────────────────┘  │
+└─────────────────────────────────────────────────────────┘
+```
+
+## 📊 Supported Biomarkers
+
+| Category | Biomarkers |
+|----------|------------|
+| **Diabetes** | Glucose, HbA1c, Fasting Glucose, Insulin |
+| **Lipids** | Cholesterol, LDL, HDL, Triglycerides |
+| **Kidney** | Creatinine, BUN, eGFR |
+| **Liver** | ALT, AST, Bilirubin, Albumin |
+| **Thyroid** | TSH, T3, T4, Free T4 |
+| **Blood** | Hemoglobin, WBC, RBC, Platelets |
+| **Cardiac** | Troponin, BNP, CRP |
+
+## ⚠️ Medical Disclaimer
+
+This tool is for **informational purposes only** and does not replace professional medical advice, diagnosis, or treatment. Always consult a qualified healthcare provider with questions regarding a medical condition.
+
+## 📄 License
+
+MIT License — See [GitHub Repository](https://github.com/yourusername/ragbot) for details.
+
+## 🙏 Acknowledgments
+
+Built with [LangGraph](https://langchain-ai.github.io/langgraph/), [FAISS](https://faiss.ai/), [Gradio](https://gradio.app/), and [Groq](https://groq.com/).
huggingface/app.py ADDED
@@ -0,0 +1,532 @@
+ """
+ MediGuard AI — Hugging Face Spaces Gradio App
+ 
+ Standalone deployment that uses:
+ - FAISS vector store (local)
+ - Cloud LLMs (Groq or Gemini, free tiers)
+ - No external services required
+ """
+ 
+ from __future__ import annotations
+ 
+ import json
+ import logging
+ import os
+ import sys
+ import time
+ import traceback
+ from pathlib import Path
+ from typing import Any
+ 
+ # Ensure the project root is on sys.path
+ _project_root = str(Path(__file__).parent.parent)
+ if _project_root not in sys.path:
+     sys.path.insert(0, _project_root)
+ os.chdir(_project_root)
+ 
+ import gradio as gr
+ 
+ logging.basicConfig(
+     level=logging.INFO,
+     format="%(asctime)s | %(name)-20s | %(levelname)-7s | %(message)s",
+ )
+ logger = logging.getLogger("mediguard.huggingface")
+ 
+ # ---------------------------------------------------------------------------
+ # Configuration
+ # ---------------------------------------------------------------------------
+ 
+ # Check for required API keys
+ GROQ_API_KEY = os.getenv("GROQ_API_KEY", "")
+ GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY", "")
+ 
+ if not GROQ_API_KEY and not GOOGLE_API_KEY:
+     logger.warning(
+         "No LLM API key found. Set the GROQ_API_KEY or GOOGLE_API_KEY environment variable."
+     )
+ 
+ # Set the default provider based on the available keys
+ if GROQ_API_KEY:
+     os.environ.setdefault("LLM_PROVIDER", "groq")
+ elif GOOGLE_API_KEY:
+     os.environ.setdefault("LLM_PROVIDER", "gemini")
+ 
+ 
+ # ---------------------------------------------------------------------------
+ # Guild Initialization (lazy)
+ # ---------------------------------------------------------------------------
+ 
+ _guild = None
+ _guild_error = None
+ 
+ 
+ def get_guild():
+     """Lazily initialize the Clinical Insight Guild."""
+     global _guild, _guild_error
+ 
+     if _guild is not None:
+         return _guild
+ 
+     if _guild_error is not None:
+         raise _guild_error
+ 
+     try:
+         logger.info("Initializing Clinical Insight Guild...")
+         start = time.time()
+ 
+         from src.workflow import create_guild
+         _guild = create_guild()
+ 
+         elapsed = time.time() - start
+         logger.info(f"Guild initialized in {elapsed:.1f}s")
+         return _guild
+ 
+     except Exception as exc:
+         logger.error(f"Failed to initialize guild: {exc}")
+         _guild_error = exc
+         raise
+ 
+ 
+ # ---------------------------------------------------------------------------
+ # Analysis Functions
+ # ---------------------------------------------------------------------------
+ 
+ def parse_biomarkers(text: str) -> dict[str, float]:
+     """
+     Parse biomarkers from natural-language text.
+ 
+     Supports formats like:
+     - "Glucose: 140, HbA1c: 7.5"
+     - "glucose 140 hba1c 7.5"
+     - {"Glucose": 140, "HbA1c": 7.5}
+     """
+     text = text.strip()
+ 
+     # Try JSON first
+     if text.startswith("{"):
+         try:
+             return json.loads(text)
+         except json.JSONDecodeError:
+             pass
+ 
+     # Fall back to natural-language parsing
+     import re
+ 
+     # Common biomarker patterns
+     patterns = [
+         # "Glucose: 140" or "Glucose = 140"
+         r"([A-Za-z0-9_]+)\s*[:=]\s*([\d.]+)",
+         # "Glucose 140 mg/dL"
+         r"([A-Za-z0-9_]+)\s+([\d.]+)\s*(?:mg/dL|mmol/L|%|g/dL|U/L|mIU/L)?",
+     ]
+ 
+     biomarkers = {}
+ 
+     for pattern in patterns:
+         matches = re.findall(pattern, text, re.IGNORECASE)
+         for name, value in matches:
+             try:
+                 biomarkers[name.strip()] = float(value)
+             except ValueError:
+                 continue
+ 
+     return biomarkers
+ 
+ 
+ def analyze_biomarkers(input_text: str, progress=gr.Progress()) -> tuple[str, str, str]:
+     """
+     Analyze biomarkers using the Clinical Insight Guild.
+ 
+     Returns: (summary, details_json, status)
+     """
+     if not input_text.strip():
+         return "", "", "⚠️ Please enter biomarkers to analyze."
+ 
+     # Check for an API key
+     if not GROQ_API_KEY and not GOOGLE_API_KEY:
+         return "", "", (
+             "❌ **Error**: No LLM API key configured.\n\n"
+             "Please add your API key in Hugging Face Space Settings → Secrets:\n"
+             "- `GROQ_API_KEY` (get a free key at https://console.groq.com/keys)\n"
+             "- or `GOOGLE_API_KEY` (get a free key at https://aistudio.google.com/app/apikey)"
+         )
+ 
+     try:
+         progress(0.1, desc="Parsing biomarkers...")
+         biomarkers = parse_biomarkers(input_text)
+ 
+         if not biomarkers:
+             return "", "", (
+                 "⚠️ Could not parse biomarkers. Try formats like:\n"
+                 "• `Glucose: 140, HbA1c: 7.5`\n"
+                 "• `{\"Glucose\": 140, \"HbA1c\": 7.5}`"
+             )
+ 
+         progress(0.2, desc="Initializing analysis...")
+ 
+         # Initialize the guild
+         guild = get_guild()
+ 
+         # Prepare the input
+         from src.state import PatientInput
+ 
+         # Auto-generate a prediction based on common patterns
+         prediction = auto_predict(biomarkers)
+ 
+         patient_input = PatientInput(
+             biomarkers=biomarkers,
+             model_prediction=prediction,
+             patient_context={"patient_id": "HF_User", "source": "huggingface_spaces"},
+         )
+ 
+         progress(0.4, desc="Running Clinical Insight Guild...")
+ 
+         # Run the analysis
+         start = time.time()
+         result = guild.run(patient_input)
+         elapsed = time.time() - start
+ 
+         progress(0.9, desc="Formatting results...")
+ 
+         # Extract the response
+         final_response = result.get("final_response", {})
+ 
+         # Format the summary and the raw details
+         summary = format_summary(final_response, elapsed)
+         details = json.dumps(final_response, indent=2, default=str)
+ 
+         status = f"✅ Analysis completed in {elapsed:.1f}s"
+ 
+         return summary, details, status
+ 
+     except Exception as exc:
+         logger.error(f"Analysis error: {exc}", exc_info=True)
+         return "", "", f"❌ **Error**: {exc}\n\n```\n{traceback.format_exc()}\n```"
+ 
+ 
+ def auto_predict(biomarkers: dict[str, float]) -> dict[str, Any]:
+     """
+     Auto-generate a disease prediction from the biomarkers.
+     This simulates what an upstream ML model would provide.
+     """
+     # Normalize biomarker names for matching
+     normalized = {k.lower().replace(" ", ""): v for k, v in biomarkers.items()}
+ 
+     # Diabetes indicators
+     glucose = normalized.get("glucose", normalized.get("fastingglucose", 0))
+     hba1c = normalized.get("hba1c", normalized.get("hemoglobina1c", 0))
+ 
+     if hba1c >= 6.5 or glucose >= 126:
+         return {
+             "disease": "Diabetes",
+             "confidence": min(0.95, 0.7 + (hba1c - 6.5) * 0.1) if hba1c else 0.85,
+             "severity": "high" if hba1c >= 8 or glucose >= 200 else "moderate",
+         }
+ 
+     # Lipid disorders
+     cholesterol = normalized.get("cholesterol", normalized.get("totalcholesterol", 0))
+     ldl = normalized.get("ldl", normalized.get("ldlcholesterol", 0))
+     triglycerides = normalized.get("triglycerides", 0)
+ 
+     if cholesterol >= 240 or ldl >= 160 or triglycerides >= 200:
+         return {
+             "disease": "Dyslipidemia",
+             "confidence": 0.85,
+             "severity": "moderate",
+         }
+ 
+     # Anemia
+     hemoglobin = normalized.get("hemoglobin", normalized.get("hgb", normalized.get("hb", 0)))
+ 
+     if hemoglobin and hemoglobin < 12:
+         return {
+             "disease": "Anemia",
+             "confidence": 0.80,
+             "severity": "moderate",
+         }
+ 
+     # Thyroid issues
+     tsh = normalized.get("tsh", 0)
+ 
+     if tsh > 4.5:
+         return {
+             "disease": "Hypothyroidism",
+             "confidence": 0.75,
+             "severity": "moderate",
+         }
+     elif tsh and tsh < 0.4:
+         return {
+             "disease": "Hyperthyroidism",
+             "confidence": 0.75,
+             "severity": "moderate",
+         }
+ 
+     # Default: general health screening
+     return {
+         "disease": "General Health Screening",
+         "confidence": 0.70,
+         "severity": "low",
+     }
+ 
+ 
+ def format_summary(response: dict, elapsed: float) -> str:
+     """Format the analysis response as readable Markdown."""
+     if not response:
+         return "No analysis results available."
+ 
+     parts = []
+ 
+     # Header
+     primary = response.get("primary_finding", "Analysis")
+     confidence = response.get("confidence", {})
+     conf_score = confidence.get("overall_score", 0) if isinstance(confidence, dict) else 0
+ 
+     parts.append(f"## 🏥 {primary}")
+     if conf_score:
+         parts.append(f"**Confidence**: {conf_score:.0%}")
+     parts.append("")
+ 
+     # Critical alerts
+     alerts = response.get("safety_alerts", [])
+     if alerts:
+         parts.append("### ⚠️ Critical Alerts")
+         for alert in alerts[:5]:
+             if isinstance(alert, dict):
+                 parts.append(f"- **{alert.get('alert_type', 'Alert')}**: {alert.get('message', '')}")
+             else:
+                 parts.append(f"- {alert}")
+         parts.append("")
+ 
+     # Key findings
+     findings = response.get("key_findings", [])
+     if findings:
+         parts.append("### 🔍 Key Findings")
+         for finding in findings[:5]:
+             parts.append(f"- {finding}")
+         parts.append("")
+ 
+     # Biomarker flags
+     flags = response.get("biomarker_flags", [])
+     if flags:
+         parts.append("### 📊 Biomarker Analysis")
+         for flag in flags[:8]:
+             if isinstance(flag, dict):
+                 name = flag.get("biomarker", "Unknown")
+                 status = flag.get("status", "normal")
+                 value = flag.get("value", "N/A")
+                 emoji = "🔴" if status == "critical" else "🟡" if status == "abnormal" else "🟢"
+                 parts.append(f"- {emoji} **{name}**: {value} ({status})")
+             else:
+                 parts.append(f"- {flag}")
+         parts.append("")
+ 
+     # Recommendations
+     recs = response.get("recommendations", {})
+     if recs:
+         parts.append("### 💡 Recommendations")
+ 
+         immediate = recs.get("immediate_actions", [])
+         if immediate:
+             parts.append("**Immediate Actions:**")
+             for action in immediate[:3]:
+                 parts.append(f"- {action}")
+ 
+         lifestyle = recs.get("lifestyle_modifications", [])
+         if lifestyle:
+             parts.append("\n**Lifestyle Modifications:**")
+             for mod in lifestyle[:3]:
+                 parts.append(f"- {mod}")
+ 
+         followup = recs.get("follow_up", [])
+         if followup:
+             parts.append("\n**Follow-up:**")
+             for item in followup[:3]:
+                 parts.append(f"- {item}")
+         parts.append("")
+ 
+     # Disease explanation
+     explanation = response.get("disease_explanation", {})
+     if explanation and isinstance(explanation, dict):
+         parts.append("### 📖 Understanding Your Results")
+ 
+         pathophys = explanation.get("pathophysiology", "")
+         if pathophys:
+             parts.append(f"{pathophys[:500]}...")
+         parts.append("")
+ 
+     # Conversational summary
+     conv_summary = response.get("conversational_summary", "")
+     if conv_summary:
+         parts.append("### 📝 Summary")
+         parts.append(conv_summary[:1000])
+         parts.append("")
+ 
+     # Footer
+     parts.append("---")
+     parts.append(f"*Analysis completed in {elapsed:.1f}s using MediGuard AI*")
+     parts.append("")
+     parts.append("**⚠️ Disclaimer**: This is for informational purposes only. "
+                  "Consult a healthcare professional for medical advice.")
+ 
+     return "\n".join(parts)
+ 
+ 
+ # ---------------------------------------------------------------------------
+ # Gradio Interface
+ # ---------------------------------------------------------------------------
+ 
+ def create_demo() -> gr.Blocks:
+     """Create the Gradio Blocks interface."""
+ 
+     with gr.Blocks(
+         title="MediGuard AI - Medical Biomarker Analysis",
+         theme=gr.themes.Soft(primary_hue="blue", secondary_hue="cyan"),
+         css="""
+         .gradio-container { max-width: 1200px !important; }
+         .status-box { font-size: 14px; }
+         footer { display: none !important; }
+         """,
+     ) as demo:
+ 
+         # Header
+         gr.Markdown("""
+         # 🏥 MediGuard AI — Medical Biomarker Analysis
+ 
+         **Multi-Agent RAG System** powered by 6 specialized AI agents with medical knowledge retrieval.
+ 
+         Enter your biomarkers below and get evidence-based insights in seconds.
+         """)
+ 
+         # API key warning (if needed)
+         if not GROQ_API_KEY and not GOOGLE_API_KEY:
+             gr.Markdown("""
+             <div style="background: #ffeeba; padding: 10px; border-radius: 5px; margin: 10px 0;">
+             ⚠️ <b>API Key Required</b>: Add <code>GROQ_API_KEY</code> or <code>GOOGLE_API_KEY</code>
+             in Space Settings → Secrets to enable analysis.
+             </div>
+             """)
+ 
+         with gr.Row():
+             # Input column
+             with gr.Column(scale=1):
+                 gr.Markdown("### 📝 Enter Biomarkers")
+ 
+                 input_text = gr.Textbox(
+                     label="Biomarkers",
+                     placeholder=(
+                         "Enter biomarkers in any format:\n"
+                         "• Glucose: 140, HbA1c: 7.5, Cholesterol: 210\n"
+                         "• My glucose is 140 and HbA1c is 7.5\n"
+                         '• {"Glucose": 140, "HbA1c": 7.5}'
+                     ),
+                     lines=5,
+                     max_lines=10,
+                 )
+ 
+                 with gr.Row():
+                     analyze_btn = gr.Button("🔬 Analyze", variant="primary", size="lg")
+                     clear_btn = gr.Button("🗑️ Clear", size="lg")
+ 
+                 status_output = gr.Markdown(
+                     label="Status",
+                     elem_classes="status-box",
+                 )
+ 
+                 # Example inputs
+                 gr.Markdown("### 📋 Example Inputs")
+ 
+                 gr.Examples(
+                     examples=[
+                         ["Glucose: 185, HbA1c: 8.2, Cholesterol: 245, LDL: 165"],
+                         ["Glucose: 95, HbA1c: 5.4, Cholesterol: 180, HDL: 55, LDL: 100"],
+                         ["Hemoglobin: 9.5, Iron: 40, Ferritin: 15"],
+                         ["TSH: 8.5, T4: 4.0, T3: 80"],
+                         ['{"Glucose": 140, "HbA1c": 7.0, "Triglycerides": 250}'],
+                     ],
+                     inputs=input_text,
+                     label="Click an example to load it",
+                 )
+ 
+             # Output column
+             with gr.Column(scale=2):
+                 gr.Markdown("### 📊 Analysis Results")
+ 
+                 with gr.Tabs():
+                     with gr.Tab("Summary"):
+                         summary_output = gr.Markdown(
+                             label="Analysis Summary",
+                             value="*Enter biomarkers and click Analyze to see results*",
+                         )
+ 
+                     with gr.Tab("Detailed JSON"):
+                         details_output = gr.Code(
+                             label="Full Response",
+                             language="json",
+                             lines=25,
+                         )
+ 
+         # Event handlers
+         analyze_btn.click(
+             fn=analyze_biomarkers,
+             inputs=[input_text],
+             outputs=[summary_output, details_output, status_output],
+             show_progress="full",
+         )
+ 
+         clear_btn.click(
+             fn=lambda: ("", "", "", ""),
+             outputs=[input_text, summary_output, details_output, status_output],
+         )
+ 
+         # Footer
+         gr.Markdown("""
+         ---
+ 
+         ### ℹ️ About MediGuard AI
+ 
+         MediGuard AI uses a **Clinical Insight Guild** of 6 specialized AI agents:
+ 
+         | Agent | Role |
+         |-------|------|
+         | 🔬 Biomarker Analyzer | Validates and flags abnormal values |
+         | 📚 Disease Explainer | RAG-powered pathophysiology explanations |
+         | 🔗 Biomarker Linker | Connects biomarkers to disease predictions |
+         | 📋 Clinical Guidelines | Evidence-based recommendations from medical literature |
+         | ✅ Confidence Assessor | Evaluates reliability of findings |
+         | 📝 Response Synthesizer | Compiles comprehensive patient-friendly output |
+ 
+         **Data Sources**: 750+ pages of clinical guidelines (FAISS vector store)
+ 
+         ---
+ 
+         ⚠️ **Medical Disclaimer**: This tool is for **informational purposes only** and does not
+         replace professional medical advice, diagnosis, or treatment. Always consult a qualified
+         healthcare provider with questions regarding a medical condition.
+ 
+         ---
+ 
+         Built with ❤️ using [LangGraph](https://langchain-ai.github.io/langgraph/),
+         [FAISS](https://faiss.ai/), and [Gradio](https://gradio.app/)
+         """)
+ 
+     return demo
+ 
+ 
+ # ---------------------------------------------------------------------------
+ # Main Entry Point
+ # ---------------------------------------------------------------------------
+ 
+ if __name__ == "__main__":
+     logger.info("Starting MediGuard AI Gradio App...")
+ 
+     demo = create_demo()
+ 
+     # Launch with HF Spaces-compatible settings
+     demo.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         show_error=True,
+         # share=False on HF Spaces
+     )
huggingface/requirements.txt ADDED
@@ -0,0 +1,38 @@
+ # ===========================================================================
+ # MediGuard AI — Hugging Face Spaces Dependencies
+ # ===========================================================================
+ # Minimal dependencies for a standalone Gradio deployment.
+ # No Postgres, Redis, OpenSearch, or Ollama required.
+ # ===========================================================================
+ 
+ # --- Gradio UI ---
+ gradio>=5.0.0
+ 
+ # --- LangChain Core ---
+ langchain>=0.3.0
+ langchain-community>=0.3.0
+ langgraph>=0.2.0
+ 
+ # --- Cloud LLM Providers (free tiers) ---
+ langchain-groq>=0.2.0
+ langchain-google-genai>=2.0.0
+ 
+ # --- Vector Store ---
+ faiss-cpu>=1.8.0
+ 
+ # --- Embeddings ---
+ sentence-transformers>=3.0.0
+ 
+ # --- Document Processing ---
+ pypdf>=4.0.0
+ 
+ # --- Pydantic ---
+ pydantic>=2.9.0
+ pydantic-settings>=2.5.0
+ 
+ # --- HTTP Client ---
+ httpx>=0.27.0
+ 
+ # --- Utilities ---
+ python-dotenv>=1.0.0
+ numpy<2.0.0
scripts/deploy_huggingface.ps1 ADDED
@@ -0,0 +1,139 @@
+ <#
+ .SYNOPSIS
+     Deploy MediGuard AI to Hugging Face Spaces
+ .DESCRIPTION
+     This script automates the deployment of MediGuard AI to Hugging Face Spaces.
+     It handles copying files, setting up the Dockerfile, and pushing to the Space.
+ .PARAMETER SpaceName
+     Name of your Hugging Face Space (e.g., "mediguard-ai")
+ .PARAMETER Username
+     Your Hugging Face username
+ .PARAMETER SkipClone
+     Skip cloning if you've already cloned the Space
+ .EXAMPLE
+     .\deploy_huggingface.ps1 -Username "your-username" -SpaceName "mediguard-ai"
+ #>
+ 
+ param(
+     [Parameter(Mandatory=$true)]
+     [string]$Username,
+ 
+     [Parameter(Mandatory=$false)]
+     [string]$SpaceName = "mediguard-ai",
+ 
+     [switch]$SkipClone
+ )
+ 
+ $ErrorActionPreference = "Stop"
+ 
+ Write-Host "========================================" -ForegroundColor Cyan
+ Write-Host " MediGuard AI - Hugging Face Deployment" -ForegroundColor Cyan
+ Write-Host "========================================" -ForegroundColor Cyan
+ Write-Host ""
+ 
+ # Configuration
+ $ProjectRoot = Split-Path -Parent $PSScriptRoot
+ $DeployDir = Join-Path $ProjectRoot "hf-deploy"
+ $SpaceUrl = "https://huggingface.co/spaces/$Username/$SpaceName"
+ 
+ Write-Host "Project Root: $ProjectRoot" -ForegroundColor Gray
+ Write-Host "Deploy Dir:   $DeployDir" -ForegroundColor Gray
+ Write-Host "Space URL:    $SpaceUrl" -ForegroundColor Gray
+ Write-Host ""
+ 
+ # Step 1: Clone or use the existing Space
+ if (-not $SkipClone) {
+     Write-Host "[1/6] Cloning Hugging Face Space..." -ForegroundColor Yellow
+ 
+     if (Test-Path $DeployDir) {
+         Write-Host "  Removing existing deploy directory..." -ForegroundColor Gray
+         Remove-Item -Recurse -Force $DeployDir
+     }
+ 
+     git clone "https://huggingface.co/spaces/$Username/$SpaceName" $DeployDir
+ 
+     if ($LASTEXITCODE -ne 0) {
+         Write-Host "ERROR: Failed to clone Space. Make sure it exists!" -ForegroundColor Red
+         Write-Host "Create it at: https://huggingface.co/new-space" -ForegroundColor Yellow
+         exit 1
+     }
+ } else {
+     Write-Host "[1/6] Using existing deploy directory..." -ForegroundColor Yellow
+ }
+ 
+ # Step 2: Copy project files
+ Write-Host "[2/6] Copying project files..." -ForegroundColor Yellow
+ 
+ # Core directories
+ $CoreDirs = @("src", "config", "data", "huggingface")
+ foreach ($dir in $CoreDirs) {
+     $source = Join-Path $ProjectRoot $dir
+     $dest = Join-Path $DeployDir $dir
+     if (Test-Path $source) {
+         Write-Host "  Copying $dir..." -ForegroundColor Gray
+         Copy-Item -Path $source -Destination $dest -Recurse -Force
+     }
+ }
+ 
+ # Copy specific files
+ $CoreFiles = @("pyproject.toml", ".dockerignore")
+ foreach ($file in $CoreFiles) {
+     $source = Join-Path $ProjectRoot $file
+     if (Test-Path $source) {
+         Write-Host "  Copying $file..." -ForegroundColor Gray
+         Copy-Item -Path $source -Destination (Join-Path $DeployDir $file) -Force
+     }
+ }
+ 
+ # Step 3: Set up the Dockerfile (HF Spaces expects it in the repo root)
+ Write-Host "[3/6] Setting up Dockerfile..." -ForegroundColor Yellow
+ $HfDockerfile = Join-Path $DeployDir "huggingface/Dockerfile"
+ $RootDockerfile = Join-Path $DeployDir "Dockerfile"
+ Copy-Item -Path $HfDockerfile -Destination $RootDockerfile -Force
+ Write-Host "  Copied huggingface/Dockerfile to Dockerfile" -ForegroundColor Gray
+ 
+ # Step 4: Set up the README with HF metadata
+ Write-Host "[4/6] Setting up README.md..." -ForegroundColor Yellow
+ $HfReadme = Join-Path $DeployDir "huggingface/README.md"
+ $RootReadme = Join-Path $DeployDir "README.md"
+ Copy-Item -Path $HfReadme -Destination $RootReadme -Force
+ Write-Host "  Copied huggingface/README.md to README.md" -ForegroundColor Gray
+ 
+ # Step 5: Verify that the vector store exists
+ Write-Host "[5/6] Verifying vector store..." -ForegroundColor Yellow
+ $VectorStore = Join-Path $DeployDir "data/vector_stores/medical_knowledge.faiss"
+ if (Test-Path $VectorStore) {
+     $size = (Get-Item $VectorStore).Length / 1MB
+     Write-Host "  Vector store found: $([math]::Round($size, 2)) MB" -ForegroundColor Green
+ } else {
+     Write-Host "  WARNING: Vector store not found!" -ForegroundColor Red
+     Write-Host "  Run 'python scripts/setup_embeddings.py' first to create it." -ForegroundColor Yellow
+ }
+ 
+ # Step 6: Commit and push
+ Write-Host "[6/6] Committing and pushing to Hugging Face..." -ForegroundColor Yellow
+ 
+ Push-Location $DeployDir
+ 
+ git add .
+ git commit -m "Deploy MediGuard AI - $(Get-Date -Format 'yyyy-MM-dd HH:mm')"
+ 
+ Write-Host ""
+ Write-Host "Ready to push! Run the following commands:" -ForegroundColor Green
+ Write-Host ""
+ Write-Host "  cd $DeployDir" -ForegroundColor Cyan
+ Write-Host "  git push" -ForegroundColor Cyan
+ Write-Host ""
+ Write-Host "After pushing, add your API key as a Secret in Space Settings:" -ForegroundColor Yellow
+ Write-Host "  Name:  GROQ_API_KEY (or GOOGLE_API_KEY)" -ForegroundColor Gray
+ Write-Host "  Value: your-api-key" -ForegroundColor Gray
+ Write-Host ""
+ Write-Host "Your Space will be live at:" -ForegroundColor Green
+ Write-Host "  $SpaceUrl" -ForegroundColor Cyan
+ 
+ Pop-Location
+ 
+ Write-Host ""
+ Write-Host "========================================" -ForegroundColor Cyan
+ Write-Host " Deployment prepared successfully!" -ForegroundColor Green
+ Write-Host "========================================" -ForegroundColor Cyan