Nikhil Pravin Pise committed
Commit d495234
Parent(s): f55411e
feat: Enable 100% HF Space capability with dynamic environment variables
- Add env helper functions for both naming conventions (simple and nested)
- Support all embeddings providers: Jina, Google, HuggingFace
- Enable Langfuse observability configuration
- Make LLM models configurable via environment
- Remove hardcoded values - everything is now dynamic
- Update HF README with complete secrets reference
- Add .env.huggingface template with all options
- Update deployment guide with secrets configuration
- Enhance startup logging to show all enabled features
- Tests passing: 4/4 llm_config, 3/3 settings
- DEPLOY_HUGGINGFACE.md +23 -5
- huggingface/.env.huggingface +69 -0
- huggingface/Dockerfile +5 -1
- huggingface/README.md +31 -5
- huggingface/app.py +149 -18
- huggingface/requirements.txt +4 -1
- src/llm_config.py +72 -13
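The "both naming conventions" bullet above means each setting is looked up under a simple name (e.g. `GROQ_API_KEY`) with a fallback to the pydantic-settings nested form (`LLM__GROQ_API_KEY`). A minimal sketch of that lookup (the helper name here is illustrative):

```python
import os

def get_env_with_fallback(primary: str, fallback: str, default: str = "") -> str:
    """Return the first non-empty value among primary, fallback, default."""
    return os.getenv(primary) or os.getenv(fallback) or default

# Only the nested name is set: the fallback is used.
os.environ["LLM__GROQ_API_KEY"] = "gsk-nested"
print(get_env_with_fallback("GROQ_API_KEY", "LLM__GROQ_API_KEY"))  # → gsk-nested

# Once the simple name is also set, it takes precedence.
os.environ["GROQ_API_KEY"] = "gsk-simple"
print(get_env_with_fallback("GROQ_API_KEY", "LLM__GROQ_API_KEY"))  # → gsk-simple
```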
DEPLOY_HUGGINGFACE.md
CHANGED
````diff
@@ -65,15 +65,33 @@ mv README.md README_original.md
 cp huggingface/README.md ./README.md
 ```
 
-## Step 6: Add Your API
+## Step 6: Add Your API Keys (Secrets)
 
 1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai`
 2. Click **Settings** tab
 3. Scroll to **Repository Secrets**
-
-
-
-
+
+### Required Secrets (pick one)
+
+| Secret | Description | Get Free Key |
+|--------|-------------|--------------|
+| `GROQ_API_KEY` | Groq API key (recommended) | [console.groq.com/keys](https://console.groq.com/keys) |
+| `GOOGLE_API_KEY` | Google Gemini API key | [aistudio.google.com](https://aistudio.google.com/app/apikey) |
+
+### Optional Secrets
+
+| Secret | Description | Default |
+|--------|-------------|---------|
+| `GROQ_MODEL` | Groq model to use | `llama-3.3-70b-versatile` |
+| `GEMINI_MODEL` | Gemini model to use | `gemini-2.0-flash` |
+| `EMBEDDING_PROVIDER` | Embedding provider: `jina`, `google`, `huggingface` | `huggingface` |
+| `JINA_API_KEY` | Jina AI API key for high-quality embeddings | - |
+| `LANGFUSE_ENABLED` | Enable Langfuse tracing (`true`/`false`) | `false` |
+| `LANGFUSE_PUBLIC_KEY` | Langfuse public key | - |
+| `LANGFUSE_SECRET_KEY` | Langfuse secret key | - |
+| `LANGFUSE_HOST` | Langfuse host URL | - |
+
+> **Tip**: See `huggingface/.env.huggingface` for a complete reference of all available secrets.
 
 ## Step 7: Push to Deploy
 
````
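The "pick one" requirement above is resolved by auto-detection at startup: when both keys are present, Groq wins (it is checked first in the app's `if`/`elif`). A sketch of that selection logic, mirroring the app (function name illustrative):

```python
def detect_provider(groq_key: str, google_key: str) -> "str | None":
    """Prefer Groq, then Gemini; None means no key configured."""
    if groq_key:
        return "groq"
    if google_key:
        return "gemini"
    return None

print(detect_provider("gsk-x", "AIza-y"))  # → groq (Groq wins when both are set)
print(detect_provider("", "AIza-y"))       # → gemini
print(detect_provider("", ""))             # → None
```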
huggingface/.env.huggingface
ADDED
````diff
@@ -0,0 +1,69 @@
+# ===========================================================================
+# MediGuard AI — HuggingFace Spaces Secrets Reference
+# ===========================================================================
+# Copy these to your HuggingFace Space Settings → Secrets
+# ===========================================================================
+
+# ===========================================================================
+# REQUIRED: LLM API Key (choose one)
+# ===========================================================================
+
+# Option 1: Groq (RECOMMENDED - fast, free)
+# Get key at: https://console.groq.com/keys
+GROQ_API_KEY=your_groq_api_key_here
+
+# Option 2: Google Gemini (free tier)
+# Get key at: https://aistudio.google.com/app/apikey
+# GOOGLE_API_KEY=your_google_api_key_here
+
+# ===========================================================================
+# OPTIONAL: LLM Model Configuration
+# ===========================================================================
+
+# Groq model (default: llama-3.3-70b-versatile)
+# Options: llama-3.3-70b-versatile, llama-3.1-8b-instant, mixtral-8x7b-32768
+# GROQ_MODEL=llama-3.3-70b-versatile
+
+# Gemini model (default: gemini-2.0-flash)
+# Options: gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash
+# GEMINI_MODEL=gemini-2.0-flash
+
+# Force specific provider (auto-detected from keys if not set)
+# LLM_PROVIDER=groq
+
+# ===========================================================================
+# OPTIONAL: Embeddings Configuration
+# ===========================================================================
+
+# Embedding provider (default: huggingface - local, no API needed)
+# Options: jina (high-quality 1024d), google, huggingface
+# EMBEDDING_PROVIDER=huggingface
+
+# Jina AI API key for high-quality embeddings
+# Get key at: https://jina.ai/ (free tier available)
+# JINA_API_KEY=your_jina_api_key_here
+
+# ===========================================================================
+# OPTIONAL: Observability (Langfuse)
+# ===========================================================================
+
+# Enable Langfuse tracing (default: false)
+# LANGFUSE_ENABLED=true
+
+# Langfuse credentials (required if LANGFUSE_ENABLED=true)
+# Get at: https://cloud.langfuse.com/
+# LANGFUSE_PUBLIC_KEY=pk-lf-xxx
+# LANGFUSE_SECRET_KEY=sk-lf-xxx
+# LANGFUSE_HOST=https://cloud.langfuse.com
+
+# ===========================================================================
+# Notes:
+# ===========================================================================
+#
+# 1. At minimum, you need either GROQ_API_KEY or GOOGLE_API_KEY
+# 2. Groq is recommended for best speed/quality balance (free tier)
+# 3. HuggingFace embeddings run locally - no API key needed (default)
+# 4. Jina embeddings are higher quality but require API key
+# 5. Langfuse provides observability for debugging and monitoring
+#
+# ===========================================================================
````
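For local testing, the template above can be loaded like any dotenv file (the repo already depends on `python-dotenv`, which handles this robustly). A minimal sketch of the `KEY=VALUE` parsing it performs, for illustration only:

```python
def load_env_text(text: str) -> "dict[str, str]":
    """Parse KEY=VALUE lines, skipping blanks and '#' comment lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # commented-out options like '# GOOGLE_API_KEY=...' are ignored
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values

sample = """
# Option 1: Groq (RECOMMENDED - fast, free)
GROQ_API_KEY=your_groq_api_key_here
# GOOGLE_API_KEY=your_google_api_key_here
"""
print(load_env_text(sample))  # → {'GROQ_API_KEY': 'your_groq_api_key_here'}
```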
huggingface/Dockerfile
CHANGED
````diff
@@ -20,9 +20,13 @@ ENV PYTHONDONTWRITEBYTECODE=1 \
 ENV GRADIO_SERVER_NAME="0.0.0.0" \
     GRADIO_SERVER_PORT=7860
 
-# Default
+# Default embedding provider (can be overridden by HF Secrets)
+# Options: huggingface (local, no key needed), google, jina
 ENV EMBEDDING_PROVIDER=huggingface
 
+# Disable HF hub implicit token warning
+ENV HF_HUB_DISABLE_IMPLICIT_TOKEN=1
+
 WORKDIR /app
 
 # System dependencies
````
huggingface/README.md
CHANGED
````diff
@@ -44,12 +44,38 @@ A production-ready **Multi-Agent RAG System** that analyzes blood test biomarker
 
 ## 🔧 Configuration
 
-This Space requires
-
-
-
-
-
+This Space requires at least one LLM API key. Configure secrets in **Space Settings → Secrets**.
+
+### Required Secrets (pick one)
+
+| Secret | Provider | Description | Get Free Key |
+|--------|----------|-------------|--------------|
+| `GROQ_API_KEY` | Groq | **Recommended** - Fast, free LLaMA 3.3-70B | [console.groq.com/keys](https://console.groq.com/keys) |
+| `GOOGLE_API_KEY` | Google Gemini | Free Gemini 2.0 Flash | [aistudio.google.com](https://aistudio.google.com/app/apikey) |
+
+### Optional: LLM Configuration
+
+| Secret | Default | Description |
+|--------|---------|-------------|
+| `GROQ_MODEL` | `llama-3.3-70b-versatile` | Groq model to use |
+| `GEMINI_MODEL` | `gemini-2.0-flash` | Gemini model to use |
+| `LLM_PROVIDER` | auto-detected | Force provider: `groq` or `gemini` |
+
+### Optional: Embeddings
+
+| Secret | Default | Description |
+|--------|---------|-------------|
+| `EMBEDDING_PROVIDER` | `huggingface` | Provider: `jina`, `google`, or `huggingface` |
+| `JINA_API_KEY` | - | High-quality 1024d embeddings ([jina.ai](https://jina.ai/)) |
+
+### Optional: Observability (Langfuse)
+
+| Secret | Description |
+|--------|-------------|
+| `LANGFUSE_ENABLED` | Set to `true` to enable tracing |
+| `LANGFUSE_PUBLIC_KEY` | Langfuse public key |
+| `LANGFUSE_SECRET_KEY` | Langfuse secret key |
+| `LANGFUSE_HOST` | Langfuse host URL (e.g., `https://cloud.langfuse.com`) |
 
 ## 🏗️ Architecture
 
````
huggingface/app.py
CHANGED
````diff
@@ -4,7 +4,28 @@ MediGuard AI — Hugging Face Spaces Gradio App
 Standalone deployment that uses:
 - FAISS vector store (local)
 - Cloud LLMs (Groq or Gemini - FREE tiers)
-
+- Multiple embedding providers (Jina, Google, HuggingFace)
+- Optional Langfuse observability
+
+Environment Variables (HuggingFace Secrets):
+    Required (pick one):
+    - GROQ_API_KEY: Groq API key (recommended, free)
+    - GOOGLE_API_KEY: Google Gemini API key (free)
+
+    Optional - LLM Configuration:
+    - LLM_PROVIDER: "groq" or "gemini" (auto-detected from keys)
+    - GROQ_MODEL: Model name (default: llama-3.3-70b-versatile)
+    - GEMINI_MODEL: Model name (default: gemini-2.0-flash)
+
+    Optional - Embeddings:
+    - EMBEDDING_PROVIDER: "jina", "google", or "huggingface" (default: huggingface)
+    - JINA_API_KEY: Jina AI API key for high-quality embeddings
+
+    Optional - Observability:
+    - LANGFUSE_ENABLED: "true" to enable tracing
+    - LANGFUSE_PUBLIC_KEY: Langfuse public key
+    - LANGFUSE_SECRET_KEY: Langfuse secret key
+    - LANGFUSE_HOST: Langfuse host URL
 """
 
 from __future__ import annotations
@@ -33,37 +54,122 @@ logging.basicConfig(
 logger = logging.getLogger("mediguard.huggingface")
 
 # ---------------------------------------------------------------------------
-# Configuration
+# Configuration - Environment Variable Helpers
 # ---------------------------------------------------------------------------
 
+def _get_env(primary: str, *fallbacks, default: str = "") -> str:
+    """Get env var with multiple fallback names for compatibility."""
+    value = os.getenv(primary)
+    if value:
+        return value
+    for fb in fallbacks:
+        value = os.getenv(fb)
+        if value:
+            return value
+    return default
+
+
 def get_api_keys():
-    """Get API keys dynamically (HuggingFace injects secrets after module load).
-
-
+    """Get API keys dynamically (HuggingFace injects secrets after module load).
+
+    Supports both simple and nested naming conventions:
+    - GROQ_API_KEY / LLM__GROQ_API_KEY
+    - GOOGLE_API_KEY / LLM__GOOGLE_API_KEY
+    """
+    groq_key = _get_env("GROQ_API_KEY", "LLM__GROQ_API_KEY")
+    google_key = _get_env("GOOGLE_API_KEY", "LLM__GOOGLE_API_KEY")
     return groq_key, google_key
 
 
+def get_jina_api_key() -> str:
+    """Get Jina API key for embeddings."""
+    return _get_env("JINA_API_KEY", "EMBEDDING__JINA_API_KEY")
+
+
+def get_embedding_provider() -> str:
+    """Get configured embedding provider."""
+    return _get_env("EMBEDDING_PROVIDER", "EMBEDDING__PROVIDER", default="huggingface")
+
+
+def get_groq_model() -> str:
+    """Get configured Groq model name."""
+    return _get_env("GROQ_MODEL", "LLM__GROQ_MODEL", default="llama-3.3-70b-versatile")
+
+
+def get_gemini_model() -> str:
+    """Get configured Gemini model name."""
+    return _get_env("GEMINI_MODEL", "LLM__GEMINI_MODEL", default="gemini-2.0-flash")
+
+
+def is_langfuse_enabled() -> bool:
+    """Check if Langfuse observability is enabled."""
+    enabled = _get_env("LANGFUSE_ENABLED", "LANGFUSE__ENABLED", default="false")
+    return enabled.lower() in ("true", "1", "yes")
+
+
 def setup_llm_provider():
-    """Set LLM provider based on available keys.
+    """Set up LLM provider and related configuration based on available keys.
+
+    Sets environment variables for the entire application to use.
+    """
     groq_key, google_key = get_api_keys()
+    provider = None
 
     if groq_key:
         os.environ["LLM_PROVIDER"] = "groq"
-        os.environ["GROQ_API_KEY"] = groq_key
-
+        os.environ["GROQ_API_KEY"] = groq_key
+        os.environ["GROQ_MODEL"] = get_groq_model()
+        provider = "groq"
+        logger.info(f"Configured Groq provider with model: {get_groq_model()}")
     elif google_key:
         os.environ["LLM_PROVIDER"] = "gemini"
         os.environ["GOOGLE_API_KEY"] = google_key
-
-
+        os.environ["GEMINI_MODEL"] = get_gemini_model()
+        provider = "gemini"
+        logger.info(f"Configured Gemini provider with model: {get_gemini_model()}")
+
+    # Set up embedding provider
+    embedding_provider = get_embedding_provider()
+    os.environ["EMBEDDING_PROVIDER"] = embedding_provider
+
+    # If Jina is configured, set the API key
+    jina_key = get_jina_api_key()
+    if jina_key:
+        os.environ["JINA_API_KEY"] = jina_key
+        os.environ["EMBEDDING__JINA_API_KEY"] = jina_key
+        logger.info("Jina embeddings configured")
+
+    # Set up Langfuse if enabled
+    if is_langfuse_enabled():
+        os.environ["LANGFUSE__ENABLED"] = "true"
+        for var in ["LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_HOST"]:
+            val = _get_env(var, f"LANGFUSE__{var.split('_', 1)[1]}")
+            if val:
+                os.environ[var] = val
+        logger.info("Langfuse observability enabled")
+
+    return provider
 
 
 # Log status at startup (keys may not be available yet)
 _groq, _google = get_api_keys()
+_jina = get_jina_api_key()
+logger.info("=" * 60)
+logger.info("MediGuard AI — HuggingFace Space Starting")
+logger.info("=" * 60)
+logger.info(f"GROQ_API_KEY: {'✓ configured' if _groq else '✗ not set'}")
+logger.info(f"GOOGLE_API_KEY: {'✓ configured' if _google else '✗ not set'}")
+logger.info(f"JINA_API_KEY: {'✓ configured' if _jina else '✗ not set (using HuggingFace embeddings)'}")
+logger.info(f"EMBEDDING_PROVIDER: {get_embedding_provider()}")
+logger.info(f"LANGFUSE: {'✓ enabled' if is_langfuse_enabled() else '✗ disabled'}")
+
 if not _groq and not _google:
     logger.warning(
         "No LLM API key found at startup. Will check again when analyzing."
     )
+else:
+    logger.info("LLM API key available — ready for analysis")
+logger.info("=" * 60)
 
 
 # ---------------------------------------------------------------------------
@@ -103,9 +209,11 @@ def get_guild():
 
     try:
         logger.info("Initializing Clinical Insight Guild...")
-        logger.info(f"LLM_PROVIDER
-        logger.info(f"GROQ_API_KEY
-        logger.info(f"GOOGLE_API_KEY
+        logger.info(f"  LLM_PROVIDER: {os.getenv('LLM_PROVIDER', 'not set')}")
+        logger.info(f"  GROQ_API_KEY: {'✓ set' if os.getenv('GROQ_API_KEY') else '✗ not set'}")
+        logger.info(f"  GOOGLE_API_KEY: {'✓ set' if os.getenv('GOOGLE_API_KEY') else '✗ not set'}")
+        logger.info(f"  EMBEDDING_PROVIDER: {os.getenv('EMBEDDING_PROVIDER', 'huggingface')}")
+        logger.info(f"  JINA_API_KEY: {'✓ set' if os.getenv('JINA_API_KEY') else '✗ not set'}")
 
         start = time.time()
@@ -191,10 +299,25 @@ def analyze_biomarkers(input_text: str, progress=gr.Progress()) -> tuple[str, st
         <div style="background: linear-gradient(135deg, #fee2e2 0%, #fecaca 100%); border: 1px solid #ef4444; border-radius: 10px; padding: 16px;">
             <strong style="color: #dc2626;">❌ No API Key Configured</strong>
             <p style="margin: 12px 0 8px 0; color: #991b1b;">Please add your API key in Space Settings → Secrets:</p>
-
-
-            <
-
+
+            <div style="margin: 12px 0;">
+                <strong style="color: #374151;">Required (pick one):</strong>
+                <ul style="margin: 4px 0; color: #7f1d1d;">
+                    <li><code>GROQ_API_KEY</code> - <a href="https://console.groq.com/keys" target="_blank" style="color: #2563eb;">Get free key →</a> (Recommended)</li>
+                    <li><code>GOOGLE_API_KEY</code> - <a href="https://aistudio.google.com/app/apikey" target="_blank" style="color: #2563eb;">Get free key →</a></li>
+                </ul>
+            </div>
+
+            <details style="margin-top: 12px;">
+                <summary style="cursor: pointer; color: #374151; font-weight: 600;">Optional configuration secrets</summary>
+                <ul style="margin: 8px 0; color: #6b7280; font-size: 0.9em;">
+                    <li><code>GROQ_MODEL</code> - Model name (default: llama-3.3-70b-versatile)</li>
+                    <li><code>GEMINI_MODEL</code> - Model name (default: gemini-2.0-flash)</li>
+                    <li><code>JINA_API_KEY</code> - High-quality embeddings (optional)</li>
+                    <li><code>EMBEDDING_PROVIDER</code> - jina, google, or huggingface</li>
+                    <li><code>LANGFUSE_ENABLED</code> - Enable observability tracing</li>
+                </ul>
+            </details>
         </div>
         """
@@ -837,6 +960,11 @@ def create_demo() -> gr.Blocks:
             <strong>Setup Required:</strong> Add your <code>GROQ_API_KEY</code> or
             <code>GOOGLE_API_KEY</code> in Space Settings → Secrets to enable analysis.
             <a href="https://console.groq.com/keys" target="_blank" style="color: #2563eb;">Get free Groq key →</a>
+            <br>
+            <span style="font-size: 0.9em; color: #64748b;">
+                Optional: Configure <code>JINA_API_KEY</code> for high-quality embeddings,
+                <code>LANGFUSE_ENABLED=true</code> for observability.
+            </span>
         </div>
     </div>
     """)
@@ -999,7 +1127,10 @@ def create_demo() -> gr.Blocks:
             <a href="https://faiss.ai/" target="_blank" style="color: #3b82f6;">FAISS</a>, and
             <a href="https://gradio.app/" target="_blank" style="color: #3b82f6;">Gradio</a>
         </p>
-        <p style="margin-top: 8px;">
+        <p style="margin-top: 8px;">
+            Powered by <strong>Groq</strong> or <strong>Google Gemini</strong> •
+            <a href="https://github.com" target="_blank" style="color: #3b82f6;">Open Source on GitHub</a>
+        </p>
     </div>
     """)
````
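The Langfuse loop in `setup_llm_provider` derives each nested secret name by splitting the simple name on its first underscore, so `LANGFUSE_PUBLIC_KEY` maps to `LANGFUSE__PUBLIC_KEY`. A quick check of that mapping (helper name illustrative):

```python
def nested_name(var: str) -> str:
    """Map LANGFUSE_X to the pydantic-nested LANGFUSE__X form."""
    return f"LANGFUSE__{var.split('_', 1)[1]}"

for var in ["LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_HOST"]:
    print(var, "->", nested_name(var))
# LANGFUSE_PUBLIC_KEY -> LANGFUSE__PUBLIC_KEY
# LANGFUSE_SECRET_KEY -> LANGFUSE__SECRET_KEY
# LANGFUSE_HOST -> LANGFUSE__HOST
```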
huggingface/requirements.txt
CHANGED
````diff
@@ -33,10 +33,13 @@ pypdf>=4.0.0
 pydantic>=2.9.0
 pydantic-settings>=2.5.0
 
-# --- HTTP Client ---
+# --- HTTP Client (for Jina AI embeddings) ---
 httpx>=0.27.0
 
 # --- Utilities ---
 python-dotenv>=1.0.0
 tenacity>=8.0.0
 numpy<2.0.0
+
+# --- Observability (optional, for Langfuse support) ---
+langfuse>=2.0.0
````
src/llm_config.py
CHANGED
````diff
@@ -6,6 +6,10 @@ Supports multiple providers:
 - Groq (FREE, fast, llama-3.3-70b) - RECOMMENDED
 - Google Gemini (FREE tier)
 - Ollama (local, for offline use)
+
+Environment Variables (supports both naming conventions):
+- Simple: GROQ_API_KEY, GOOGLE_API_KEY, LLM_PROVIDER, GROQ_MODEL, etc.
+- Nested: LLM__GROQ_API_KEY, LLM__GOOGLE_API_KEY, LLM__PROVIDER, etc.
 """
 
 import os
@@ -20,9 +24,39 @@ load_dotenv()
 os.environ["LANGCHAIN_PROJECT"] = os.getenv("LANGCHAIN_PROJECT", "MediGuard_AI_RAG_Helper")
 
 
+def _get_env_with_fallback(primary: str, fallback: str, default: str = "") -> str:
+    """Get env var with fallback to alternate naming convention."""
+    return os.getenv(primary) or os.getenv(fallback) or default
+
+
 def get_default_llm_provider() -> str:
-    """Get default LLM provider dynamically from environment.
-
+    """Get default LLM provider dynamically from environment.
+
+    Supports both naming conventions:
+    - LLM_PROVIDER (simple)
+    - LLM__PROVIDER (pydantic nested)
+    """
+    return _get_env_with_fallback("LLM_PROVIDER", "LLM__PROVIDER", "groq")
+
+
+def get_groq_api_key() -> str:
+    """Get Groq API key from environment (supports both naming conventions)."""
+    return _get_env_with_fallback("GROQ_API_KEY", "LLM__GROQ_API_KEY", "")
+
+
+def get_google_api_key() -> str:
+    """Get Google API key from environment (supports both naming conventions)."""
+    return _get_env_with_fallback("GOOGLE_API_KEY", "LLM__GOOGLE_API_KEY", "")
+
+
+def get_groq_model() -> str:
+    """Get Groq model from environment (supports both naming conventions)."""
+    return _get_env_with_fallback("GROQ_MODEL", "LLM__GROQ_MODEL", "llama-3.3-70b-versatile")
+
+
+def get_gemini_model() -> str:
+    """Get Gemini model from environment (supports both naming conventions)."""
+    return _get_env_with_fallback("GEMINI_MODEL", "LLM__GEMINI_MODEL", "gemini-2.0-flash")
 
 
 # For backward compatibility (but prefer using get_default_llm_provider())
@@ -53,15 +87,15 @@ def get_chat_model(
     if provider == "groq":
         from langchain_groq import ChatGroq
 
-        api_key =
+        api_key = get_groq_api_key()
         if not api_key:
             raise ValueError(
                 "GROQ_API_KEY not found in environment.\n"
                 "Get your FREE API key at: https://console.groq.com/keys"
             )
 
-        #
-        model = model or
+        # Use model from environment or default
+        model = model or get_groq_model()
 
         return ChatGroq(
             model=model,
@@ -73,15 +107,15 @@ def get_chat_model(
     elif provider == "gemini":
         from langchain_google_genai import ChatGoogleGenerativeAI
 
-        api_key =
+        api_key = get_google_api_key()
         if not api_key:
             raise ValueError(
                 "GOOGLE_API_KEY not found in environment.\n"
                 "Get your FREE API key at: https://aistudio.google.com/app/apikey"
            )
 
-        #
-        model = model or
+        # Use model from environment or default
+        model = model or get_gemini_model()
 
         return ChatGoogleGenerativeAI(
             model=model,
@@ -108,22 +142,47 @@ def get_chat_model(
     raise ValueError(f"Unknown provider: {provider}. Use 'groq', 'gemini', or 'ollama'")
 
 
-def
+def get_embedding_provider() -> str:
+    """Get embedding provider from environment (supports both naming conventions)."""
+    return _get_env_with_fallback("EMBEDDING_PROVIDER", "EMBEDDING__PROVIDER", "huggingface")
+
+
+def get_embedding_model(provider: Optional[Literal["jina", "google", "huggingface", "ollama"]] = None):
     """
     Get embedding model for vector search.
 
     Args:
-        provider: "
+        provider: "jina" (high-quality), "google" (free), "huggingface" (local), or "ollama" (local)
 
     Returns:
         LangChain embedding model instance
+
+    Note:
+        For production use, prefer src.services.embeddings.service.make_embedding_service()
+        which has automatic fallback chain: Jina → Google → HuggingFace.
     """
-    provider = provider or
+    provider = provider or get_embedding_provider()
+
+    if provider == "jina":
+        # Try Jina AI embeddings first (high quality, 1024d)
+        jina_key = _get_env_with_fallback("JINA_API_KEY", "EMBEDDING__JINA_API_KEY", "")
+        if jina_key:
+            try:
+                # Use the embedding service for Jina
+                from src.services.embeddings.service import make_embedding_service
+                return make_embedding_service()
+            except Exception as e:
+                print(f"WARN: Jina embeddings failed: {e}")
+                print("INFO: Falling back to Google embeddings...")
+                return get_embedding_model("google")
+        else:
+            print("WARN: JINA_API_KEY not found. Falling back to Google embeddings.")
+            return get_embedding_model("google")
 
-
+    elif provider == "google":
         from langchain_google_genai import GoogleGenerativeAIEmbeddings
 
-        api_key =
+        api_key = get_google_api_key()
         if not api_key:
             print("WARN: GOOGLE_API_KEY not found. Falling back to HuggingFace embeddings.")
             return get_embedding_model("huggingface")
````