Nikhil Pravin Pise committed
Commit d495234 · 1 Parent(s): f55411e

feat: Enable 100% HF Space capability with dynamic environment variables

- Add env helper functions for both naming conventions (simple and nested)
- Support all embeddings providers: Jina, Google, HuggingFace
- Enable Langfuse observability configuration
- Make LLM models configurable via environment
- Remove hardcoded values - everything is now dynamic
- Update HF README with complete secrets reference
- Add .env.huggingface template with all options
- Update deployment guide with secrets configuration
- Enhance startup logging to show all enabled features
- Tests passing: 4/4 llm_config, 3/3 settings
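
The "env helper functions for both naming conventions" bullet refers to a first-match-wins lookup: the simple name is checked before the pydantic-style nested name. A minimal sketch of that pattern, mirroring the `_get_env_with_fallback` helper this commit adds to `src/llm_config.py` (the standalone function name here is illustrative):

```python
import os

def get_env_with_fallback(primary: str, fallback: str, default: str = "") -> str:
    """Return the first non-empty value among primary, fallback, and default."""
    return os.getenv(primary) or os.getenv(fallback) or default

# The simple name wins when both conventions are set
os.environ["LLM_PROVIDER"] = "groq"
os.environ["LLM__PROVIDER"] = "gemini"
print(get_env_with_fallback("LLM_PROVIDER", "LLM__PROVIDER", "ollama"))  # groq

# The nested (pydantic-style) name is used when the simple one is absent
del os.environ["LLM_PROVIDER"]
print(get_env_with_fallback("LLM_PROVIDER", "LLM__PROVIDER", "ollama"))  # gemini
```

The same resolution order applies to every variable pair in the commit (`GROQ_API_KEY`/`LLM__GROQ_API_KEY`, `EMBEDDING_PROVIDER`/`EMBEDDING__PROVIDER`, and so on).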

DEPLOY_HUGGINGFACE.md CHANGED
@@ -65,15 +65,33 @@ mv README.md README_original.md
  cp huggingface/README.md ./README.md
  ```
 
- ## Step 6: Add Your API Key (Secret)
+ ## Step 6: Add Your API Keys (Secrets)
 
  1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/mediguard-ai`
  2. Click **Settings** tab
  3. Scroll to **Repository Secrets**
- 4. Add a new secret:
-    - **Name**: `GROQ_API_KEY` (or `GOOGLE_API_KEY`)
-    - **Value**: Your API key
- 5. Click **Add**
+
+ ### Required Secrets (pick one)
+
+ | Secret | Description | Get Free Key |
+ |--------|-------------|--------------|
+ | `GROQ_API_KEY` | Groq API key (recommended) | [console.groq.com/keys](https://console.groq.com/keys) |
+ | `GOOGLE_API_KEY` | Google Gemini API key | [aistudio.google.com](https://aistudio.google.com/app/apikey) |
+
+ ### Optional Secrets
+
+ | Secret | Description | Default |
+ |--------|-------------|---------|
+ | `GROQ_MODEL` | Groq model to use | `llama-3.3-70b-versatile` |
+ | `GEMINI_MODEL` | Gemini model to use | `gemini-2.0-flash` |
+ | `EMBEDDING_PROVIDER` | Embedding provider: `jina`, `google`, `huggingface` | `huggingface` |
+ | `JINA_API_KEY` | Jina AI API key for high-quality embeddings | - |
+ | `LANGFUSE_ENABLED` | Enable Langfuse tracing (`true`/`false`) | `false` |
+ | `LANGFUSE_PUBLIC_KEY` | Langfuse public key | - |
+ | `LANGFUSE_SECRET_KEY` | Langfuse secret key | - |
+ | `LANGFUSE_HOST` | Langfuse host URL | - |
+
+ > **Tip**: See `huggingface/.env.huggingface` for a complete reference of all available secrets.
 
  ## Step 7: Push to Deploy
huggingface/.env.huggingface ADDED
@@ -0,0 +1,69 @@
+ # ===========================================================================
+ # MediGuard AI — HuggingFace Spaces Secrets Reference
+ # ===========================================================================
+ # Copy these to your HuggingFace Space Settings → Secrets
+ # ===========================================================================
+
+ # ===========================================================================
+ # REQUIRED: LLM API Key (choose one)
+ # ===========================================================================
+
+ # Option 1: Groq (RECOMMENDED - fast, free)
+ # Get key at: https://console.groq.com/keys
+ GROQ_API_KEY=your_groq_api_key_here
+
+ # Option 2: Google Gemini (free tier)
+ # Get key at: https://aistudio.google.com/app/apikey
+ # GOOGLE_API_KEY=your_google_api_key_here
+
+ # ===========================================================================
+ # OPTIONAL: LLM Model Configuration
+ # ===========================================================================
+
+ # Groq model (default: llama-3.3-70b-versatile)
+ # Options: llama-3.3-70b-versatile, llama-3.1-8b-instant, mixtral-8x7b-32768
+ # GROQ_MODEL=llama-3.3-70b-versatile
+
+ # Gemini model (default: gemini-2.0-flash)
+ # Options: gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash
+ # GEMINI_MODEL=gemini-2.0-flash
+
+ # Force specific provider (auto-detected from keys if not set)
+ # LLM_PROVIDER=groq
+
+ # ===========================================================================
+ # OPTIONAL: Embeddings Configuration
+ # ===========================================================================
+
+ # Embedding provider (default: huggingface - local, no API needed)
+ # Options: jina (high-quality 1024d), google, huggingface
+ # EMBEDDING_PROVIDER=huggingface
+
+ # Jina AI API key for high-quality embeddings
+ # Get key at: https://jina.ai/ (free tier available)
+ # JINA_API_KEY=your_jina_api_key_here
+
+ # ===========================================================================
+ # OPTIONAL: Observability (Langfuse)
+ # ===========================================================================
+
+ # Enable Langfuse tracing (default: false)
+ # LANGFUSE_ENABLED=true
+
+ # Langfuse credentials (required if LANGFUSE_ENABLED=true)
+ # Get at: https://cloud.langfuse.com/
+ # LANGFUSE_PUBLIC_KEY=pk-lf-xxx
+ # LANGFUSE_SECRET_KEY=sk-lf-xxx
+ # LANGFUSE_HOST=https://cloud.langfuse.com
+
+ # ===========================================================================
+ # Notes:
+ # ===========================================================================
+ #
+ # 1. At minimum, you need either GROQ_API_KEY or GOOGLE_API_KEY
+ # 2. Groq is recommended for best speed/quality balance (free tier)
+ # 3. HuggingFace embeddings run locally - no API key needed (default)
+ # 4. Jina embeddings are higher quality but require API key
+ # 5. Langfuse provides observability for debugging and monitoring
+ #
+ # ===========================================================================
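
`LANGFUSE_ENABLED` arrives from the environment as a string, not a boolean; the commit's `is_langfuse_enabled` helper in `huggingface/app.py` treats `true`, `1`, or `yes` (case-insensitively) as enabled. A stdlib-only sketch of that same truthiness rule:

```python
import os

def flag_enabled(name: str, default: str = "false") -> bool:
    """Interpret a string env var as a boolean flag (true/1/yes => enabled)."""
    return os.getenv(name, default).lower() in ("true", "1", "yes")

os.environ["LANGFUSE_ENABLED"] = "True"
print(flag_enabled("LANGFUSE_ENABLED"))  # True

os.environ["LANGFUSE_ENABLED"] = "0"
print(flag_enabled("LANGFUSE_ENABLED"))  # False
```

Anything outside the accepted set (including an unset variable) counts as disabled, which keeps the default-off behavior documented in the template above.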
huggingface/Dockerfile CHANGED
@@ -20,9 +20,13 @@ ENV PYTHONDONTWRITEBYTECODE=1 \
  ENV GRADIO_SERVER_NAME="0.0.0.0" \
      GRADIO_SERVER_PORT=7860
 
- # Default to HuggingFace embeddings (local, no API key needed)
+ # Default embedding provider (can be overridden by HF Secrets)
+ # Options: huggingface (local, no key needed), google, jina
  ENV EMBEDDING_PROVIDER=huggingface
 
+ # Disable HF hub implicit token warning
+ ENV HF_HUB_DISABLE_IMPLICIT_TOKEN=1
+
  WORKDIR /app
 
  # System dependencies
huggingface/README.md CHANGED
@@ -44,12 +44,38 @@ A production-ready **Multi-Agent RAG System** that analyzes blood test biomarker
 
  ## 🔧 Configuration
 
- This Space requires an LLM API key. Add one of these secrets in Space Settings:
+ This Space requires at least one LLM API key. Configure secrets in **Space Settings → Secrets**.
 
- | Secret | Provider | Get Free Key |
- |--------|----------|--------------|
- | `GROQ_API_KEY` | Groq (recommended) | [console.groq.com/keys](https://console.groq.com/keys) |
- | `GOOGLE_API_KEY` | Google Gemini | [aistudio.google.com](https://aistudio.google.com/app/apikey) |
+ ### Required Secrets (pick one)
+
+ | Secret | Provider | Description | Get Free Key |
+ |--------|----------|-------------|--------------|
+ | `GROQ_API_KEY` | Groq | **Recommended** - Fast, free LLaMA 3.3-70B | [console.groq.com/keys](https://console.groq.com/keys) |
+ | `GOOGLE_API_KEY` | Google Gemini | Free Gemini 2.0 Flash | [aistudio.google.com](https://aistudio.google.com/app/apikey) |
+
+ ### Optional: LLM Configuration
+
+ | Secret | Default | Description |
+ |--------|---------|-------------|
+ | `GROQ_MODEL` | `llama-3.3-70b-versatile` | Groq model to use |
+ | `GEMINI_MODEL` | `gemini-2.0-flash` | Gemini model to use |
+ | `LLM_PROVIDER` | auto-detected | Force provider: `groq` or `gemini` |
+
+ ### Optional: Embeddings
+
+ | Secret | Default | Description |
+ |--------|---------|-------------|
+ | `EMBEDDING_PROVIDER` | `huggingface` | Provider: `jina`, `google`, or `huggingface` |
+ | `JINA_API_KEY` | - | High-quality 1024d embeddings ([jina.ai](https://jina.ai/)) |
+
+ ### Optional: Observability (Langfuse)
+
+ | Secret | Description |
+ |--------|-------------|
+ | `LANGFUSE_ENABLED` | Set to `true` to enable tracing |
+ | `LANGFUSE_PUBLIC_KEY` | Langfuse public key |
+ | `LANGFUSE_SECRET_KEY` | Langfuse secret key |
+ | `LANGFUSE_HOST` | Langfuse host URL (e.g., `https://cloud.langfuse.com`) |
 
  ## 🏗️ Architecture
huggingface/app.py CHANGED
@@ -4,7 +4,28 @@ MediGuard AI — Hugging Face Spaces Gradio App
  Standalone deployment that uses:
  - FAISS vector store (local)
  - Cloud LLMs (Groq or Gemini - FREE tiers)
- - No external services required
+ - Multiple embedding providers (Jina, Google, HuggingFace)
+ - Optional Langfuse observability
+
+ Environment Variables (HuggingFace Secrets):
+     Required (pick one):
+     - GROQ_API_KEY: Groq API key (recommended, free)
+     - GOOGLE_API_KEY: Google Gemini API key (free)
+
+     Optional - LLM Configuration:
+     - LLM_PROVIDER: "groq" or "gemini" (auto-detected from keys)
+     - GROQ_MODEL: Model name (default: llama-3.3-70b-versatile)
+     - GEMINI_MODEL: Model name (default: gemini-2.0-flash)
+
+     Optional - Embeddings:
+     - EMBEDDING_PROVIDER: "jina", "google", or "huggingface" (default: huggingface)
+     - JINA_API_KEY: Jina AI API key for high-quality embeddings
+
+     Optional - Observability:
+     - LANGFUSE_ENABLED: "true" to enable tracing
+     - LANGFUSE_PUBLIC_KEY: Langfuse public key
+     - LANGFUSE_SECRET_KEY: Langfuse secret key
+     - LANGFUSE_HOST: Langfuse host URL
  """
 
  from __future__ import annotations
@@ -33,37 +54,122 @@ logging.basicConfig(
  logger = logging.getLogger("mediguard.huggingface")
 
  # ---------------------------------------------------------------------------
- # Configuration
+ # Configuration - Environment Variable Helpers
  # ---------------------------------------------------------------------------
 
+ def _get_env(primary: str, *fallbacks, default: str = "") -> str:
+     """Get env var with multiple fallback names for compatibility."""
+     value = os.getenv(primary)
+     if value:
+         return value
+     for fb in fallbacks:
+         value = os.getenv(fb)
+         if value:
+             return value
+     return default
+
+
  def get_api_keys():
-     """Get API keys dynamically (HuggingFace injects secrets after module load)."""
-     groq_key = os.getenv("GROQ_API_KEY", "")
-     google_key = os.getenv("GOOGLE_API_KEY", "")
+     """Get API keys dynamically (HuggingFace injects secrets after module load).
+
+     Supports both simple and nested naming conventions:
+     - GROQ_API_KEY / LLM__GROQ_API_KEY
+     - GOOGLE_API_KEY / LLM__GOOGLE_API_KEY
+     """
+     groq_key = _get_env("GROQ_API_KEY", "LLM__GROQ_API_KEY")
+     google_key = _get_env("GOOGLE_API_KEY", "LLM__GOOGLE_API_KEY")
      return groq_key, google_key
 
 
+ def get_jina_api_key() -> str:
+     """Get Jina API key for embeddings."""
+     return _get_env("JINA_API_KEY", "EMBEDDING__JINA_API_KEY")
+
+
+ def get_embedding_provider() -> str:
+     """Get configured embedding provider."""
+     return _get_env("EMBEDDING_PROVIDER", "EMBEDDING__PROVIDER", default="huggingface")
+
+
+ def get_groq_model() -> str:
+     """Get configured Groq model name."""
+     return _get_env("GROQ_MODEL", "LLM__GROQ_MODEL", default="llama-3.3-70b-versatile")
+
+
+ def get_gemini_model() -> str:
+     """Get configured Gemini model name."""
+     return _get_env("GEMINI_MODEL", "LLM__GEMINI_MODEL", default="gemini-2.0-flash")
+
+
+ def is_langfuse_enabled() -> bool:
+     """Check if Langfuse observability is enabled."""
+     enabled = _get_env("LANGFUSE_ENABLED", "LANGFUSE__ENABLED", default="false")
+     return enabled.lower() in ("true", "1", "yes")
+
+
  def setup_llm_provider():
-     """Set LLM provider based on available keys."""
+     """Set up LLM provider and related configuration based on available keys.
+
+     Sets environment variables for the entire application to use.
+     """
      groq_key, google_key = get_api_keys()
+     provider = None
 
      if groq_key:
          os.environ["LLM_PROVIDER"] = "groq"
-         os.environ["GROQ_API_KEY"] = groq_key  # Ensure it's set
-         return "groq"
+         os.environ["GROQ_API_KEY"] = groq_key
+         os.environ["GROQ_MODEL"] = get_groq_model()
+         provider = "groq"
+         logger.info(f"Configured Groq provider with model: {get_groq_model()}")
      elif google_key:
          os.environ["LLM_PROVIDER"] = "gemini"
          os.environ["GOOGLE_API_KEY"] = google_key
-         return "gemini"
-     return None
+         os.environ["GEMINI_MODEL"] = get_gemini_model()
+         provider = "gemini"
+         logger.info(f"Configured Gemini provider with model: {get_gemini_model()}")
+
+     # Set up embedding provider
+     embedding_provider = get_embedding_provider()
+     os.environ["EMBEDDING_PROVIDER"] = embedding_provider
+
+     # If Jina is configured, set the API key
+     jina_key = get_jina_api_key()
+     if jina_key:
+         os.environ["JINA_API_KEY"] = jina_key
+         os.environ["EMBEDDING__JINA_API_KEY"] = jina_key
+         logger.info("Jina embeddings configured")
+
+     # Set up Langfuse if enabled
+     if is_langfuse_enabled():
+         os.environ["LANGFUSE__ENABLED"] = "true"
+         for var in ["LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_HOST"]:
+             val = _get_env(var, f"LANGFUSE__{var.split('_', 1)[1]}")
+             if val:
+                 os.environ[var] = val
+         logger.info("Langfuse observability enabled")
+
+     return provider
 
 
  # Log status at startup (keys may not be available yet)
  _groq, _google = get_api_keys()
+ _jina = get_jina_api_key()
+ logger.info("=" * 60)
+ logger.info("MediGuard AI — HuggingFace Space Starting")
+ logger.info("=" * 60)
+ logger.info(f"GROQ_API_KEY: {'✓ configured' if _groq else '✗ not set'}")
+ logger.info(f"GOOGLE_API_KEY: {'✓ configured' if _google else '✗ not set'}")
+ logger.info(f"JINA_API_KEY: {'✓ configured' if _jina else '✗ not set (using HuggingFace embeddings)'}")
+ logger.info(f"EMBEDDING_PROVIDER: {get_embedding_provider()}")
+ logger.info(f"LANGFUSE: {'✓ enabled' if is_langfuse_enabled() else '✗ disabled'}")
+
  if not _groq and not _google:
      logger.warning(
          "No LLM API key found at startup. Will check again when analyzing."
      )
+ else:
+     logger.info("LLM API key available — ready for analysis")
+ logger.info("=" * 60)
 
 
  # ---------------------------------------------------------------------------
@@ -103,9 +209,11 @@ def get_guild():
 
      try:
          logger.info("Initializing Clinical Insight Guild...")
-         logger.info(f"LLM_PROVIDER={os.getenv('LLM_PROVIDER')}")
-         logger.info(f"GROQ_API_KEY={'set' if os.getenv('GROQ_API_KEY') else 'NOT SET'}")
-         logger.info(f"GOOGLE_API_KEY={'set' if os.getenv('GOOGLE_API_KEY') else 'NOT SET'}")
+         logger.info(f"  LLM_PROVIDER: {os.getenv('LLM_PROVIDER', 'not set')}")
+         logger.info(f"  GROQ_API_KEY: {'set' if os.getenv('GROQ_API_KEY') else 'not set'}")
+         logger.info(f"  GOOGLE_API_KEY: {'set' if os.getenv('GOOGLE_API_KEY') else 'not set'}")
+         logger.info(f"  EMBEDDING_PROVIDER: {os.getenv('EMBEDDING_PROVIDER', 'huggingface')}")
+         logger.info(f"  JINA_API_KEY: {'✓ set' if os.getenv('JINA_API_KEY') else '✗ not set'}")
 
          start = time.time()
@@ -191,10 +299,25 @@ def analyze_biomarkers(input_text: str, progress=gr.Progress()) -> tuple[str, st
      <div style="background: linear-gradient(135deg, #fee2e2 0%, #fecaca 100%); border: 1px solid #ef4444; border-radius: 10px; padding: 16px;">
          <strong style="color: #dc2626;">❌ No API Key Configured</strong>
          <p style="margin: 12px 0 8px 0; color: #991b1b;">Please add your API key in Space Settings → Secrets:</p>
-         <ul style="margin: 0; color: #7f1d1d;">
-             <li><code>GROQ_API_KEY</code> - <a href="https://console.groq.com/keys" target="_blank" style="color: #2563eb;">Get free key →</a></li>
-             <li><code>GOOGLE_API_KEY</code> - <a href="https://aistudio.google.com/app/apikey" target="_blank" style="color: #2563eb;">Get free key →</a></li>
-         </ul>
+         <div style="margin: 12px 0;">
+             <strong style="color: #374151;">Required (pick one):</strong>
+             <ul style="margin: 4px 0; color: #7f1d1d;">
+                 <li><code>GROQ_API_KEY</code> - <a href="https://console.groq.com/keys" target="_blank" style="color: #2563eb;">Get free key →</a> (Recommended)</li>
+                 <li><code>GOOGLE_API_KEY</code> - <a href="https://aistudio.google.com/app/apikey" target="_blank" style="color: #2563eb;">Get free key →</a></li>
+             </ul>
+         </div>
+
+         <details style="margin-top: 12px;">
+             <summary style="cursor: pointer; color: #374151; font-weight: 600;">Optional configuration secrets</summary>
+             <ul style="margin: 8px 0; color: #6b7280; font-size: 0.9em;">
+                 <li><code>GROQ_MODEL</code> - Model name (default: llama-3.3-70b-versatile)</li>
+                 <li><code>GEMINI_MODEL</code> - Model name (default: gemini-2.0-flash)</li>
+                 <li><code>JINA_API_KEY</code> - High-quality embeddings (optional)</li>
+                 <li><code>EMBEDDING_PROVIDER</code> - jina, google, or huggingface</li>
+                 <li><code>LANGFUSE_ENABLED</code> - Enable observability tracing</li>
+             </ul>
+         </details>
      </div>
      """
@@ -837,6 +960,11 @@ def create_demo() -> gr.Blocks:
              <strong>Setup Required:</strong> Add your <code>GROQ_API_KEY</code> or
              <code>GOOGLE_API_KEY</code> in Space Settings → Secrets to enable analysis.
              <a href="https://console.groq.com/keys" target="_blank" style="color: #2563eb;">Get free Groq key →</a>
+             <br>
+             <span style="font-size: 0.9em; color: #64748b;">
+                 Optional: Configure <code>JINA_API_KEY</code> for high-quality embeddings,
+                 <code>LANGFUSE_ENABLED=true</code> for observability.
+             </span>
          </div>
      </div>
      """)
@@ -999,7 +1127,10 @@ def create_demo() -> gr.Blocks:
          <a href="https://faiss.ai/" target="_blank" style="color: #3b82f6;">FAISS</a>, and
          <a href="https://gradio.app/" target="_blank" style="color: #3b82f6;">Gradio</a>
      </p>
-     <p style="margin-top: 8px;">Powered by <strong>Groq</strong> (LLaMA 3.3-70B) • Open Source on GitHub</p>
+     <p style="margin-top: 8px;">
+         Powered by <strong>Groq</strong> or <strong>Google Gemini</strong> •
+         <a href="https://github.com" target="_blank" style="color: #3b82f6;">Open Source on GitHub</a>
+     </p>
      </div>
      """)
huggingface/requirements.txt CHANGED
@@ -33,10 +33,13 @@ pypdf>=4.0.0
  pydantic>=2.9.0
  pydantic-settings>=2.5.0
 
- # --- HTTP Client ---
+ # --- HTTP Client (for Jina AI embeddings) ---
  httpx>=0.27.0
 
  # --- Utilities ---
  python-dotenv>=1.0.0
  tenacity>=8.0.0
  numpy<2.0.0
+
+ # --- Observability (optional, for Langfuse support) ---
+ langfuse>=2.0.0
src/llm_config.py CHANGED
@@ -6,6 +6,10 @@ Supports multiple providers:
  - Groq (FREE, fast, llama-3.3-70b) - RECOMMENDED
  - Google Gemini (FREE tier)
  - Ollama (local, for offline use)
+
+ Environment Variables (supports both naming conventions):
+ - Simple: GROQ_API_KEY, GOOGLE_API_KEY, LLM_PROVIDER, GROQ_MODEL, etc.
+ - Nested: LLM__GROQ_API_KEY, LLM__GOOGLE_API_KEY, LLM__PROVIDER, etc.
  """
 
  import os
@@ -20,9 +24,39 @@ load_dotenv()
  os.environ["LANGCHAIN_PROJECT"] = os.getenv("LANGCHAIN_PROJECT", "MediGuard_AI_RAG_Helper")
 
 
+ def _get_env_with_fallback(primary: str, fallback: str, default: str = "") -> str:
+     """Get env var with fallback to alternate naming convention."""
+     return os.getenv(primary) or os.getenv(fallback) or default
+
+
  def get_default_llm_provider() -> str:
-     """Get default LLM provider dynamically from environment."""
-     return os.getenv("LLM_PROVIDER", "groq")
+     """Get default LLM provider dynamically from environment.
+
+     Supports both naming conventions:
+     - LLM_PROVIDER (simple)
+     - LLM__PROVIDER (pydantic nested)
+     """
+     return _get_env_with_fallback("LLM_PROVIDER", "LLM__PROVIDER", "groq")
+
+
+ def get_groq_api_key() -> str:
+     """Get Groq API key from environment (supports both naming conventions)."""
+     return _get_env_with_fallback("GROQ_API_KEY", "LLM__GROQ_API_KEY", "")
+
+
+ def get_google_api_key() -> str:
+     """Get Google API key from environment (supports both naming conventions)."""
+     return _get_env_with_fallback("GOOGLE_API_KEY", "LLM__GOOGLE_API_KEY", "")
+
+
+ def get_groq_model() -> str:
+     """Get Groq model from environment (supports both naming conventions)."""
+     return _get_env_with_fallback("GROQ_MODEL", "LLM__GROQ_MODEL", "llama-3.3-70b-versatile")
+
+
+ def get_gemini_model() -> str:
+     """Get Gemini model from environment (supports both naming conventions)."""
+     return _get_env_with_fallback("GEMINI_MODEL", "LLM__GEMINI_MODEL", "gemini-2.0-flash")
 
 
  # For backward compatibility (but prefer using get_default_llm_provider())
@@ -53,15 +87,15 @@ def get_chat_model(
      if provider == "groq":
          from langchain_groq import ChatGroq
 
-         api_key = os.getenv("GROQ_API_KEY")
+         api_key = get_groq_api_key()
          if not api_key:
              raise ValueError(
                  "GROQ_API_KEY not found in environment.\n"
                  "Get your FREE API key at: https://console.groq.com/keys"
              )
 
-         # Default to llama-3.3-70b for best quality (free on Groq)
-         model = model or "llama-3.3-70b-versatile"
+         # Use model from environment or default
+         model = model or get_groq_model()
 
          return ChatGroq(
              model=model,
@@ -73,15 +107,15 @@ def get_chat_model(
      elif provider == "gemini":
          from langchain_google_genai import ChatGoogleGenerativeAI
 
-         api_key = os.getenv("GOOGLE_API_KEY")
+         api_key = get_google_api_key()
          if not api_key:
              raise ValueError(
                  "GOOGLE_API_KEY not found in environment.\n"
                  "Get your FREE API key at: https://aistudio.google.com/app/apikey"
              )
 
-         # Default to Gemini 2.0 Flash (fast and free)
-         model = model or "gemini-2.0-flash"
+         # Use model from environment or default
+         model = model or get_gemini_model()
 
          return ChatGoogleGenerativeAI(
              model=model,
@@ -108,22 +142,47 @@ def get_chat_model(
      raise ValueError(f"Unknown provider: {provider}. Use 'groq', 'gemini', or 'ollama'")
 
 
+ def get_embedding_provider() -> str:
+     """Get embedding provider from environment (supports both naming conventions)."""
+     return _get_env_with_fallback("EMBEDDING_PROVIDER", "EMBEDDING__PROVIDER", "huggingface")
+
+
- def get_embedding_model(provider: Optional[Literal["google", "huggingface", "ollama"]] = None):
+ def get_embedding_model(provider: Optional[Literal["jina", "google", "huggingface", "ollama"]] = None):
      """
      Get embedding model for vector search.
 
      Args:
-         provider: "google" (free, recommended), "huggingface" (local), or "ollama" (local)
+         provider: "jina" (high-quality), "google" (free), "huggingface" (local), or "ollama" (local)
 
      Returns:
          LangChain embedding model instance
+
+     Note:
+         For production use, prefer src.services.embeddings.service.make_embedding_service()
+         which has automatic fallback chain: Jina → Google → HuggingFace.
      """
-     provider = provider or os.getenv("EMBEDDING_PROVIDER", "google")
+     provider = provider or get_embedding_provider()
+
+     if provider == "jina":
+         # Try Jina AI embeddings first (high quality, 1024d)
+         jina_key = _get_env_with_fallback("JINA_API_KEY", "EMBEDDING__JINA_API_KEY", "")
+         if jina_key:
+             try:
+                 # Use the embedding service for Jina
+                 from src.services.embeddings.service import make_embedding_service
+                 return make_embedding_service()
+             except Exception as e:
+                 print(f"WARN: Jina embeddings failed: {e}")
+                 print("INFO: Falling back to Google embeddings...")
+                 return get_embedding_model("google")
+         else:
+             print("WARN: JINA_API_KEY not found. Falling back to Google embeddings.")
+             return get_embedding_model("google")
 
-     if provider == "google":
+     elif provider == "google":
          from langchain_google_genai import GoogleGenerativeAIEmbeddings
 
-         api_key = os.getenv("GOOGLE_API_KEY")
+         api_key = get_google_api_key()
          if not api_key:
              print("WARN: GOOGLE_API_KEY not found. Falling back to HuggingFace embeddings.")
              return get_embedding_model("huggingface")
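
The embedding fallback chain in `get_embedding_model` (jina → google → huggingface) can be traced without any network calls; a stdlib-only sketch of the same cascade logic (the `resolve_provider` helper and its tables are illustrative, not part of the commit):

```python
import os

# Each keyed provider falls back to the next one when its key is missing;
# "huggingface" is the terminal, key-free local default.
FALLBACK = {"jina": "google", "google": "huggingface"}
REQUIRED_KEY = {"jina": "JINA_API_KEY", "google": "GOOGLE_API_KEY"}

def resolve_provider(provider: str) -> str:
    """Walk the fallback chain until a provider's key requirement is met."""
    while provider in REQUIRED_KEY and not os.getenv(REQUIRED_KEY[provider]):
        provider = FALLBACK[provider]
    return provider

# With no API keys set, "jina" cascades all the way to the local default
os.environ.pop("JINA_API_KEY", None)
os.environ.pop("GOOGLE_API_KEY", None)
print(resolve_provider("jina"))  # huggingface
```

This is why the Space works out of the box with zero secrets for embeddings: every path terminates at the local HuggingFace model.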