Laborator committed
Commit d0abc67 · 1 Parent(s): 40d108c

Restore 3-panel layout: vanilla / brief / rich with microlens-final LoRA


Reattach the fine-tuned LoRA (now Laborator/microlens-final, single adapter
instead of v2/v3 subfolders) and rebuild the 3-result-panel UI:

- UNTRAINED BASELINE: stock Gemma 4 E2B with adapter disabled
- MICROLENS BRIEF: base + LoRA + BRIEF_PROMPT (one-sentence answer)
- MICROLENS RICH: base + LoRA + RICH_PROMPT (full schema)

Brief and rich share the same microlens-final adapter and differ only in the
prompt and the max_new_tokens budget (96 vs 512).
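The panel settings described above can be summarized as a small table. This is a minimal sketch, assuming a hypothetical `PanelConfig` dataclass and `lora_panels` helper that do not exist in app.py, where the values are kept inline. The panel keys, titles, prompt names, and token budgets come from this commit; the 256-token cap for the baseline is taken from the `_zerogpu_infer_all` hunk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelConfig:
    """Hypothetical container for one result panel's settings."""
    title: str
    use_lora: bool        # False = adapter disabled (stock base model)
    prompt_name: str      # name of the prompt constant used for this panel
    max_new_tokens: int   # generation budget for this panel


# Values as stated in the commit message and the app.py diff.
PANELS = {
    "vanilla": PanelConfig("UNTRAINED BASELINE", False, "INFERENCE_PROMPT", 256),
    "v2": PanelConfig("MICROLENS · BRIEF", True, "BRIEF_PROMPT", 96),
    "v3": PanelConfig("MICROLENS · RICH", True, "RICH_PROMPT", 512),
}


def lora_panels(panels):
    """Return the keys of the panels that run with the LoRA enabled."""
    return [key for key, cfg in panels.items() if cfg.use_lora]
```

Keeping the three panels in one mapping makes it easy to see that brief and rich differ only in prompt and budget, while the baseline differs only in the adapter being off.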

- _HF_LORA_REPO → Laborator/microlens-final
- _zerogpu_infer_all: per-version prompt, single adapter swap
- PANEL_THEMES["v2"/"v3"]: rename titles to MICROLENS BRIEF/RICH, update
subtitle to reflect 95 genera / single-LoRA architecture
- APK QR fallback → GitHub repo
- README: drop base-only Status, add models: Laborator/microlens-final,
switch genus claim to 95 (diatoms + fungal spores)
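The "single adapter swap" in `_zerogpu_infer_all` amounts to toggling one adapter off for the baseline and back on for the two MicroLens panels. The sketch below illustrates only that ordering. `StubModel` is a hypothetical stand-in that records calls and is not the real PEFT model; the method names mirror the calls that appear in the app.py diff, and the adapter name "microlens" is the one used there.

```python
class StubModel:
    """Hypothetical stand-in: records adapter calls instead of running a model."""

    def __init__(self):
        self.calls = []
        self.adapter_on = True  # assume the adapter is active right after loading

    def disable_adapter_layers(self):
        self.adapter_on = False
        self.calls.append("disable")

    def enable_adapter_layers(self):
        self.adapter_on = True
        self.calls.append("enable")

    def set_adapter(self, name):
        self.calls.append(f"set:{name}")

    def generate(self):
        # Report which weights would have produced the answer.
        return "lora" if self.adapter_on else "base"


def infer_all(model):
    """Run the three panels in order, toggling the single adapter as needed."""
    results = {}
    for version in ("vanilla", "v2", "v3"):
        if version == "vanilla":
            model.disable_adapter_layers()
        else:
            model.enable_adapter_layers()
            model.set_adapter("microlens")
        results[version] = model.generate()
    return results
```

Because the adapter is re-enabled explicitly for every non-baseline panel, the loop does not depend on the order in which the versions are visited.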

Files changed (2)
  1. README.md +10 -9
  2. app.py +226 -68
README.md CHANGED
@@ -8,7 +8,9 @@ sdk_version: 5.7.1
  app_file: app.py
  pinned: false
  license: apache-2.0
- short_description: Microscopy AI · base Gemma 4 E2B · LoRA in training
+ short_description: Gemma 4 E2B + LoRA · diatoms & fungal spores · 95 genera
+ models:
+ - Laborator/microlens-final
  tags:
  - microscopy
  - biology
@@ -21,25 +23,23 @@ tags:

  # 🔬 MicroLens — Microscopy AI Demo

- Live demo of **MicroLens**: a microscopy-focused vision-language interface built on **Gemma 4 E2B**.
+ Live demo of **MicroLens**: a fine-tuned **Gemma 4 E2B** vision-language model that identifies microscopic subjects across **2 categories** (diatoms and fungal spores) and **95 genera**.

  Built by **Serghei Brinza** (Vienna, Austria) for the **Kaggle Gemma 4 Good Hackathon** (May 2026).

  ---

- ## Status
-
- ⚠️ Currently running on stock **Gemma 4 E2B** only — the fine-tuned LoRA is in training (expected restoration after 2026-05-17). Accuracy on microscopy genera will be limited until the LoRA is loaded.
-
- ---
-
  ## 🚀 How to use

  1. Upload a microscopy image (or pick one from the examples gallery).
  2. Optionally add a question, for example: "What is shown here?", "Describe the morphology", "What category does this belong to?"
  3. Click **AI Analyze**.

- The model returns a natural-language description of the image. While the fine-tune is in training the base model gives a generic vision answer rather than genus-level identification.
+ You get three side-by-side answers:
+
+ - **UNTRAINED BASELINE** — stock Gemma 4 E2B with no microscopy training.
+ - **MICROLENS · BRIEF** — same base + `Laborator/microlens-final` LoRA, prompted for a single-sentence genus answer.
+ - **MICROLENS · RICH** — same base + same LoRA, prompted for the full schema (genus + morphology + habitat + identification cues).

  ---

@@ -64,6 +64,7 @@ For any vertical that requires regulatory clearance (medicine, veterinary, foren

  ## 🔗 Links

+ - **Model:** [Laborator/microlens-final](https://huggingface.co/Laborator/microlens-final)
  - **Source code:** [SergheiBrinza/microlens](https://github.com/SergheiBrinza/microlens)
  - **Hackathon writeup:** [KAGGLE_WRITEUP.md](https://github.com/SergheiBrinza/microlens/blob/main/KAGGLE_WRITEUP.md)
  - **Roadmap:** [ROADMAP.md](https://github.com/SergheiBrinza/microlens/blob/main/ROADMAP.md)
app.py CHANGED
@@ -8,14 +8,17 @@ Layout:
8
  control panel (mode-dependent: 5 categories × 6 thumbs / upload zone /
9
  camera enumeration)
10
  - AI ANALYZE long oval cyan→red gradient button
 
11
  - Translate row with 28 languages (English default) + ORIGINAL button after
12
  translation
13
  - Footer with run-locally + APK + Legal links
14
 
15
- SAMPLES tab uses cached vanilla answers from catalog.json.
16
- UPLOAD / MICROSCOPE tabs run LIVE inference against the vanilla backend URL:
17
  URL_VANILLA (default http://127.0.0.1:8085/v1/chat/completions)
18
- On HF Space deployment configure this as a Variable to point at a public tunnel
 
 
19
  (e.g. Cloudflare → llama-server). When unreachable the panel shows a clean
20
  "backend unavailable" message instead of crashing.
21
  """
@@ -86,17 +89,22 @@ CATALOG: List[Dict] = json.loads(CATALOG_PATH.read_text())
86
  BY_FILENAME = {s["filename"]: s for s in CATALOG}
87
 
88
  URL_VANILLA = os.environ.get("URL_VANILLA", "http://127.0.0.1:8085/v1/chat/completions")
 
 
89
  INFERENCE_PROMPT = "What is shown in this microscope image?"
 
 
 
90
 
91
  # ─────────────────────────────────────────────────────────────────────────────
92
  # ZeroGPU runtime: when running on HF Space we replace HTTP llama-server calls
93
- # with in-process transformers inference on H200 against the stock Gemma 4 E2B
94
- # base model. The fine-tuned LoRA adapters are currently in training and not
95
- # attached. Outside HF Space (local dev) the original HTTP path is preserved.
96
  # ─────────────────────────────────────────────────────────────────────────────
97
  IS_HF_SPACE = bool(os.environ.get("SPACE_ID"))
98
 
99
  _HF_BASE = "unsloth/gemma-4-E2B-it"
 
100
 
101
  _zerogpu_processor = None
102
  _zerogpu_model = None
@@ -105,23 +113,96 @@ if IS_HF_SPACE:
105
  import spaces
106
  import torch
107
  from transformers import AutoProcessor, AutoModelForImageTextToText
 
108
 
109
  print("[ZeroGPU] loading processor + base model on cuda…", flush=True)
110
  _zerogpu_processor = AutoProcessor.from_pretrained(_HF_BASE)
111
  _zerogpu_model = AutoModelForImageTextToText.from_pretrained(
112
  _HF_BASE, torch_dtype=torch.bfloat16, device_map="cuda",
113
  )
114
- _zerogpu_model.eval()
115
- print("[ZeroGPU] ready (base Gemma 4 E2B only — LoRA in training)", flush=True)
116
 
117
- # ── Single inference path: stock base model, one GPU acquisition per click.
118
- # duration=30: vanilla can ramble up to 20+s on long answers; 30s budget
119
- # leaves headroom without overspending ZeroGPU quota.
120
- @spaces.GPU(duration=30)
121
- def _zerogpu_infer(image_data_uri: str, prompt: str) -> str:
122
  import time as _t
123
  t0 = _t.time()
124
- print(f"[infer] cuda={torch.cuda.is_available()} "
125
  f"dev={torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'cpu'}",
126
  flush=True)
127
  b64 = _strip_data_uri(image_data_uri) if image_data_uri.startswith("data:") else image_data_uri
@@ -129,9 +210,17 @@ if IS_HF_SPACE:
129
  if max(img.size) > 768:
130
  img.thumbnail((768, 768))
131
  print(f"[infer] image {img.size}", flush=True)
 
 
 
 
 
 
 
 
132
  messages = [{"role": "user", "content": [
133
  {"type": "image", "image": img},
134
- {"type": "text", "text": prompt},
135
  ]}]
136
  inputs = _zerogpu_processor.apply_chat_template(
137
  messages, add_generation_prompt=True, tokenize=True,
@@ -144,7 +233,7 @@ if IS_HF_SPACE:
144
  print(f"[infer] inputs ready, t+{_t.time()-t0:.2f}s, generating…", flush=True)
145
  with torch.inference_mode():
146
  out = _zerogpu_model.generate(
147
- **inputs, max_new_tokens=256, do_sample=False,
148
  )
149
  prompt_len = inputs["input_ids"].shape[1]
150
  gen_ids = out[0][prompt_len:]
@@ -153,14 +242,14 @@ if IS_HF_SPACE:
153
  f"text_len={len(text)}, preview={text[:80]!r}", flush=True)
154
  return text.strip()
155
 
 
 
156
 
157
  # ─────────────────────────────────────────────────────────────────────────────
158
  # QR codes for the footer install card. Generated once at module load.
159
  # ─────────────────────────────────────────────────────────────────────────────
 
160
  GITHUB_URL = "https://github.com/SergheiBrinza/microlens"
161
- # APK fallback now points at the GitHub repo (the previous resolve/main URL
162
- # pointed at the deleted Laborator/microlens-gemma4-e2b LoRA repo).
163
- APK_URL = GITHUB_URL
164
 
165
  def _qr_data_uri(data: str, dark: str = "#FFFFFF", light: str = "#000000",
166
  alpha: float = 1.0) -> str:
@@ -235,11 +324,12 @@ def llama_server_call(url: str, image_data_uri: str,
235
  prompt: str = INFERENCE_PROMPT,
236
  timeout: int = 180) -> Tuple[str, Optional[str]]:
237
  """Returns (text, error_or_None).
238
- On HF Space: routes to in-process ZeroGPU inference (transformers).
239
  Locally: OpenAI-compatible call to llama-server (original behavior)."""
240
  if IS_HF_SPACE:
 
241
  try:
242
- return _zerogpu_infer(image_data_uri, prompt), None
243
  except Exception as e:
244
  return "", f"{type(e).__name__}: {str(e)[:240]}"
245
  payload = {
@@ -1141,8 +1231,8 @@ async () => {
1141
 
1142
  PANEL_THEMES = {
1143
  "vanilla": {
1144
- "title": "GEMMA 4 E2B · BASELINE",
1145
- "subtitle": "Stock Gemma 4 E2B · Google factory weights · LoRA fine-tune in training",
1146
  "stripe": "linear-gradient(90deg, #C0C5CC 0%, #7A7E85 100%)",
1147
  "title_grad": "linear-gradient(90deg, #E0E5EC 0%, #9A9EA5 100%)",
1148
  "border": "rgba(180,185,195,0.35)",
@@ -1150,6 +1240,26 @@ PANEL_THEMES = {
1150
  "glow_strong": "0 0 56px rgba(200,205,215,0.28)",
1151
  "subtitle_color": "#9aa0a8",
1152
  },
1153
  }
1154
 
1155
 
@@ -1206,29 +1316,43 @@ def panel_html(kind: str, body: str, state: str = "ready", footer_text: Optional
1206
  """
1207
 
1208
 
1209
- def empty_panels(reason: str = "empty") -> str:
1210
- return panel_html("vanilla", "", state=reason)
 
 
1211
 
1212
 
1213
  def analyse_curated(filename: str, shape: str, grid: int = 0, cross: int = 0):
1214
  import time
1215
  s = BY_FILENAME.get(filename)
1216
  if not s:
1217
- yield viewport_html(None, shape, grid, cross), empty_panels()
1218
  return
1219
  vp = viewport_html(full_uri(filename), shape, grid, cross)
1220
  vanilla_full = s.get("vanilla_answer", "—")
1221
- yield vp, panel_html("vanilla", "", state="typing")
 
 
 
 
 
1222
  step = 8
1223
  delay = 0.040
1224
- for i in range(step, len(vanilla_full) + step, step):
1225
  yield (
1226
  vp,
1227
  panel_html("vanilla", vanilla_full[:min(i, len(vanilla_full))],
1228
  state="typing" if i < len(vanilla_full) else "ready"),
 
 
 
 
1229
  )
1230
  time.sleep(delay)
1231
- yield vp, panel_html("vanilla", vanilla_full)
 
 
 
1232
 
1233
 
1234
  CSS = """
@@ -1560,7 +1684,7 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1560
  font-weight:700; font-size:12px; letter-spacing:3px;
1561
  text-transform:uppercase;">Fine-tune</span>
1562
  <span style="font-family:'Fraunces',serif; font-weight:500;
1563
- color:#fff; font-size:19px; letter-spacing:0.3px;">Unsloth QLoRA &middot; in training</span>
1564
  </span>
1565
  <span style="color:#3a3a3a; font-size:18px;">&middot;</span>
1566
  <span style="display:inline-flex; align-items:baseline; gap:12px;
@@ -1584,8 +1708,8 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1584
  cross_state = gr.Textbox(value="0", elem_id="hidden-cross",
1585
  elem_classes=["ml-hidden"], show_label=False)
1586
  viewport_uri = gr.State(value="")
1587
- # Most recent answer from the panel (any mode) — translate reads from here
1588
- last_answers = gr.State(value={"vanilla": ""})
1589
 
1590
  # Toolbar — full-width above both columns (no empty space in right column)
1591
  mode_buttons = gr.HTML(value=mode_buttons_html(MODE_SAMPLES))
@@ -1650,11 +1774,10 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1650
 
1651
  gr.HTML('<div style="height:28px;"></div>')
1652
 
1653
- with gr.Row(elem_classes=["equal-panels"]):
1654
- gr.HTML('<div></div>')
1655
- with gr.Column(scale=8, min_width=520):
1656
- vanilla_panel = gr.HTML(value=panel_html("vanilla", "", state="empty"))
1657
- gr.HTML('<div></div>')
1658
 
1659
  gr.HTML(f"""
1660
  <div style="margin-top: 32px; padding: 22px 28px;
@@ -1690,7 +1813,7 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1690
  <div style="color:#e4e4e4; font-size:13px; line-height:1.85; font-weight:500;">
1691
  Gemma 4 E2B-it <span style="color:#666;font-weight:400;">&middot;</span> Google DeepMind<br>
1692
  Unsloth FastVisionModel <span style="color:#666;font-weight:400;">&middot;</span> 4-bit QLoRA<br>
1693
- Stock Gemma 4 E2B baseline <span style="color:#666;font-weight:400;">&middot;</span> LoRA fine-tune in training<br>
1694
  llama.cpp + mtmd vision extension
1695
  </div>
1696
  </div>
@@ -1706,7 +1829,7 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1706
  font-size:12.5px; letter-spacing:0.3px;
1707
  border-bottom:1px solid rgba(127,232,227,.40);
1708
  display:inline-block; margin-bottom: 8px;">
1709
- &#x1F999; MicroLens models on Ollama Hub &nbsp;&#8599;</a>
1710
  <br>
1711
  <a href="https://github.com/SergheiBrinza/microlens"
1712
  target="_blank" rel="noopener"
@@ -1773,6 +1896,8 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1773
 
1774
  LIVE_BACKENDS = [
1775
  ("vanilla", URL_VANILLA, "Gemma 4 E2B · base"),
 
 
1776
  ]
1777
 
1778
  def render_tools(current_uri, shape, grid_str, cross_str, mode):
@@ -1810,14 +1935,14 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1810
  gr.Group(visible=(mode == MODE_SAMPLES)),
1811
  gr.Group(visible=(mode == MODE_UPLOAD)),
1812
  gr.Group(visible=(mode == MODE_MICRO)),
1813
- vp, uri, empty_panels(),
1814
  gr.Button(visible=False))
1815
 
1816
  mode_state.change(on_mode_change,
1817
  [mode_state, shape_state, grid_state, cross_state, picked_filename],
1818
  [mode_buttons, samples_group, upload_group, micro_group,
1819
  viewport, viewport_uri,
1820
- vanilla_panel, original_btn], api_name=False)
1821
 
1822
  def on_cat_change(cat_label, current_filename, shape, grid_str, cross_str):
1823
  try: grid = int(grid_str or "0")
@@ -1828,14 +1953,14 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1828
  folder_html(cat_label, None),
1829
  viewport_html(None, shape, grid, cross,
1830
  empty_text="PICK A SAMPLE FROM THE CATEGORY ABOVE"),
1831
- "", "", empty_panels(),
1832
  gr.Button(visible=False),
1833
  gr.Dropdown(value=DEFAULT_LANG_DISPLAY))
1834
 
1835
  cat_state.change(on_cat_change,
1836
  [cat_state, picked_filename, shape_state, grid_state, cross_state],
1837
  [folder_pills, folder_grid, viewport, picked_filename, viewport_uri,
1838
- vanilla_panel, original_btn, lang_dropdown],
1839
  api_name=False)
1840
 
1841
  def on_pick(filename, cat_label, shape, grid_str, cross_str):
@@ -1846,23 +1971,23 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1846
  # Reset live-answer state on every sample switch — without this the
1847
  # previous image's live answer could leak into translate/restore for
1848
  # the next sample and look like a real result.
1849
- cleared_state = {"vanilla": ""}
1850
  if not filename:
1851
  return (folder_html(cat_label, None),
1852
- viewport_html(None, shape, grid, cross), "", empty_panels(),
1853
  gr.Button(visible=False),
1854
  gr.Dropdown(value=DEFAULT_LANG_DISPLAY),
1855
  cleared_state)
1856
  uri = full_uri(filename)
1857
  return (folder_html(cat_label, filename),
1858
- viewport_html(uri, shape, grid, cross), uri, empty_panels(),
1859
  gr.Button(visible=False),
1860
  gr.Dropdown(value=DEFAULT_LANG_DISPLAY),
1861
  cleared_state)
1862
 
1863
  picked_filename.change(on_pick,
1864
  [picked_filename, cat_state, shape_state, grid_state, cross_state],
1865
- [folder_grid, viewport, viewport_uri, vanilla_panel,
1866
  original_btn, lang_dropdown, last_answers], api_name=False)
1867
 
1868
  def on_file_upload(file_obj, shape, grid_str, cross_str):
@@ -1907,9 +2032,9 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1907
  return panel_html(kind, body, state="ready", footer_text=f"❌ {label}")
1908
 
1909
  def do_analyze(filename, shape, mode, grid_str, cross_str, current_uri):
1910
- """Unified live inference for ALL modes. Single panel hits the vanilla
1911
- Gemma 4 E2B backend on the GPU. Identical process for samples,
1912
- uploads, and webcam captures."""
1913
  try: grid = int(grid_str or "0")
1914
  except ValueError: grid = 0
1915
  try: cross = int(cross_str or "0")
@@ -1933,8 +2058,10 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1933
  "or capture from your camera, then press AI ANALYZE.")
1934
  yield (viewport_html(None, shape, grid, cross, live_video=live),
1935
  panel_html("vanilla", msg, state="ready"),
 
 
1936
  gr.Button(visible=False),
1937
- {"vanilla": ""})
1938
  return
1939
 
1940
  source = ("webcam" if mode == MODE_MICRO else
@@ -1944,20 +2071,27 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1944
  running = f"⏳ Running on your {source}…"
1945
  yield (vp,
1946
  panel_html("vanilla", running, state="typing"),
 
 
1947
  gr.Button(visible=False),
1948
- {"vanilla": ""})
1949
 
1950
- answers = {"vanilla": ""}
 
1951
 
1952
- # On HF Space: in-process ZeroGPU inference against the base model.
1953
- # Locally we keep the HTTP path (llama-server).
1954
  if IS_HF_SPACE:
1955
  try:
1956
- answers["vanilla"] = _zerogpu_infer(img_uri, INFERENCE_PROMPT)
 
 
1957
  except Exception as e:
1958
  err = f"{type(e).__name__}: {str(e)[:280]}"
1959
  yield (vp,
1960
  _error_panel("vanilla", "Gemma 4 E2B · base", err),
 
 
1961
  gr.Button(visible=False),
1962
  answers)
1963
  return
@@ -1981,19 +2115,30 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
1981
  f'<span class="ml-word" style="animation-delay:{delay}ms;">{safe}</span>'
1982
  )
1983
  return "".join(spans)
1984
- footer = f"🛰 Live inference · <code>Gemma 4 E2B · base</code> · {source}"
 
 
 
 
1985
  yield (vp,
1986
  panel_html("vanilla", _animated_words(answers["vanilla"]),
1987
- state="ready", footer_text=footer),
 
 
 
 
1988
  gr.Button(visible=False),
1989
  answers)
1990
  else:
1991
- # Local: HTTP call to llama-server, sequential typewriter
1992
  for kind, url, label in LIVE_BACKENDS:
1993
  ans, err = llama_server_call(url, img_uri)
1994
  if err:
 
1995
  yield (vp,
1996
- _error_panel(kind, label, err),
 
 
1997
  gr.Button(visible=False),
1998
  answers)
1999
  else:
@@ -2004,11 +2149,14 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
2004
  for i in range(step, len(ans) + step, step):
2005
  partial = ans[:min(i, len(ans))]
2006
  is_done = i >= len(ans)
 
 
 
 
2007
  yield (vp,
2008
- panel_html(
2009
- kind, partial,
2010
- state="ready" if is_done else "typing",
2011
- footer_text=footer if is_done else None),
2012
  gr.Button(visible=False),
2013
  answers)
2014
  time.sleep(delay)
@@ -2037,7 +2185,7 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
2037
  """
2038
  analyze_btn.click(do_analyze,
2039
  [picked_filename, shape_state, mode_state, grid_state, cross_state, viewport_uri],
2040
- [viewport, vanilla_panel, original_btn, last_answers],
2041
  js=ANALYZE_PRE_JS, api_name=False)
2042
 
2043
  def do_translate(filename, lang_label, answers):
@@ -2051,12 +2199,16 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
2051
  if not sources:
2052
  msg = "Run AI ANALYZE first to get an answer to translate."
2053
  return (panel_html("vanilla", msg, state="ready"),
 
 
2054
  gr.Button(visible=False))
2055
 
2056
  lang_code = LANG_BY_DISPLAY.get(lang_label, "en")
2057
  lang_name = next((name for _, name, code in LANGUAGES if code == lang_code), "English")
2058
  if lang_code == "en":
2059
  return (panel_html("vanilla", sources.get("vanilla", "")),
 
 
2060
  gr.Button(visible=False))
2061
  translated = {}
2062
  engine = ""
@@ -2091,14 +2243,18 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
2091
  if not translated or not any(translated.values()):
2092
  placeholder = f"Translation to {lang_name} unavailable right now."
2093
  return (panel_html("vanilla", placeholder, state="ready"),
 
 
2094
  gr.Button(visible=False))
2095
  footer = f"🌍 {lang_name} · {engine}"
2096
  return (panel_html("vanilla", translated.get("vanilla", ""), footer_text=footer),
 
 
2097
  gr.Button(visible=True))
2098
 
2099
  translate_btn.click(do_translate,
2100
  [picked_filename, lang_dropdown, last_answers],
2101
- [vanilla_panel, original_btn], api_name=False)
2102
 
2103
  def restore_original(filename, answers):
2104
  # Restore ONLY the live answer that produced this translation. Same
@@ -2107,16 +2263,18 @@ with gr.Blocks(css=CSS, theme=gr.themes.Base(primary_hue="red", neutral_hue="zin
2107
  # a pre-baked answer for a different image.
2108
  sources = answers if (answers and any(answers.values())) else None
2109
  if not sources:
2110
- return (empty_panels(),
2111
  gr.Button(visible=False),
2112
  gr.Dropdown(value=DEFAULT_LANG_DISPLAY))
2113
  return (panel_html("vanilla", sources.get("vanilla", "")),
 
 
2114
  gr.Button(visible=False),
2115
  gr.Dropdown(value=DEFAULT_LANG_DISPLAY))
2116
 
2117
  original_btn.click(restore_original,
2118
  [picked_filename, last_answers],
2119
- [vanilla_panel, original_btn, lang_dropdown],
2120
  api_name=False)
2121
 
2122
  demo.load(fn=None, inputs=None, outputs=None, js=CAMERA_JS)
 
8
  control panel (mode-dependent: 5 categories × 6 thumbs / upload zone /
9
  camera enumeration)
10
  - AI ANALYZE long oval cyan→red gradient button
11
+ - 3 result panels: UNTRAINED BASELINE / MICROLENS BRIEF / MICROLENS RICH
12
  - Translate row with 28 languages (English default) + ORIGINAL button after
13
  translation
14
  - Footer with run-locally + APK + Legal links
15
 
16
+ SAMPLES tab uses cached answers from catalog.json (vanilla + v2 + v3 for all 30).
17
+ UPLOAD / MICROSCOPE tabs run LIVE inference against per-model backend URLs:
18
  URL_VANILLA (default http://127.0.0.1:8085/v1/chat/completions)
19
+ URL_V2 (default http://127.0.0.1:8084/v1/chat/completions)
20
+ URL_V3 (default http://127.0.0.1:8083/v1/chat/completions)
21
+ On HF Space deployment configure these as Variables to point at a public tunnel
22
  (e.g. Cloudflare → llama-server). When unreachable the panel shows a clean
23
  "backend unavailable" message instead of crashing.
24
  """
 
89
  BY_FILENAME = {s["filename"]: s for s in CATALOG}
90
 
91
  URL_VANILLA = os.environ.get("URL_VANILLA", "http://127.0.0.1:8085/v1/chat/completions")
92
+ URL_V2 = os.environ.get("URL_V2", "http://127.0.0.1:8084/v1/chat/completions")
93
+ URL_V3 = os.environ.get("URL_V3", "http://127.0.0.1:8083/v1/chat/completions")
94
  INFERENCE_PROMPT = "What is shown in this microscope image?"
95
+ BRIEF_PROMPT = "Identify the genus in this microscopy image. Reply with just the genus name and a one-sentence description."
96
+ RICH_PROMPT = "Identify the organism in this microscopy image. Describe the genus, morphology, habitat, and identification cues."
97
+ _PROMPT_BY_VERSION = {"vanilla": INFERENCE_PROMPT, "v2": BRIEF_PROMPT, "v3": RICH_PROMPT}
98
 
99
  # ─────────────────────────────────────────────────────────────────────────────
100
  # ZeroGPU runtime: when running on HF Space we replace HTTP llama-server calls
101
+ # with in-process transformers + PEFT single-adapter inference on H200.
102
+ # Outside HF Space (local dev) the original HTTP path is preserved.
 
103
  # ─────────────────────────────────────────────────────────────────────────────
104
  IS_HF_SPACE = bool(os.environ.get("SPACE_ID"))
105
 
106
  _HF_BASE = "unsloth/gemma-4-E2B-it"
107
+ _HF_LORA_REPO = "Laborator/microlens-final"
108
 
109
  _zerogpu_processor = None
110
  _zerogpu_model = None
 
113
  import spaces
114
  import torch
115
  from transformers import AutoProcessor, AutoModelForImageTextToText
116
+ from peft import PeftModel
117
 
118
  print("[ZeroGPU] loading processor + base model on cuda…", flush=True)
119
  _zerogpu_processor = AutoProcessor.from_pretrained(_HF_BASE)
120
  _zerogpu_model = AutoModelForImageTextToText.from_pretrained(
121
  _HF_BASE, torch_dtype=torch.bfloat16, device_map="cuda",
122
  )
 
 
123
 
124
+ # PEFT 0.19 cannot hook transformers' Gemma4ClippableLinear (vision tower
125
+ # wrapper around nn.Linear with opt-in clamping). The clamp thresholds
126
+ # default to ±inf so the wrapper is a behavioral no-op — replace each
127
+ # occurrence with its inner .linear so PEFT sees a plain nn.Linear.
128
+ def _unwrap_clippable(module):
129
+ from torch import nn
130
+ for name, child in list(module.named_children()):
131
+ if type(child).__name__ == "Gemma4ClippableLinear" and isinstance(
132
+ getattr(child, "linear", None), nn.Linear
133
+ ):
134
+ if getattr(child, "use_clipped_linears", False):
135
+ print(f"[ZeroGPU] WARN: clipped-linears active on {name}; "
136
+ "unwrapping anyway (thresholds are ±inf = no-op)", flush=True)
137
+ setattr(module, name, child.linear)
138
+ else:
139
+ _unwrap_clippable(child)
140
+ _unwrap_clippable(_zerogpu_model)
141
+
142
+ print("[ZeroGPU] attaching microlens-final LoRA…", flush=True)
143
+ _zerogpu_model = PeftModel.from_pretrained(
144
+ _zerogpu_model, _HF_LORA_REPO, adapter_name="microlens",
145
+ )
146
+ _zerogpu_model.eval()
147
+ print("[ZeroGPU] ready (vanilla = base off / brief + rich = same LoRA, different prompts)", flush=True)
148
+
149
+ # ── Batch path: run vanilla + brief + rich in a SINGLE GPU acquisition.
150
+ # vanilla is the base Gemma 4 with adapter disabled; brief and rich share
151
+ # the same microlens-final LoRA but use different prompts (BRIEF_PROMPT /
152
+ # RICH_PROMPT). duration=60s gives headroom for all three to finish.
153
+ @spaces.GPU(duration=60)
154
+ def _zerogpu_infer_all(image_data_uri: str, prompt: Optional[str] = None):
155
+ import time as _t
156
+ t_total = _t.time()
157
+ print(f"[infer-all] start cuda={torch.cuda.is_available()}", flush=True)
158
+ b64 = _strip_data_uri(image_data_uri) if image_data_uri.startswith("data:") else image_data_uri
159
+ img = Image.open(BytesIO(base64.b64decode(b64))).convert("RGB")
160
+ if max(img.size) > 768:
161
+ img.thumbnail((768, 768))
162
+ results = {}
163
+ for version in ("vanilla", "v2", "v3"):
164
+ t0 = _t.time()
165
+ version_prompt = _PROMPT_BY_VERSION[version]
166
+ if version == "vanilla":
167
+ _zerogpu_model.disable_adapter_layers()
168
+ # Stock Gemma can ramble up to 1400+ chars on a microscope image
169
+ # which blows the 60s ZeroGPU budget; cap it tighter.
170
+ _max_tok = 256
171
+ else:
172
+ _zerogpu_model.enable_adapter_layers()
173
+ _zerogpu_model.set_adapter("microlens")
174
+ # brief stays short, rich gets headroom for full schema answer.
175
+ _max_tok = 96 if version == "v2" else 512
176
+ messages = [{"role": "user", "content": [
177
+ {"type": "image", "image": img},
178
+ {"type": "text", "text": version_prompt},
179
+ ]}]
180
+ inputs = _zerogpu_processor.apply_chat_template(
181
+ messages, add_generation_prompt=True, tokenize=True,
182
+ return_dict=True, return_tensors="pt",
183
+ )
184
+ inputs = {k: (v.to(_zerogpu_model.device, dtype=torch.bfloat16) if v.is_floating_point()
185
+ else v.to(_zerogpu_model.device))
186
+ for k, v in inputs.items()}
187
+ prompt_len = inputs["input_ids"].shape[1]
188
+ with torch.inference_mode():
189
+ out = _zerogpu_model.generate(
190
+ **inputs, max_new_tokens=_max_tok, do_sample=False,
191
+ )
192
+ gen_ids = out[0][prompt_len:]
193
+ text = _zerogpu_processor.decode(gen_ids, skip_special_tokens=True).strip()
194
+ results[version] = text
195
+ print(f"[infer-all] {version} t+{_t.time()-t0:.2f}s len={len(text)}", flush=True)
196
+ print(f"[infer-all] DONE total t+{_t.time()-t_total:.2f}s", flush=True)
197
+ return results
198
+
199
+ # ── Single-version path (legacy / local fallback). Still used when llama_server_call
200
+ # is called outside the do_analyze HF-Space short-circuit (e.g. potential future paths).
201
+ @spaces.GPU(duration=25)
202
+ def _zerogpu_infer(version: str, image_data_uri: str, prompt: str) -> str:
203
  import time as _t
204
  t0 = _t.time()
205
+ print(f"[infer] version={version} cuda={torch.cuda.is_available()} "
206
  f"dev={torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'cpu'}",
207
  flush=True)
208
  b64 = _strip_data_uri(image_data_uri) if image_data_uri.startswith("data:") else image_data_uri
 
210
  if max(img.size) > 768:
211
  img.thumbnail((768, 768))
212
  print(f"[infer] image {img.size}", flush=True)
213
+ if version == "vanilla":
214
+ _zerogpu_model.disable_adapter_layers()
215
+ else:
216
+ _zerogpu_model.enable_adapter_layers()
217
+ _zerogpu_model.set_adapter("microlens")
218
+ # If caller didn't override, use the per-version default prompt
219
+ # (vanilla=generic, v2=brief, v3=rich).
220
+ effective_prompt = prompt if prompt and prompt != INFERENCE_PROMPT else _PROMPT_BY_VERSION.get(version, INFERENCE_PROMPT)
221
  messages = [{"role": "user", "content": [
222
  {"type": "image", "image": img},
223
+ {"type": "text", "text": effective_prompt},
224
  ]}]
225
  inputs = _zerogpu_processor.apply_chat_template(
226
  messages, add_generation_prompt=True, tokenize=True,
 
233
  print(f"[infer] inputs ready, t+{_t.time()-t0:.2f}s, generating…", flush=True)
234
  with torch.inference_mode():
235
  out = _zerogpu_model.generate(
236
+ **inputs, max_new_tokens=512, do_sample=False,
237
  )
238
  prompt_len = inputs["input_ids"].shape[1]
239
  gen_ids = out[0][prompt_len:]
 
242
  f"text_len={len(text)}, preview={text[:80]!r}", flush=True)
243
  return text.strip()
244
 
245
+ _URL_TO_KIND = {URL_VANILLA: "vanilla", URL_V2: "v2", URL_V3: "v3"}
246
+
247
 
248
  # ─────────────────────────────────────────────────────────────────────────────
249
  # QR codes for the footer install card. Generated once at module load.
250
  # ─────────────────────────────────────────────────────────────────────────────
251
+ APK_URL = "https://github.com/SergheiBrinza/microlens"
252
  GITHUB_URL = "https://github.com/SergheiBrinza/microlens"
 
 
 
253
 
254
  def _qr_data_uri(data: str, dark: str = "#FFFFFF", light: str = "#000000",
255
  alpha: float = 1.0) -> str:
 
324
  prompt: str = INFERENCE_PROMPT,
325
  timeout: int = 180) -> Tuple[str, Optional[str]]:
326
  """Returns (text, error_or_None).
327
+ On HF Space: routes to in-process ZeroGPU inference (transformers + PEFT).
328
  Locally: OpenAI-compatible call to llama-server (original behavior)."""
329
  if IS_HF_SPACE:
330
+ kind = _URL_TO_KIND.get(url, "vanilla")
331
  try:
332
+ return _zerogpu_infer(kind, image_data_uri, prompt), None
333
  except Exception as e:
334
  return "", f"{type(e).__name__}: {str(e)[:240]}"
335
  payload = {
 
1231
 
1232
  PANEL_THEMES = {
1233
  "vanilla": {
1234
+ "title": "UNTRAINED BASELINE",
1235
+ "subtitle": "Stock Gemma 4 E2B · Google factory weights · no microscopy training",
1236
  "stripe": "linear-gradient(90deg, #C0C5CC 0%, #7A7E85 100%)",
1237
  "title_grad": "linear-gradient(90deg, #E0E5EC 0%, #9A9EA5 100%)",
1238
  "border": "rgba(180,185,195,0.35)",
 
1240
  "glow_strong": "0 0 56px rgba(200,205,215,0.28)",
1241
  "subtitle_color": "#9aa0a8",
1242
  },
1243
+ "v2": {
1244
+ "title": "MICROLENS · BRIEF",
1245
+ "subtitle": "Gemma 4 E2B + microlens-final LoRA · 95 genera · single-sentence genus answer",
1246
+ "stripe": "linear-gradient(90deg, #00DCE6 0%, #007680 100%)",
1247
+ "title_grad": "linear-gradient(90deg, #00DCE6 0%, #66EAF0 100%)",
1248
+ "border": "rgba(0,220,230,0.45)",
1249
+ "glow": "0 0 36px rgba(0,220,230,0.18)",
1250
+ "glow_strong": "0 0 64px rgba(0,220,230,0.42)",
1251
+ "subtitle_color": "#7FBEC4",
1252
+ },
1253
+ "v3": {
1254
+ "title": "MICROLENS · RICH",
1255
+ "subtitle": "Same LoRA, detailed prompt · genus + morphology + habitat + ID cues",
1256
+ "stripe": "linear-gradient(90deg, #FF1744 0%, #800020 100%)",
1257
+ "title_grad": "linear-gradient(90deg, #FF5252 0%, #FF8888 100%)",
1258
+ "border": "rgba(255,23,68,0.45)",
1259
+ "glow": "0 0 36px rgba(255,23,68,0.18)",
1260
+ "glow_strong": "0 0 64px rgba(255,23,68,0.42)",
1261
+ "subtitle_color": "#C28A8A",
1262
+ },
1263
  }
1264
 
1265
 
 
1316
  """
1317
 
1318
 
1319
+ def empty_panels(reason: str = "empty") -> Tuple[str, str, str]:
1320
+ return (panel_html("vanilla", "", state=reason),
1321
+ panel_html("v2", "", state=reason),
1322
+ panel_html("v3", "", state=reason))
1323
 
1324
 
1325
  def analyse_curated(filename: str, shape: str, grid: int = 0, cross: int = 0):
1326
  import time
1327
  s = BY_FILENAME.get(filename)
1328
  if not s:
1329
+ yield viewport_html(None, shape, grid, cross), *empty_panels()
1330
  return
1331
  vp = viewport_html(full_uri(filename), shape, grid, cross)
1332
  vanilla_full = s.get("vanilla_answer", "—")
1333
+ v2_full = s.get("v2_answer", "")
1334
+ v3_full = s.get("v3_answer", "—")
1335
+ yield vp, panel_html("vanilla", "", state="typing"), \
1336
+ panel_html("v2", "", state="typing"), \
1337
+ panel_html("v3", "", state="typing")
1338
+ max_len = max(len(vanilla_full), len(v2_full), len(v3_full))
1339
  step = 8
1340
  delay = 0.040
1341
+ for i in range(step, max_len + step, step):
1342
  yield (
1343
  vp,
1344
  panel_html("vanilla", vanilla_full[:min(i, len(vanilla_full))],
1345
  state="typing" if i < len(vanilla_full) else "ready"),
1346
+ panel_html("v2", v2_full[:min(i, len(v2_full))],
1347
+ state="typing" if i < len(v2_full) else "ready"),
1348
+ panel_html("v3", v3_full[:min(i, len(v3_full))],
1349
+ state="typing" if i < len(v3_full) else "ready"),
1350
  )
1351
  time.sleep(delay)
1352
+ yield (vp,
1353
+ panel_html("vanilla", vanilla_full),
1354
+ panel_html("v2", v2_full),
1355
+ panel_html("v3", v3_full))
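
Review note: the lockstep typewriter loop in `analyse_curated` can be expressed as a small generator that is independent of the HTML rendering. The sketch below is illustrative only: `typewriter_frames` is a hypothetical name, and it yields plain `(text, state)` pairs where the diff calls `panel_html`.

```python
from typing import Dict, Iterator, Tuple


def typewriter_frames(texts: Dict[str, str], step: int = 8
                      ) -> Iterator[Dict[str, Tuple[str, str]]]:
    """Yield frames mapping panel name -> (visible_text, state).

    All panels advance by `step` characters per frame. A panel switches to
    "ready" once its full text is visible, while longer texts keep typing.
    """
    max_len = max((len(t) for t in texts.values()), default=0)
    for i in range(step, max_len + step, step):
        yield {
            name: (text[:i], "typing" if i < len(text) else "ready")
            for name, text in texts.items()
        }
```

Because slicing past the end of a string is safe in Python, `text[:i]` needs no explicit `min(i, len(text))` clamp.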
1356
 
1357
 
1358
  CSS = """
 
1684
  font-weight:700; font-size:12px; letter-spacing:3px;
1685
  text-transform:uppercase;">Fine-tune</span>
1686
  <span style="font-family:'Fraunces',serif; font-weight:500;
1687
+ color:#fff; font-size:19px; letter-spacing:0.3px;">Unsloth 4-bit QLoRA &middot; 122k VQA</span>
1688
  </span>
1689
  <span style="color:#3a3a3a; font-size:18px;">&middot;</span>
1690
  <span style="display:inline-flex; align-items:baseline; gap:12px;
 
1708
  cross_state = gr.Textbox(value="0", elem_id="hidden-cross",
1709
  elem_classes=["ml-hidden"], show_label=False)
1710
  viewport_uri = gr.State(value="")
1711
+ # Most recent answers from the 3 panels (any mode) — translate reads from here
1712
+ last_answers = gr.State(value={"vanilla": "", "v2": "", "v3": ""})
1713
 
1714
  # Toolbar — full-width above both columns (no empty space in right column)
1715
  mode_buttons = gr.HTML(value=mode_buttons_html(MODE_SAMPLES))
 
1774
 
1775
  gr.HTML('<div style="height:28px;"></div>')
1776
 
1777
+ with gr.Row(equal_height=True, elem_classes=["equal-panels"]):
1778
+ vanilla_panel = gr.HTML(value=panel_html("vanilla", "", state="empty"))
1779
+ v2_panel = gr.HTML(value=panel_html("v2", "", state="empty"))
1780
+ v3_panel = gr.HTML(value=panel_html("v3", "", state="empty"))
 
1781
 
1782
  gr.HTML(f"""
1783
  <div style="margin-top: 32px; padding: 22px 28px;
 
1813
  <div style="color:#e4e4e4; font-size:13px; line-height:1.85; font-weight:500;">
1814
  Gemma 4 E2B-it <span style="color:#666;font-weight:400;">&middot;</span> Google DeepMind<br>
1815
  Unsloth FastVisionModel <span style="color:#666;font-weight:400;">&middot;</span> 4-bit QLoRA<br>
1816
+ PEFT single adapter <span style="color:#666;font-weight:400;">&middot;</span> baseline / brief / rich<br>
1817
  llama.cpp + mtmd vision extension
1818
  </div>
1819
  </div>
 
1829
  font-size:12.5px; letter-spacing:0.3px;
1830
  border-bottom:1px solid rgba(127,232,227,.40);
1831
  display:inline-block; margin-bottom: 8px;">
1832
+ &#x1F999; All 3 versions on Ollama Hub &nbsp;&#8599;</a>
1833
  <br>
1834
  <a href="https://github.com/SergheiBrinza/microlens"
1835
  target="_blank" rel="noopener"
 
1896
 
1897
  LIVE_BACKENDS = [
1898
  ("vanilla", URL_VANILLA, "Gemma 4 E2B · base"),
1899
+ ("v2", URL_V2, "MicroLens v2 · fine-tuned"),
1900
+ ("v3", URL_V3, "MicroLens v3 · fine-tuned"),
1901
  ]
1902
 
1903
  def render_tools(current_uri, shape, grid_str, cross_str, mode):
 
1935
  gr.Group(visible=(mode == MODE_SAMPLES)),
1936
  gr.Group(visible=(mode == MODE_UPLOAD)),
1937
  gr.Group(visible=(mode == MODE_MICRO)),
1938
+ vp, uri, *empty_panels(),
1939
  gr.Button(visible=False))
1940
 
1941
  mode_state.change(on_mode_change,
1942
  [mode_state, shape_state, grid_state, cross_state, picked_filename],
1943
  [mode_buttons, samples_group, upload_group, micro_group,
1944
  viewport, viewport_uri,
1945
+ vanilla_panel, v2_panel, v3_panel, original_btn], api_name=False)
1946
 
1947
  def on_cat_change(cat_label, current_filename, shape, grid_str, cross_str):
1948
  try: grid = int(grid_str or "0")
 
1953
  folder_html(cat_label, None),
1954
  viewport_html(None, shape, grid, cross,
1955
  empty_text="PICK A SAMPLE FROM THE CATEGORY ABOVE"),
1956
+ "", "", *empty_panels(),
1957
  gr.Button(visible=False),
1958
  gr.Dropdown(value=DEFAULT_LANG_DISPLAY))
1959
 
1960
  cat_state.change(on_cat_change,
1961
  [cat_state, picked_filename, shape_state, grid_state, cross_state],
1962
  [folder_pills, folder_grid, viewport, picked_filename, viewport_uri,
1963
+ vanilla_panel, v2_panel, v3_panel, original_btn, lang_dropdown],
1964
  api_name=False)
1965
 
1966
  def on_pick(filename, cat_label, shape, grid_str, cross_str):
 
1971
  # Reset live-answer state on every sample switch — without this the
1972
  # previous image's live answer could leak into translate/restore for
1973
  # the next sample and look like a real result.
1974
+ cleared_state = {"vanilla": "", "v2": "", "v3": ""}
1975
  if not filename:
1976
  return (folder_html(cat_label, None),
1977
+ viewport_html(None, shape, grid, cross), "", *empty_panels(),
1978
  gr.Button(visible=False),
1979
  gr.Dropdown(value=DEFAULT_LANG_DISPLAY),
1980
  cleared_state)
1981
  uri = full_uri(filename)
1982
  return (folder_html(cat_label, filename),
1983
+ viewport_html(uri, shape, grid, cross), uri, *empty_panels(),
1984
  gr.Button(visible=False),
1985
  gr.Dropdown(value=DEFAULT_LANG_DISPLAY),
1986
  cleared_state)
1987
 
1988
  picked_filename.change(on_pick,
1989
  [picked_filename, cat_state, shape_state, grid_state, cross_state],
1990
+ [folder_grid, viewport, viewport_uri, vanilla_panel, v2_panel, v3_panel,
1991
  original_btn, lang_dropdown, last_answers], api_name=False)
1992
 
1993
  def on_file_upload(file_obj, shape, grid_str, cross_str):
 
2032
  return panel_html(kind, body, state="ready", footer_text=f"❌ {label}")
2033
 
2034
  def do_analyze(filename, shape, mode, grid_str, cross_str, current_uri):
2035
+ """Unified live inference for ALL modes. On HF Space, one ZeroGPU call
2036
+ serves all three panels; locally, each panel calls its own llama-server
2037
+ backend. The flow is identical for samples, uploads, and webcam captures."""
2038
  try: grid = int(grid_str or "0")
2039
  except ValueError: grid = 0
2040
  try: cross = int(cross_str or "0")
 
2058
  "or capture from your camera, then press AI ANALYZE.")
2059
  yield (viewport_html(None, shape, grid, cross, live_video=live),
2060
  panel_html("vanilla", msg, state="ready"),
2061
+ panel_html("v2", msg, state="ready"),
2062
+ panel_html("v3", msg, state="ready"),
2063
  gr.Button(visible=False),
2064
+ {"vanilla": "", "v2": "", "v3": ""})
2065
  return
2066
 
2067
  source = ("webcam" if mode == MODE_MICRO else
 
2071
  running = f"⏳ Running on your {source}…"
2072
  yield (vp,
2073
  panel_html("vanilla", running, state="typing"),
2074
+ panel_html("v2", running, state="typing"),
2075
+ panel_html("v3", running, state="typing"),
2076
  gr.Button(visible=False),
2077
+ {"vanilla": "", "v2": "", "v3": ""})
2078
 
2079
+ results = {}
2080
+ answers = {"vanilla": "", "v2": "", "v3": ""}
2081
 
2082
+ # On HF Space: ONE GPU acquisition for all 3 versions (roughly a third of
2083
+ # the quota of the per-model loop). Locally we keep the 3-HTTP-call path.
2084
  if IS_HF_SPACE:
2085
  try:
2086
+ all_answers = _zerogpu_infer_all(img_uri, INFERENCE_PROMPT)
2087
+ for kind in ("vanilla", "v2", "v3"):
2088
+ answers[kind] = all_answers.get(kind, "")
2089
  except Exception as e:
2090
  err = f"{type(e).__name__}: {str(e)[:280]}"
2091
  yield (vp,
2092
  _error_panel("vanilla", "Gemma 4 E2B · base", err),
2093
+ _error_panel("v2", "MicroLens v2 · fine-tuned", err),
2094
+ _error_panel("v3", "MicroLens v3 · fine-tuned", err),
2095
  gr.Button(visible=False),
2096
  answers)
2097
  return
 
2115
  f'<span class="ml-word" style="animation-delay:{delay}ms;">{safe}</span>'
2116
  )
2117
  return "".join(spans)
2118
+ footers = {
2119
+ "vanilla": f"🛰 Live inference · <code>Gemma 4 E2B · base</code> · {source}",
2120
+ "v2": f"🛰 Live inference · <code>MicroLens v2 · fine-tuned</code> · {source}",
2121
+ "v3": f"🛰 Live inference · <code>MicroLens v3 · fine-tuned</code> · {source}",
2122
+ }
2123
  yield (vp,
2124
  panel_html("vanilla", _animated_words(answers["vanilla"]),
2125
+ state="ready", footer_text=footers["vanilla"]),
2126
+ panel_html("v2", _animated_words(answers["v2"]),
2127
+ state="ready", footer_text=footers["v2"]),
2128
+ panel_html("v3", _animated_words(answers["v3"]),
2129
+ state="ready", footer_text=footers["v3"]),
2130
  gr.Button(visible=False),
2131
  answers)
2132
  else:
2133
+ # Local: 3 HTTP calls to llama-servers, sequential typewriter per model
2134
  for kind, url, label in LIVE_BACKENDS:
2135
  ans, err = llama_server_call(url, img_uri)
2136
  if err:
2137
+ results[kind] = _error_panel(kind, label, err)
2138
  yield (vp,
2139
+ results.get("vanilla", panel_html("vanilla", running, state="typing")),
2140
+ results.get("v2", panel_html("v2", running, state="typing")),
2141
+ results.get("v3", panel_html("v3", running, state="typing")),
2142
  gr.Button(visible=False),
2143
  answers)
2144
  else:
 
2149
  for i in range(step, len(ans) + step, step):
2150
  partial = ans[:min(i, len(ans))]
2151
  is_done = i >= len(ans)
2152
+ results[kind] = panel_html(
2153
+ kind, partial,
2154
+ state="ready" if is_done else "typing",
2155
+ footer_text=footer if is_done else None)
2156
  yield (vp,
2157
+ results.get("vanilla", panel_html("vanilla", running, state="typing")),
2158
+ results.get("v2", panel_html("v2", running, state="typing")),
2159
+ results.get("v3", panel_html("v3", running, state="typing")),
 
2160
  gr.Button(visible=False),
2161
  answers)
2162
  time.sleep(delay)
 
2185
  """
2186
  analyze_btn.click(do_analyze,
2187
  [picked_filename, shape_state, mode_state, grid_state, cross_state, viewport_uri],
2188
+ [viewport, vanilla_panel, v2_panel, v3_panel, original_btn, last_answers],
2189
  js=ANALYZE_PRE_JS, api_name=False)
2190
 
2191
  def do_translate(filename, lang_label, answers):
 
2199
  if not sources:
2200
  msg = "Run AI ANALYZE first to get an answer to translate."
2201
  return (panel_html("vanilla", msg, state="ready"),
2202
+ panel_html("v2", msg, state="ready"),
2203
+ panel_html("v3", msg, state="ready"),
2204
  gr.Button(visible=False))
2205
 
2206
  lang_code = LANG_BY_DISPLAY.get(lang_label, "en")
2207
  lang_name = next((name for _, name, code in LANGUAGES if code == lang_code), "English")
2208
  if lang_code == "en":
2209
  return (panel_html("vanilla", sources.get("vanilla", "")),
2210
+ panel_html("v2", sources.get("v2", "")),
2211
+ panel_html("v3", sources.get("v3", "")),
2212
  gr.Button(visible=False))
2213
  translated = {}
2214
  engine = ""
 
2243
  if not translated or not any(translated.values()):
2244
  placeholder = f"Translation to {lang_name} unavailable right now."
2245
  return (panel_html("vanilla", placeholder, state="ready"),
2246
+ panel_html("v2", placeholder, state="ready"),
2247
+ panel_html("v3", placeholder, state="ready"),
2248
  gr.Button(visible=False))
2249
  footer = f"🌍 {lang_name} · {engine}"
2250
  return (panel_html("vanilla", translated.get("vanilla", ""), footer_text=footer),
2251
+ panel_html("v2", translated.get("v2", ""), footer_text=footer),
2252
+ panel_html("v3", translated.get("v3", ""), footer_text=footer),
2253
  gr.Button(visible=True))
2254
 
2255
  translate_btn.click(do_translate,
2256
  [picked_filename, lang_dropdown, last_answers],
2257
+ [vanilla_panel, v2_panel, v3_panel, original_btn], api_name=False)
2258
 
2259
  def restore_original(filename, answers):
2260
  # Restore ONLY the live answer that produced this translation. Same
 
2263
  # a pre-baked answer for a different image.
2264
  sources = answers if (answers and any(answers.values())) else None
2265
  if not sources:
2266
+ return (*empty_panels(),
2267
  gr.Button(visible=False),
2268
  gr.Dropdown(value=DEFAULT_LANG_DISPLAY))
2269
  return (panel_html("vanilla", sources.get("vanilla", "")),
2270
+ panel_html("v2", sources.get("v2", "")),
2271
+ panel_html("v3", sources.get("v3", "")),
2272
  gr.Button(visible=False),
2273
  gr.Dropdown(value=DEFAULT_LANG_DISPLAY))
2274
 
2275
  original_btn.click(restore_original,
2276
  [picked_filename, last_answers],
2277
+ [vanilla_panel, v2_panel, v3_panel, original_btn, lang_dropdown],
2278
  api_name=False)
2279
 
2280
  demo.load(fn=None, inputs=None, outputs=None, js=CAMERA_JS)
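
Review note: the "one GPU acquisition for all 3 versions" path in `do_analyze` relies on a single loaded model serving the baseline with the adapter bypassed, and the brief and rich panels with it active. The body of `_zerogpu_infer_all` is not part of the hunks shown here, so the following is only a sketch under assumptions: `infer_all` is a hypothetical name, `generate` stands in for the processor-plus-`model.generate` call, and `model` is assumed to be a PEFT `PeftModel`, whose `disable_adapter()` context manager temporarily switches the LoRA weights off.

```python
from contextlib import nullcontext
from typing import Callable, Dict


def infer_all(model, generate: Callable[[str], str],
              prompts: Dict[str, str]) -> Dict[str, str]:
    """Run the baseline and adapter variants against one loaded model.

    `prompts` maps a panel kind ("vanilla", "v2", "v3") to its prompt text.
    """
    answers: Dict[str, str] = {}
    for kind, prompt in prompts.items():
        # Baseline: bypass the LoRA for this call only. Other kinds run
        # with the adapter active and differ only in their prompt.
        ctx = model.disable_adapter() if kind == "vanilla" else nullcontext()
        with ctx:
            answers[kind] = generate(prompt)
    return answers
```

On ZeroGPU, a function like this would typically run inside a single `spaces.GPU`-decorated call, so the three generations share one allocation.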