Space: RomeroLab-Duke/BioDesignBench-Leaderboard
Jasonkim8652 committed on Apr 15

Commit 7fd8751 (verified) · 1 parent: 8e08ed6

Phase B: Boltz post-eval scaffold (graceful cpu-basic fallback)


- Add `spaces>=0.30` to requirements (no-op decorator on cpu-basic)
- Gate torch/boltz behind comments in requirements (uncomment when flipping to ZeroGPU)
- eval_boltz._predict_*: split ImportError from runtime errors with actionable message
- HF_TOKEN secret set out-of-band for hidden-tasks dataset access
- README documents Phase B activation checklist + 4-phase pipeline status table

Files changed (3)

README.md +31 -0
eval_boltz.py +33 -6
requirements.txt +10 -0

README.md CHANGED

@@ -38,3 +38,34 @@ Novelty, and Diversity. See the *About* tab for the full methodology and the
 - **Guidance Effect** — Paired comparison of the same LLM in unguided (atomic tools) vs guided (composite workflows) mode
 - **Depth Gap** — Forced-depth and low-diversity intervention results
 - **About** — Methodology, submission guide, and citation info
+## Backend pipeline phases
+Submission processing runs in 4 admin-controlled phases:
+| Phase | Step | Status | Notes |
+|---|---|---|---|
+| **A** | Dispatch tasks → CPU scoring | live | HTTP POST to submitter endpoint, validate, score 5/6 components |
+| **B** | Boltz-2 structure verification | code-ready | Needs ZeroGPU hardware + uncommented `torch`/`boltz` deps |
+| **C** | LLM judge panel (28-pt hybrid) | live | 3-judge PoLL with self-exclusion, requires API key secrets |
+| **D** | Finalize + publish to leaderboard | live | Aggregates hybrid scores, writes back to submissions dataset |
+### Phase B activation checklist
+To wire up Boltz-2 verification on this Space:
+1. **Switch hardware** in HF Space settings → Hardware → `zero-a10g`
+   (requires HF Pro / Enterprise).
+2. **Edit `requirements.txt`** and uncomment the two lines:
+   ```
+   torch>=2.2
+   boltz>=0.4
+   ```
+3. **Verify secrets** are set: `HF_TOKEN` (private dataset),
+   `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`,
+   `DEEPSEEK_API_KEY`.
+4. Restart the Space. The first build will pull ~2GB of CUDA wheels.
+On `cpu-basic` hardware the Phase B predictors return a structured
+failure dict with `success=False` and an actionable error message
+instead of crashing the dispatcher.
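
Reviewer note: the structured-failure contract described in the README lines above could be consumed on the dispatcher side roughly as follows. This is a minimal sketch, not code from this commit. `summarize_phase_b` is a hypothetical helper, and it assumes that a result without a `success` key counts as successful, which the diff does not confirm.

```python
from typing import Any


def summarize_phase_b(result: dict[str, Any]) -> str:
    """Turn a Phase B predictor result into a one-line status string.

    Hypothetical helper for illustration; not part of eval_boltz.py.
    """
    # Assumption: a missing "success" key means the prediction succeeded.
    if not result.get("success", True):
        return f"Phase B skipped: {result.get('error', 'unknown error')}"
    return f"Phase B ok: pLDDT={result['pLDDT']:.1f}, pTM={result['pTM']:.2f}"


# Shape of the failure dict returned on cpu-basic, per the diff.
cpu_basic_result = {
    "pLDDT": 0.0, "pTM": 0.0,
    "success": False, "error": "Boltz / torch not available on this Space.",
}
print(summarize_phase_b(cpu_basic_result))
```

Checking `success` first keeps the zeroed placeholder metrics from being mistaken for real scores.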

eval_boltz.py CHANGED

@@ -7,6 +7,14 @@ Two prediction modes:
   - Complex: Binding tasks (binder + target) -> ipTM, i_pAE
 Batch chunking respects ZeroGPU time limits (~180-240s per burst).
+Phase B activation checklist (must all be true to actually run Boltz):
+  1. HF Space hardware switched to a GPU tier (zero-a10g recommended).
+  2. requirements.txt has `torch` and `boltz` uncommented.
+  3. HF_TOKEN secret set on the Space (for the private hidden-tasks dataset).
+On a cpu-basic Space the predictors return a structured failure dict
+with `success=False` and an actionable error message rather than
+crashing the dispatcher.
 """
 from __future__ import annotations
@@ -28,16 +36,29 @@ MAX_GPU_TIME = 240         # safety margin under 300s ZeroGPU limit
 # ---------------------------------------------------------------------------
+_BOLTZ_NOT_INSTALLED = (
+    "Boltz / torch not available on this Space. To enable Phase B, "
+    "switch the Space hardware to ZeroGPU (zero-a10g) and uncomment the "
+    "torch + boltz lines in requirements.txt."
+)
 def _predict_monomer(sequence: str) -> dict[str, float]:
     """Predict structure of a single protein sequence using Boltz.
     Returns:
-        Dict with: pLDDT, pTM (or error).
+        Dict with: pLDDT, pTM (or a structured failure dict).
     """
     try:
-        import torch
+        import torch  # noqa: F401
         from boltz import Boltz
+    except ImportError:
+        logger.warning(_BOLTZ_NOT_INSTALLED)
+        return {
+            "pLDDT": 0.0, "pTM": 0.0,
+            "success": False, "error": _BOLTZ_NOT_INSTALLED,
+        }
+    try:
         model = Boltz.from_pretrained("boltz2")
         result = model.predict(sequence)
@@ -61,12 +82,18 @@ def _predict_complex(
     """Predict complex structure and binding metrics using Boltz.
     Returns:
-        Dict with: ipTM, i_pAE, pLDDT, pTM (or error).
+        Dict with: ipTM, i_pAE, pLDDT, pTM (or a structured failure dict).
     """
     try:
-        import torch
+        import torch  # noqa: F401
         from boltz import Boltz
+    except ImportError:
+        logger.warning(_BOLTZ_NOT_INSTALLED)
+        return {
+            "pLDDT": 0.0, "pTM": 0.0, "ipTM": 0.0, "i_pAE": 0.0,
+            "success": False, "error": _BOLTZ_NOT_INSTALLED,
+        }
+    try:
         model = Boltz.from_pretrained("boltz2")
         result = model.predict([binder_seq, target_seq])
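
Reviewer note: the import-gating pattern in this file (one `try` for the optional import, a second `try` for the actual work) can be illustrated in isolation. The sketch below is not the committed code. `predict_or_fail` and its parameters are hypothetical names, and the runtime-error branch is an assumption, since the diff does not show the second `except` clause.

```python
import importlib
import logging
from typing import Any, Callable, Iterable

logger = logging.getLogger(__name__)


def predict_or_fail(
    module_name: str,
    metric_keys: Iterable[str],
    run: Callable[[Any], dict[str, float]],
) -> dict[str, Any]:
    """Run `run(module)` and report missing deps and runtime errors separately."""
    zeros = {key: 0.0 for key in metric_keys}
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Dependency missing: return an actionable message instead of raising.
        message = f"{module_name} is not installed on this host."
        logger.warning(message)
        return {**zeros, "success": False, "error": message}
    try:
        metrics = run(module)
    except Exception as exc:  # assumed handling; not shown in the diff
        logger.warning("prediction failed: %s", exc)
        return {**zeros, "success": False, "error": f"{type(exc).__name__}: {exc}"}
    return {**metrics, "success": True, "error": None}


# A missing module yields zeroed metrics plus success=False.
missing = predict_or_fail("not_a_real_module_xyz", ("pLDDT", "pTM"), lambda m: {})
print(missing["success"], missing["pLDDT"])
```

Splitting the two `try` blocks means a caller can tell "this host cannot run the model at all" apart from "the model ran and failed on this input".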

requirements.txt CHANGED

@@ -9,3 +9,13 @@ datasets>=2.16
 anthropic>=0.75
 openai>=1.40
 google-genai>=0.3
+# Phase B (Boltz post-eval). The `spaces` shim is safe on any hardware
+# tier; the `@spaces.GPU(...)` decorator is a no-op on cpu-basic and
+# provisions ZeroGPU on zero-a10g. Boltz-1 + torch require an actual
+# CUDA build, so they are gated: uncomment ONLY after switching the
+# Space hardware to a GPU tier (zero-a10g recommended) — otherwise pip
+# will pull ~2GB of CUDA wheels onto a CPU image and the build fails.
+spaces>=0.30
+# torch>=2.2          # ZeroGPU only — uncomment after hardware flip
+# boltz>=0.4          # ZeroGPU only — uncomment after hardware flip
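
Reviewer note: the requirements comment above relies on `@spaces.GPU(...)` being a no-op outside ZeroGPU. For environments where the `spaces` package is absent entirely (for example a local checkout), a fallback along these lines may help. This is a sketch under stated assumptions: `gpu` and `sequence_length` are hypothetical names, and the diff does not show how eval_boltz.py actually applies the decorator. `MAX_GPU_TIME = 240` is taken from the hunk header in eval_boltz.py.

```python
import functools

MAX_GPU_TIME = 240  # seconds; value taken from eval_boltz.py

try:
    import spaces

    # Real decorator: requests a ZeroGPU slot for up to MAX_GPU_TIME seconds.
    gpu = spaces.GPU(duration=MAX_GPU_TIME)
except ImportError:
    # Assumed fallback: a pass-through decorator when `spaces` is not installed.
    def gpu(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            return fn(*args, **kwargs)
        return wrapper


@gpu
def sequence_length(sequence: str) -> int:
    """Stand-in for a GPU-bound function; only counts residues."""
    return len(sequence)


print(sequence_length("MKTAYIAKQR"))
```

With this guard the decorated functions can be imported and called on any host, and only the body of the function decides whether GPU-only libraries are needed.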