Jasonkim8652 committed
Commit 7fd8751 · verified · 1 Parent(s): 8e08ed6

Phase B: Boltz post-eval scaffold (graceful cpu-basic fallback)

- Add `spaces>=0.30` to requirements (no-op decorator on cpu-basic)
- Gate torch/boltz behind comments in requirements (uncomment when flipping to ZeroGPU)
- eval_boltz._predict_*: split ImportError from runtime errors with actionable message
- HF_TOKEN secret set out-of-band for hidden-tasks dataset access
- README documents Phase B activation checklist + 4-phase pipeline status table

Files changed (3)
  1. README.md +31 -0
  2. eval_boltz.py +33 -6
  3. requirements.txt +10 -0
README.md CHANGED
@@ -38,3 +38,34 @@ Novelty, and Diversity. See the *About* tab for the full methodology and the
 - **Guidance Effect** — Paired comparison of the same LLM in unguided (atomic tools) vs guided (composite workflows) mode
 - **Depth Gap** — Forced-depth and low-diversity intervention results
 - **About** — Methodology, submission guide, and citation info
+
+## Backend pipeline phases
+
+Submission processing runs in 4 admin-controlled phases:
+
+| Phase | Step | Status | Notes |
+|---|---|---|---|
+| **A** | Dispatch tasks → CPU scoring | live | HTTP POST to submitter endpoint, validate, score 5/6 components |
+| **B** | Boltz-2 structure verification | code-ready | Needs ZeroGPU hardware + uncommented `torch`/`boltz` deps |
+| **C** | LLM judge panel (28-pt hybrid) | live | 3-judge PoLL with self-exclusion, requires API key secrets |
+| **D** | Finalize + publish to leaderboard | live | Aggregates hybrid scores, writes back to submissions dataset |
+
+### Phase B activation checklist
+
+To wire up Boltz-2 verification on this Space:
+
+1. **Switch hardware** in HF Space settings → Hardware → `zero-a10g`
+   (requires HF Pro / Enterprise).
+2. **Edit `requirements.txt`** and uncomment the two lines:
+   ```
+   torch>=2.2
+   boltz>=0.4
+   ```
+3. **Verify secrets** are set: `HF_TOKEN` (private dataset),
+   `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`,
+   `DEEPSEEK_API_KEY`.
+4. Restart the Space. The first build will pull ~2GB of CUDA wheels.
+
+On `cpu-basic` hardware the Phase B predictors return a structured
+failure dict with `success=False` and an actionable error message
+instead of crashing the dispatcher.
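
Reviewer note: the activation checklist in this hunk could be checked programmatically before Phase B is switched on. The following is a minimal sketch, not part of this commit. The helper name `phase_b_preflight` and its parameters are hypothetical; only the secret names and the `torch`/`boltz` module names come from the README text. It does not check the hardware tier.

```python
from __future__ import annotations

import importlib.util
import os
from typing import Callable, Mapping

# Secret names as listed in the README checklist.
REQUIRED_SECRETS = (
    "HF_TOKEN",
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
    "DEEPSEEK_API_KEY",
)
# Modules that stay commented out in requirements.txt until the hardware flip.
GPU_MODULES = ("torch", "boltz")


def _is_installed(name: str) -> bool:
    """Return True if `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def phase_b_preflight(
    env: Mapping[str, str] | None = None,
    is_installed: Callable[[str], bool] = _is_installed,
) -> list[str]:
    """Hypothetical helper: list unmet checklist items (empty list = ready)."""
    env = os.environ if env is None else env
    problems = [
        f"secret {name} is not set"
        for name in REQUIRED_SECRETS
        if not env.get(name)
    ]
    problems += [
        f"module {name} is not installed"
        for name in GPU_MODULES
        if not is_installed(name)
    ]
    return problems
```

Calling `phase_b_preflight()` with no arguments reads the real environment. The `env` and `is_installed` parameters exist only so the check can be exercised without touching the Space.
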
eval_boltz.py CHANGED
@@ -7,6 +7,14 @@ Two prediction modes:
 - Complex: Binding tasks (binder + target) -> ipTM, i_pAE
 
 Batch chunking respects ZeroGPU time limits (~180-240s per burst).
+
+Phase B activation checklist (must all be true to actually run Boltz):
+1. HF Space hardware switched to a GPU tier (zero-a10g recommended).
+2. requirements.txt has `torch` and `boltz` uncommented.
+3. HF_TOKEN secret set on the Space (for the private hidden-tasks dataset).
+On a cpu-basic Space the predictors return a structured failure dict
+with `success=False` and an actionable error message rather than
+crashing the dispatcher.
 """
 
 from __future__ import annotations
@@ -28,16 +36,29 @@ MAX_GPU_TIME = 240 # safety margin under 300s ZeroGPU limit
 # ---------------------------------------------------------------------------
 
 
+_BOLTZ_NOT_INSTALLED = (
+    "Boltz / torch not available on this Space. To enable Phase B, "
+    "switch the Space hardware to ZeroGPU (zero-a10g) and uncomment the "
+    "torch + boltz lines in requirements.txt."
+)
+
+
 def _predict_monomer(sequence: str) -> dict[str, float]:
     """Predict structure of a single protein sequence using Boltz.
 
     Returns:
-        Dict with: pLDDT, pTM (or error).
+        Dict with: pLDDT, pTM (or a structured failure dict).
     """
     try:
-        import torch
+        import torch  # noqa: F401
         from boltz import Boltz
-
+    except ImportError:
+        logger.warning(_BOLTZ_NOT_INSTALLED)
+        return {
+            "pLDDT": 0.0, "pTM": 0.0,
+            "success": False, "error": _BOLTZ_NOT_INSTALLED,
+        }
+    try:
         model = Boltz.from_pretrained("boltz2")
         result = model.predict(sequence)
 
@@ -61,12 +82,18 @@ def _predict_complex(
     """Predict complex structure and binding metrics using Boltz.
 
     Returns:
-        Dict with: ipTM, i_pAE, pLDDT, pTM (or error).
+        Dict with: ipTM, i_pAE, pLDDT, pTM (or a structured failure dict).
     """
     try:
-        import torch
+        import torch  # noqa: F401
         from boltz import Boltz
-
+    except ImportError:
+        logger.warning(_BOLTZ_NOT_INSTALLED)
+        return {
+            "pLDDT": 0.0, "pTM": 0.0, "ipTM": 0.0, "i_pAE": 0.0,
+            "success": False, "error": _BOLTZ_NOT_INSTALLED,
+        }
+    try:
         model = Boltz.from_pretrained("boltz2")
         result = model.predict([binder_seq, target_seq])
 
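
Reviewer note: the hunks for `eval_boltz.py` end before the second `try` block closes, so the runtime-error branch is not visible on this page. The sketch below shows the general shape of splitting the import failure from the runtime failure, under stated assumptions. `guarded_predict`, `_failure`, and the injected `run` callable are hypothetical names, the `success=True` field on the happy path is assumed, and the real `except` clause in the file may differ.

```python
from __future__ import annotations

import importlib
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)

NOT_INSTALLED = "Boltz / torch not available on this Space."  # shortened message


def _failure(metrics: tuple[str, ...], error: str) -> dict[str, Any]:
    """Zeroed metrics plus success/error fields, mirroring the dicts in the diff."""
    out: dict[str, Any] = {name: 0.0 for name in metrics}
    out.update(success=False, error=error)
    return out


def guarded_predict(
    run: Callable[[Any], dict[str, float]],
    metrics: tuple[str, ...] = ("pLDDT", "pTM"),
    modules: tuple[str, ...] = ("torch", "boltz"),
) -> dict[str, Any]:
    """Hypothetical wrapper: import guard first, runtime guard second."""
    # Stage 1: a missing dependency is a deployment issue, so report it
    # with the actionable message and skip the prediction entirely.
    try:
        loaded = [importlib.import_module(name) for name in modules]
    except ImportError:
        logger.warning(NOT_INSTALLED)
        return _failure(metrics, NOT_INSTALLED)
    # Stage 2: anything raised while predicting is reported separately.
    try:
        result: dict[str, Any] = dict(run(loaded[-1]))
    except Exception as exc:  # assumption: the real code may catch narrower types
        logger.exception("Boltz prediction failed")
        return _failure(metrics, f"{type(exc).__name__}: {exc}")
    result["success"] = True  # assumption: success flag on the happy path
    return result
```

With this split, a caller can tell "Phase B is not enabled on this Space" apart from "Boltz ran and failed" by inspecting the `error` string, while both cases still return a dict instead of raising.
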
 
requirements.txt CHANGED
@@ -9,3 +9,13 @@ datasets>=2.16
 anthropic>=0.75
 openai>=1.40
 google-genai>=0.3
+
+# Phase B (Boltz post-eval). The `spaces` shim is safe on any hardware
+# tier; the `@spaces.GPU(...)` decorator is a no-op on cpu-basic and
+# provisions ZeroGPU on zero-a10g. Boltz-2 + torch require an actual
+# CUDA build, so they are gated: uncomment ONLY after switching the
+# Space hardware to a GPU tier (zero-a10g recommended) — otherwise pip
+# will pull ~2GB of CUDA wheels onto a CPU image and the build fails.
+spaces>=0.30
+# torch>=2.2   # ZeroGPU only — uncomment after hardware flip
+# boltz>=0.4   # ZeroGPU only — uncomment after hardware flip
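
Reviewer note: the comment in this hunk refers to the `spaces.GPU` decorator from the `spaces` package. Below is a minimal sketch of how a Phase B entry point might combine it with the batch chunking mentioned in the `eval_boltz.py` docstring. It is not taken from this commit. `run_boltz_batch`, `chunked`, and `_predict_one` are hypothetical names, `_predict_one` is a stand-in that does no real prediction, and the `ImportError` fallback is an assumption added for local runs where `spaces` is not installed. The 240-second budget reuses `MAX_GPU_TIME` from `eval_boltz.py`.

```python
from __future__ import annotations

from typing import Callable, Iterator, Sequence

MAX_GPU_TIME = 240  # seconds; same value as MAX_GPU_TIME in eval_boltz.py

try:
    import spaces  # provided by `spaces>=0.30` in requirements.txt

    gpu = spaces.GPU(duration=MAX_GPU_TIME)
except ImportError:
    # Assumption for local development only: pass-through decorator.
    def gpu(fn: Callable) -> Callable:
        return fn


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def _predict_one(sequence: str) -> dict:
    """Stand-in for a real predictor; only records the sequence length."""
    return {"length": len(sequence)}


@gpu
def run_boltz_batch(sequences: Sequence[str]) -> list[dict]:
    """Hypothetical entry point: one GPU burst per call."""
    return [_predict_one(seq) for seq in sequences]
```

A caller would loop over `chunked(all_sequences, n)` and call `run_boltz_batch` once per chunk, choosing `n` so that each call stays within the time budget. The right chunk size depends on sequence length and model speed, and is not specified in this commit.
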