Phase B (live): Boltz-2 via Modal sidecar instead of ZeroGPU
- Add modal_boltz_app.py: A10G companion app deployed to Modal, exposes
POST /predict with FastAPI; runs `boltz predict` on demand and returns
pLDDT/pTM/ipTM/i_pAE per item. Image: torch 2.10 + boltz 2.2.1 +
cuequivariance 0.9 + fastapi[standard]. Auto-stops after 5min idle.
- Rewrite eval_boltz.py as an HTTP client of the Modal endpoint.
Reads MODAL_BOLTZ_URL and MODAL_BOLTZ_TOKEN from Space secrets;
graceful fallback when unset.
- requirements.txt: drop torch/boltz/spaces (no longer needed in the
HF Space image -- prediction runs on Modal).
- README: describe the Modal sidecar architecture and deployment.
- Smoke-tested end to end with ubiquitin: pLDDT 93.89, pTM 0.9194.
- README.md +29 -15
- eval_boltz.py +110 -180
- modal_boltz_app.py +270 -0
- requirements.txt +4 -9
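
For orientation, the request body described in the commit message can be sketched roughly as below. This is an illustrative helper only: `build_predict_payload` is a hypothetical name that does not appear in the commit, while the field names (`token`, `items`, `name`, `kind`, `sequences`) and the `mono_`/`cmplx_` naming are taken from the diff that follows.

```python
from typing import Any


def build_predict_payload(
    token: str,
    monomers: list[str],
    complexes: list[tuple[str, str]],
) -> dict[str, Any]:
    """Assemble a JSON body for the Modal endpoint (hypothetical helper)."""
    items: list[dict[str, Any]] = []
    # One item per monomer design: a single protein chain.
    for i, seq in enumerate(monomers):
        items.append({"name": f"mono_{i}", "kind": "monomer", "sequences": [seq]})
    # One item per (binder, target) pair: two protein chains.
    for i, (binder, target) in enumerate(complexes):
        items.append(
            {"name": f"cmplx_{i}", "kind": "complex", "sequences": [binder, target]}
        )
    return {"token": token, "items": items}


payload = build_predict_payload("secret", ["MQIF"], [("MKK", "GSS")])
print([item["name"] for item in payload["items"]])
```

Each item carries its own `name`, so the response can be matched back to the request by key rather than by position.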
README.md
CHANGED
|
@@ -46,26 +46,40 @@ Submission processing runs in 4 admin-controlled phases:
|
|
| 46 |
| Phase | Step | Status | Notes |
|
| 47 |
|---|---|---|---|
|
| 48 |
| **A** | Dispatch tasks → CPU scoring | live | HTTP POST to submitter endpoint, validate, score 5/6 components |
|
| 49 |
-
| **B** | Boltz-2 structure verification |
|
| 50 |
| **C** | LLM judge panel (28-pt hybrid) | live | 3-judge PoLL with self-exclusion, requires API key secrets |
|
| 51 |
| **D** | Finalize + publish to leaderboard | live | Aggregates hybrid scores, writes back to submissions dataset |
|
| 52 |
|
| 53 |
-
### Phase B
|
| 54 |
|
| 55 |
-
|
|
|
|
| 56 |
|
| 57 |
-
|
| 58 |
-
|
| 59 |
-
|
| 60 |
-
|
| 61 |
-
|
| 62 |
-
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`,
|
| 66 |
-
`DEEPSEEK_API_KEY`.
|
| 67 |
-
4. Restart the Space. The first build will pull ~2GB of CUDA wheels.
|
| 68 |
|
| 69 |
-
|
| 70 |
failure dict with `success=False` and an actionable error message
|
| 71 |
instead of crashing the dispatcher.
|
|
|
|
| 46 |
| Phase | Step | Status | Notes |
|
| 47 |
|---|---|---|---|
|
| 48 |
| **A** | Dispatch tasks → CPU scoring | live | HTTP POST to submitter endpoint, validate, score 5/6 components |
|
| 49 |
+
| **B** | Boltz-2 structure verification | live (Modal) | Modal-hosted A10G companion app provisions GPU on demand |
|
| 50 |
| **C** | LLM judge panel (28-pt hybrid) | live | 3-judge PoLL with self-exclusion, requires API key secrets |
|
| 51 |
| **D** | Finalize + publish to leaderboard | live | Aggregates hybrid scores, writes back to submissions dataset |
|
| 52 |
|
| 53 |
+
### Phase B architecture (Modal companion app)
|
| 54 |
|
| 55 |
+
The HF Space runs on `cpu-basic` and cannot host Boltz directly, so
|
| 56 |
+
Phase B uses a Modal-deployed sidecar (`modal_boltz_app.py`) that:
|
| 57 |
|
| 58 |
+
- pre-builds an image with `boltz==2.2.1`, `torch==2.10`, NVIDIA
|
| 59 |
+
cuequivariance kernels, and FastAPI;
|
| 60 |
+
- exposes a single web endpoint at
|
| 61 |
+
`https://<workspace>--bdb-boltz-predict.modal.run`;
|
| 62 |
+
- spins up an A10G on demand, runs `boltz predict` (via the same CLI
|
| 63 |
+
the dev pipeline uses), and returns confidence metrics;
|
| 64 |
+
- auto-stops after 5 minutes idle so the lab is only billed for active
|
| 65 |
+
inference time (~$0.06 per task at A10G rates).
|
|
|
|
|
|
|
|
|
|
| 66 |
|
| 67 |
+
The HF Space is just an HTTP client (`eval_boltz.py`); design sequences
|
| 68 |
+
are POSTed to the Modal endpoint with a shared token carried in the JSON body. To
|
| 69 |
+
deploy the sidecar (one time):
|
| 70 |
+
|
| 71 |
+
```bash
|
| 72 |
+
cd biodesignbench-leaderboard
|
| 73 |
+
modal deploy modal_boltz_app.py
|
| 74 |
+
```
|
| 75 |
+
|
| 76 |
+
Then set these HF Space secrets:
|
| 77 |
+
|
| 78 |
+
```
|
| 79 |
+
MODAL_BOLTZ_URL https://<workspace>--bdb-boltz-predict.modal.run
|
| 80 |
+
MODAL_BOLTZ_TOKEN matches the modal secret `bdb-boltz-shared` TOKEN
|
| 81 |
+
```
|
| 82 |
+
|
| 83 |
+
If `MODAL_BOLTZ_URL` is unset, Phase B predictors return a structured
|
| 84 |
failure dict with `success=False` and an actionable error message
|
| 85 |
instead of crashing the dispatcher.
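
The fallback behaviour described above can be sketched as follows. This is a simplified illustration, not the code in `eval_boltz.py`: `predict_or_fail`, `failure`, and the `env` parameter are hypothetical names introduced here so the behaviour can be exercised without touching the real environment.

```python
import os
from collections.abc import Mapping
from typing import Any, Optional

NOT_CONFIGURED = "Modal Boltz endpoint not configured; set MODAL_BOLTZ_URL."


def failure(error: str) -> dict[str, Any]:
    """Structured failure entry with zeroed metrics."""
    return {"pLDDT": 0.0, "pTM": 0.0, "success": False, "error": error}


def predict_or_fail(
    names: list[str],
    env: Optional[Mapping[str, str]] = None,
) -> dict[str, dict[str, Any]]:
    """Return one failure dict per item when the endpoint URL is unset."""
    env = os.environ if env is None else env
    url = (env.get("MODAL_BOLTZ_URL") or "").strip()
    if not url:
        return {name: failure(NOT_CONFIGURED) for name in names}
    # A real client would POST to `url` here; omitted in this sketch.
    raise NotImplementedError("network call omitted from the sketch")


print(predict_or_fail(["mono_0"], env={}))
```

Returning a dict per item, rather than raising, lets the dispatcher keep processing the remaining phases.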
|
eval_boltz.py
CHANGED
|
@@ -1,221 +1,168 @@
|
|
| 1 |
-
"""Boltz structure
|
| 2 |
|
| 3 |
-
|
| 4 |
|
| 5 |
-
Two prediction modes:
|
| 6 |
-
- Monomer
|
| 7 |
-
- Complex
|
| 8 |
|
| 9 |
-
|
|
|
|
|
|
|
| 10 |
|
| 11 |
-
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
3. HF_TOKEN secret set on the Space (for the private hidden-tasks dataset).
|
| 15 |
-
On a cpu-basic Space the predictors return a structured failure dict
|
| 16 |
-
with `success=False` and an actionable error message rather than
|
| 17 |
-
crashing the dispatcher.
|
| 18 |
"""
|
| 19 |
|
| 20 |
from __future__ import annotations
|
| 21 |
|
| 22 |
import logging
|
| 23 |
-
import
|
| 24 |
from typing import Any
|
| 25 |
|
| 26 |
logger = logging.getLogger(__name__)
|
| 27 |
|
| 28 |
-
#
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
|
|
|
| 32 |
|
| 33 |
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
_BOLTZ_NOT_INSTALLED = (
|
| 40 |
-
"Boltz / torch not available on this Space. To enable Phase B, "
|
| 41 |
-
"switch the Space hardware to ZeroGPU (zero-a10g) and uncomment the "
|
| 42 |
-
"torch + boltz lines in requirements.txt."
|
| 43 |
)
|
| 44 |
|
| 45 |
|
| 46 |
-
def
|
| 47 |
-
"""
|
| 48 |
|
| 49 |
-
Returns:
|
| 50 |
-
Dict with: pLDDT, pTM (or a structured failure dict).
|
| 51 |
-
"""
|
| 52 |
-
try:
|
| 53 |
-
import torch # noqa: F401
|
| 54 |
-
from boltz import Boltz
|
| 55 |
-
except ImportError:
|
| 56 |
-
logger.warning(_BOLTZ_NOT_INSTALLED)
|
| 57 |
-
return {
|
| 58 |
-
"pLDDT": 0.0, "pTM": 0.0,
|
| 59 |
-
"success": False, "error": _BOLTZ_NOT_INSTALLED,
|
| 60 |
-
}
|
| 61 |
-
try:
|
| 62 |
-
model = Boltz.from_pretrained("boltz2")
|
| 63 |
-
result = model.predict(sequence)
|
| 64 |
|
| 65 |
-
|
| 66 |
-
|
| 67 |
|
| 68 |
-
|
| 69 |
-
|
| 70 |
-
|
| 71 |
-
|
| 72 |
-
}
|
| 73 |
-
|
| 74 |
-
logger.error(f"Boltz monomer prediction failed: {e}")
|
| 75 |
-
return {"pLDDT": 0.0, "pTM": 0.0, "success": False, "error": str(e)}
|
| 76 |
|
| 77 |
|
| 78 |
-
def
|
| 79 |
-
|
| 80 |
-
target_seq: str,
|
| 81 |
-
) -> dict[str, float]:
|
| 82 |
-
"""Predict complex structure and binding metrics using Boltz.
|
| 83 |
|
| 84 |
-
Returns
|
| 85 |
-
|
| 86 |
"""
|
| 87 |
try:
|
| 88 |
-
import
|
| 89 |
-
from boltz import Boltz
|
| 90 |
except ImportError:
|
| 91 |
-
logger.warning(_BOLTZ_NOT_INSTALLED)
|
| 92 |
return {
|
| 93 |
-
"
|
| 94 |
-
|
| 95 |
}
|
| 96 |
-
try:
|
| 97 |
-
model = Boltz.from_pretrained("boltz2")
|
| 98 |
-
result = model.predict([binder_seq, target_seq])
|
| 99 |
|
| 100 |
-
|
| 101 |
-
|
| 102 |
-
iptm = float(result.confidence.iptm) if hasattr(result.confidence, "iptm") else 0.0
|
| 103 |
-
ipae = float(result.confidence.ipae) if hasattr(result.confidence, "ipae") else 0.0
|
| 104 |
|
| 105 |
-
|
| 106 |
-
|
| 107 |
-
|
| 108 |
-
|
| 109 |
-
"i_pAE": round(ipae, 2),
|
| 110 |
-
"success": True,
|
| 111 |
-
}
|
| 112 |
except Exception as e:
|
| 113 |
-
|
|
|
|
|
|
|
| 114 |
return {
|
| 115 |
-
"
|
| 116 |
-
|
| 117 |
}
|
| 118 |
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
|
| 122 |
-
|
| 123 |
-
|
| 124 |
-
|
| 125 |
-
|
| 126 |
-
|
| 127 |
-
|
| 128 |
-
|
| 129 |
-
|
| 130 |
-
|
| 131 |
-
|
| 132 |
-
|
| 133 |
-
|
| 134 |
-
|
| 135 |
-
|
| 136 |
-
|
| 137 |
-
|
| 138 |
-
|
| 139 |
-
|
| 140 |
-
|
| 141 |
-
|
| 142 |
-
|
| 143 |
-
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
|
| 147 |
-
|
| 148 |
-
|
| 149 |
-
|
| 150 |
-
|
| 151 |
-
|
| 152 |
-
|
| 153 |
-
|
| 154 |
-
|
| 155 |
-
|
| 156 |
-
|
| 157 |
-
|
| 158 |
-
results.append(_predict_complex(binder, target))
|
| 159 |
-
return results
|
| 160 |
-
|
| 161 |
-
except ImportError:
|
| 162 |
-
# Not running on HF Spaces -- provide un-decorated versions
|
| 163 |
-
def predict_monomer_batch(sequences: list[str]) -> list[dict[str, float]]:
|
| 164 |
-
return [_predict_monomer(seq) for seq in sequences[:MONOMER_CHUNK_SIZE]]
|
| 165 |
-
|
| 166 |
-
def predict_complex_batch(
|
| 167 |
-
pairs: list[tuple[str, str]],
|
| 168 |
-
) -> list[dict[str, float]]:
|
| 169 |
-
return [_predict_complex(b, t) for b, t in pairs[:COMPLEX_CHUNK_SIZE]]
|
| 170 |
-
|
| 171 |
-
|
| 172 |
-
# ---------------------------------------------------------------------------
|
| 173 |
-
# High-level assessment API
|
| 174 |
-
# ---------------------------------------------------------------------------
|
| 175 |
|
| 176 |
|
| 177 |
def run_boltz_posteval(
|
| 178 |
per_task_results: dict[str, dict[str, Any]],
|
| 179 |
progress_callback=None,
|
| 180 |
) -> dict[str, dict[str, Any]]:
|
| 181 |
-
"""Run Boltz post-assessment on
|
| 182 |
|
| 183 |
-
For each task:
|
| 184 |
-
- Non-binding: pick
|
| 185 |
-
- Binding: pick
|
| 186 |
- Merge Boltz metrics into existing results
|
| 187 |
-
- Re-score quality component
|
| 188 |
-
|
| 189 |
-
Args:
|
| 190 |
-
per_task_results: Dict of task_id -> dispatch result (from dispatcher).
|
| 191 |
-
progress_callback: Optional callback(task_id, i, total, metrics).
|
| 192 |
-
|
| 193 |
-
Returns:
|
| 194 |
-
Updated per_task_results with Boltz metrics and final quality scores.
|
| 195 |
"""
|
| 196 |
-
from eval_scorer import _is_binding_task
|
| 197 |
|
| 198 |
-
|
| 199 |
-
|
| 200 |
-
complex_tasks = []
|
| 201 |
|
| 202 |
for task_id, result in per_task_results.items():
|
| 203 |
if not result.get("success") or not result.get("quality_pending"):
|
| 204 |
continue
|
| 205 |
-
|
| 206 |
sequences = result.get("sequences", [])
|
| 207 |
if not sequences:
|
| 208 |
continue
|
| 209 |
-
|
| 210 |
-
best_seq = sequences[0] # Use first design for Boltz
|
| 211 |
|
| 212 |
if _is_binding_task(task_id):
|
| 213 |
-
|
| 214 |
-
|
|
|
|
| 215 |
if target_seq:
|
| 216 |
complex_tasks.append((task_id, best_seq, target_seq))
|
| 217 |
else:
|
| 218 |
-
# Fall back to monomer if no target
|
| 219 |
monomer_tasks.append((task_id, best_seq))
|
| 220 |
else:
|
| 221 |
monomer_tasks.append((task_id, best_seq))
|
|
@@ -223,32 +170,24 @@ def run_boltz_posteval(
|
|
| 223 |
total = len(monomer_tasks) + len(complex_tasks)
|
| 224 |
done = 0
|
| 225 |
|
| 226 |
-
# Process monomer tasks in chunks
|
| 227 |
for chunk_start in range(0, len(monomer_tasks), MONOMER_CHUNK_SIZE):
|
| 228 |
chunk = monomer_tasks[chunk_start:chunk_start + MONOMER_CHUNK_SIZE]
|
| 229 |
seqs = [seq for _, seq in chunk]
|
| 230 |
-
|
| 231 |
boltz_results = predict_monomer_batch(seqs)
|
| 232 |
-
|
| 233 |
for (task_id, _), metrics in zip(chunk, boltz_results):
|
| 234 |
if metrics.get("success"):
|
| 235 |
_merge_boltz_metrics(per_task_results[task_id], metrics)
|
| 236 |
-
|
| 237 |
done += 1
|
| 238 |
if progress_callback:
|
| 239 |
progress_callback(task_id, done, total, metrics)
|
| 240 |
|
| 241 |
-
# Process complex tasks in chunks
|
| 242 |
for chunk_start in range(0, len(complex_tasks), COMPLEX_CHUNK_SIZE):
|
| 243 |
chunk = complex_tasks[chunk_start:chunk_start + COMPLEX_CHUNK_SIZE]
|
| 244 |
pairs = [(binder, target) for _, binder, target in chunk]
|
| 245 |
-
|
| 246 |
boltz_results = predict_complex_batch(pairs)
|
| 247 |
-
|
| 248 |
for (task_id, _, _), metrics in zip(chunk, boltz_results):
|
| 249 |
if metrics.get("success"):
|
| 250 |
_merge_boltz_metrics(per_task_results[task_id], metrics)
|
| 251 |
-
|
| 252 |
done += 1
|
| 253 |
if progress_callback:
|
| 254 |
progress_callback(task_id, done, total, metrics)
|
|
@@ -258,21 +197,16 @@ def run_boltz_posteval(
|
|
| 258 |
|
| 259 |
def _merge_boltz_metrics(
|
| 260 |
task_result: dict[str, Any],
|
| 261 |
-
boltz_metrics: dict[str,
|
| 262 |
) -> None:
|
| 263 |
-
"""Merge Boltz prediction metrics into a task result and re-score quality.
|
| 264 |
-
|
| 265 |
-
Modifies task_result in-place.
|
| 266 |
-
"""
|
| 267 |
from eval_scorer import apply_design_gate, score_quality
|
| 268 |
|
| 269 |
-
# Merge Boltz metrics with any agent-reported metrics
|
| 270 |
merged_metrics = task_result.get("agent_metrics", {}).copy()
|
| 271 |
for key in ("pLDDT", "pTM", "ipTM", "i_pAE"):
|
| 272 |
if key in boltz_metrics and boltz_metrics[key] > 0:
|
| 273 |
merged_metrics[key] = boltz_metrics[key]
|
| 274 |
|
| 275 |
-
# Re-score quality with Boltz metrics
|
| 276 |
quality_result = score_quality(
|
| 277 |
agent_metrics=merged_metrics,
|
| 278 |
thresholds=task_result.get("ground_truth_thresholds", {}),
|
|
@@ -281,15 +215,11 @@ def _merge_boltz_metrics(
|
|
| 281 |
oracle_sequences=task_result.get("oracle_sequences"),
|
| 282 |
)
|
| 283 |
|
| 284 |
-
# Update scores
|
| 285 |
task_result["boltz_metrics"] = boltz_metrics
|
| 286 |
task_result["quality_pending"] = False
|
| 287 |
|
| 288 |
if "cpu_scores" in task_result:
|
| 289 |
task_result["cpu_scores"]["quality"] = quality_result["score"]
|
| 290 |
-
|
| 291 |
-
# Compute final gated score
|
| 292 |
-
if "cpu_scores" in task_result:
|
| 293 |
component_scores = dict(task_result["cpu_scores"])
|
| 294 |
gated = apply_design_gate(component_scores, task_result.get("num_designs", 0))
|
| 295 |
task_result["final_scores"] = gated
|
|
|
|
| 1 |
+
"""Boltz-2 structure verification client (Phase B).
|
| 2 |
|
| 3 |
+
The HF Space leaderboard runs on cpu-basic, so it cannot host Boltz
|
| 4 |
+
directly. This module is a thin HTTP client that POSTs design sequences
|
| 5 |
+
to a Modal-deployed companion app (`modal_boltz_app.py`), which
|
| 6 |
+
provisions an A10G on demand, runs `boltz predict`, and returns
|
| 7 |
+
confidence metrics.
|
| 8 |
|
| 9 |
+
Two prediction modes (selected automatically by `run_boltz_posteval`):
|
| 10 |
+
- Monomer (non-binding tasks) -> pLDDT, pTM
|
| 11 |
+
- Complex (binding tasks) -> pLDDT, pTM, ipTM, i_pAE
|
| 12 |
|
| 13 |
+
Required HF Space secrets (set out-of-band via the leaderboard admin):
|
| 14 |
+
MODAL_BOLTZ_URL https://<workspace>--bdb-boltz-predict.modal.run
|
| 15 |
+
MODAL_BOLTZ_TOKEN shared bearer token matching the modal secret TOKEN
|
| 16 |
|
| 17 |
+
If `MODAL_BOLTZ_URL` is unset the predictors return a structured
|
| 18 |
+
failure dict with `success=False` and an actionable error message
|
| 19 |
+
rather than crashing the dispatcher.
|
| 20 |
"""
|
| 21 |
|
| 22 |
from __future__ import annotations
|
| 23 |
|
| 24 |
import logging
|
| 25 |
+
import os
|
| 26 |
from typing import Any
|
| 27 |
|
| 28 |
logger = logging.getLogger(__name__)
|
| 29 |
|
| 30 |
+
# Batch sizes large enough to amortize Modal cold-start, small enough
|
| 31 |
+
# to stay under the 1700s request timeout (the Modal function allows 1800s).
|
| 32 |
+
MONOMER_CHUNK_SIZE = 20
|
| 33 |
+
COMPLEX_CHUNK_SIZE = 10
|
| 34 |
+
HTTP_TIMEOUT_SEC = 1700
|
| 35 |
|
| 36 |
|
| 37 |
+
_NOT_CONFIGURED = (
|
| 38 |
+
"Modal Boltz endpoint not configured. Set MODAL_BOLTZ_URL (and "
|
| 39 |
+
"MODAL_BOLTZ_TOKEN) on the HF Space, or deploy the companion app "
|
| 40 |
+
"with `modal deploy modal_boltz_app.py`."
|
| 41 |
)
|
| 42 |
|
| 43 |
|
| 44 |
+
def _modal_url() -> str | None:
|
| 45 |
+
return os.environ.get("MODAL_BOLTZ_URL", "").strip() or None
|
| 46 |
|
| 47 |
|
| 48 |
+
def _modal_token() -> str:
|
| 49 |
+
return os.environ.get("MODAL_BOLTZ_TOKEN", "").strip()
|
| 50 |
|
| 51 |
+
|
| 52 |
+
def _failure(error: str, complex_keys: bool = False) -> dict[str, Any]:
|
| 53 |
+
out = {"pLDDT": 0.0, "pTM": 0.0, "success": False, "error": error}
|
| 54 |
+
if complex_keys:
|
| 55 |
+
out.update({"ipTM": 0.0, "i_pAE": 0.0})
|
| 56 |
+
return out
|
|
|
|
|
|
|
| 57 |
|
| 58 |
|
| 59 |
+
def _post_predictions(items: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
|
| 60 |
+
"""POST a list of prediction items to the Modal endpoint.
|
|
|
|
|
|
|
|
|
|
| 61 |
|
| 62 |
+
Returns a dict mapping each item's `name` to a metric dict, with
|
| 63 |
+
structured failure entries on error.
|
| 64 |
"""
|
| 65 |
+
url = _modal_url()
|
| 66 |
+
if not url:
|
| 67 |
+
return {item["name"]: _failure(_NOT_CONFIGURED) for item in items}
|
| 68 |
+
|
| 69 |
try:
|
| 70 |
+
import httpx
|
|
|
|
| 71 |
except ImportError:
|
|
|
|
| 72 |
return {
|
| 73 |
+
item["name"]: _failure("httpx not installed in leaderboard image")
|
| 74 |
+
for item in items
|
| 75 |
}
|
|
|
|
|
|
|
|
|
|
| 76 |
|
| 77 |
+
headers = {"Content-Type": "application/json"}
|
| 78 |
+
payload = {"token": _modal_token(), "items": items}
|
|
|
|
|
|
|
| 79 |
|
| 80 |
+
try:
|
| 81 |
+
resp = httpx.post(
|
| 82 |
+
# Follow redirects: Modal may answer long-running web requests with a 303.
url, json=payload, headers=headers, timeout=HTTP_TIMEOUT_SEC, follow_redirects=True,
|
| 83 |
+
)
|
|
|
|
|
|
|
|
|
|
| 84 |
except Exception as e:
|
| 85 |
+
return {item["name"]: _failure(f"Modal POST failed: {e}") for item in items}
|
| 86 |
+
|
| 87 |
+
if resp.status_code != 200:
|
| 88 |
return {
|
| 89 |
+
item["name"]: _failure(f"Modal HTTP {resp.status_code}: {resp.text[:200]}")
|
| 90 |
+
for item in items
|
| 91 |
}
|
| 92 |
|
| 93 |
+
try:
|
| 94 |
+
body = resp.json()
|
| 95 |
+
except Exception as e:
|
| 96 |
+
return {item["name"]: _failure(f"Modal returned non-JSON: {e}") for item in items}
|
| 97 |
+
|
| 98 |
+
if "error" in body:
|
| 99 |
+
msg = body["error"]
|
| 100 |
+
return {item["name"]: _failure(f"Modal: {msg}") for item in items}
|
| 101 |
+
|
| 102 |
+
results = body.get("results", {})
|
| 103 |
+
out: dict[str, dict[str, Any]] = {}
|
| 104 |
+
for item in items:
|
| 105 |
+
name = item["name"]
|
| 106 |
+
out[name] = results.get(name) or _failure(
|
| 107 |
+
"Modal returned no result for this item"
|
| 108 |
+
)
|
| 109 |
+
return out
|
| 110 |
+
|
| 111 |
+
|
| 112 |
+
def predict_monomer_batch(sequences: list[str]) -> list[dict[str, Any]]:
|
| 113 |
+
"""Predict structures for a batch of monomer sequences."""
|
| 114 |
+
items = [
|
| 115 |
+
{"name": f"mono_{i}", "kind": "monomer", "sequences": [seq]}
|
| 116 |
+
for i, seq in enumerate(sequences[:MONOMER_CHUNK_SIZE])
|
| 117 |
+
]
|
| 118 |
+
by_name = _post_predictions(items)
|
| 119 |
+
return [by_name[item["name"]] for item in items]
|
| 120 |
+
|
| 121 |
+
|
| 122 |
+
def predict_complex_batch(
|
| 123 |
+
pairs: list[tuple[str, str]],
|
| 124 |
+
) -> list[dict[str, Any]]:
|
| 125 |
+
"""Predict structures for a batch of (binder, target) pairs."""
|
| 126 |
+
items = [
|
| 127 |
+
{"name": f"cmplx_{i}", "kind": "complex", "sequences": [b, t]}
|
| 128 |
+
for i, (b, t) in enumerate(pairs[:COMPLEX_CHUNK_SIZE])
|
| 129 |
+
]
|
| 130 |
+
by_name = _post_predictions(items)
|
| 131 |
+
return [by_name[item["name"]] for item in items]
|
| 132 |
|
| 133 |
|
| 134 |
def run_boltz_posteval(
|
| 135 |
per_task_results: dict[str, dict[str, Any]],
|
| 136 |
progress_callback=None,
|
| 137 |
) -> dict[str, dict[str, Any]]:
|
| 138 |
+
"""Run Boltz post-assessment on every task that needs it.
|
| 139 |
|
| 140 |
+
For each successful task:
|
| 141 |
+
- Non-binding: pick the first design -> monomer prediction
|
| 142 |
+
- Binding: pick the first design + target sequence -> complex prediction
|
| 143 |
- Merge Boltz metrics into existing results
|
| 144 |
+
- Re-score the quality component
|
| 145 |
"""
|
| 146 |
+
from eval_scorer import _is_binding_task
|
| 147 |
|
| 148 |
+
monomer_tasks: list[tuple[str, str]] = []
|
| 149 |
+
complex_tasks: list[tuple[str, str, str]] = []
|
|
|
|
| 150 |
|
| 151 |
for task_id, result in per_task_results.items():
|
| 152 |
if not result.get("success") or not result.get("quality_pending"):
|
| 153 |
continue
|
|
|
|
| 154 |
sequences = result.get("sequences", [])
|
| 155 |
if not sequences:
|
| 156 |
continue
|
| 157 |
+
best_seq = sequences[0]
|
|
|
|
| 158 |
|
| 159 |
if _is_binding_task(task_id):
|
| 160 |
+
target_seq = (
|
| 161 |
+
result.get("ground_truth_thresholds", {}).get("target_sequence")
|
| 162 |
+
)
|
| 163 |
if target_seq:
|
| 164 |
complex_tasks.append((task_id, best_seq, target_seq))
|
| 165 |
else:
|
|
|
|
| 166 |
monomer_tasks.append((task_id, best_seq))
|
| 167 |
else:
|
| 168 |
monomer_tasks.append((task_id, best_seq))
|
|
|
|
| 170 |
total = len(monomer_tasks) + len(complex_tasks)
|
| 171 |
done = 0
|
| 172 |
|
|
|
|
| 173 |
for chunk_start in range(0, len(monomer_tasks), MONOMER_CHUNK_SIZE):
|
| 174 |
chunk = monomer_tasks[chunk_start:chunk_start + MONOMER_CHUNK_SIZE]
|
| 175 |
seqs = [seq for _, seq in chunk]
|
|
|
|
| 176 |
boltz_results = predict_monomer_batch(seqs)
|
|
|
|
| 177 |
for (task_id, _), metrics in zip(chunk, boltz_results):
|
| 178 |
if metrics.get("success"):
|
| 179 |
_merge_boltz_metrics(per_task_results[task_id], metrics)
|
|
|
|
| 180 |
done += 1
|
| 181 |
if progress_callback:
|
| 182 |
progress_callback(task_id, done, total, metrics)
|
| 183 |
|
|
|
|
| 184 |
for chunk_start in range(0, len(complex_tasks), COMPLEX_CHUNK_SIZE):
|
| 185 |
chunk = complex_tasks[chunk_start:chunk_start + COMPLEX_CHUNK_SIZE]
|
| 186 |
pairs = [(binder, target) for _, binder, target in chunk]
|
|
|
|
| 187 |
boltz_results = predict_complex_batch(pairs)
|
|
|
|
| 188 |
for (task_id, _, _), metrics in zip(chunk, boltz_results):
|
| 189 |
if metrics.get("success"):
|
| 190 |
_merge_boltz_metrics(per_task_results[task_id], metrics)
|
|
|
|
| 191 |
done += 1
|
| 192 |
if progress_callback:
|
| 193 |
progress_callback(task_id, done, total, metrics)
|
|
|
|
| 197 |
|
| 198 |
def _merge_boltz_metrics(
|
| 199 |
task_result: dict[str, Any],
|
| 200 |
+
boltz_metrics: dict[str, Any],
|
| 201 |
) -> None:
|
| 202 |
+
"""Merge Boltz prediction metrics into a task result and re-score quality."""
|
|
|
|
|
|
|
|
|
|
| 203 |
from eval_scorer import apply_design_gate, score_quality
|
| 204 |
|
|
|
|
| 205 |
merged_metrics = task_result.get("agent_metrics", {}).copy()
|
| 206 |
for key in ("pLDDT", "pTM", "ipTM", "i_pAE"):
|
| 207 |
if key in boltz_metrics and boltz_metrics[key] > 0:
|
| 208 |
merged_metrics[key] = boltz_metrics[key]
|
| 209 |
|
|
|
|
| 210 |
quality_result = score_quality(
|
| 211 |
agent_metrics=merged_metrics,
|
| 212 |
thresholds=task_result.get("ground_truth_thresholds", {}),
|
|
|
|
| 215 |
oracle_sequences=task_result.get("oracle_sequences"),
|
| 216 |
)
|
| 217 |
|
|
|
|
| 218 |
task_result["boltz_metrics"] = boltz_metrics
|
| 219 |
task_result["quality_pending"] = False
|
| 220 |
|
| 221 |
if "cpu_scores" in task_result:
|
| 222 |
task_result["cpu_scores"]["quality"] = quality_result["score"]
|
|
|
|
|
|
|
|
|
|
| 223 |
component_scores = dict(task_result["cpu_scores"])
|
| 224 |
gated = apply_design_gate(component_scores, task_result.get("num_designs", 0))
|
| 225 |
task_result["final_scores"] = gated
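
The merge rule in `_merge_boltz_metrics` (a Boltz value replaces the agent-reported one only when it is positive) can be illustrated in isolation. This is a standalone sketch: `merge_metrics` is a hypothetical name, and the scoring and gating calls into `eval_scorer` are deliberately left out.

```python
from typing import Any

BOLTZ_KEYS = ("pLDDT", "pTM", "ipTM", "i_pAE")


def merge_metrics(agent: dict[str, Any], boltz: dict[str, Any]) -> dict[str, Any]:
    """Overlay positive Boltz metrics on a copy of the agent-reported metrics."""
    merged = dict(agent)
    for key in BOLTZ_KEYS:
        value = boltz.get(key, 0)
        # Zero means "not computed" (e.g. ipTM for a monomer), so keep the agent value.
        if value > 0:
            merged[key] = value
    return merged


agent = {"pLDDT": 80.0, "ipTM": 0.7, "notes": "self-reported"}
boltz = {"pLDDT": 93.89, "pTM": 0.9194, "ipTM": 0.0, "i_pAE": 0.0, "success": True}
print(merge_metrics(agent, boltz))
```

One consequence of this rule is that a monomer prediction, which reports `ipTM` and `i_pAE` as 0.0, leaves any agent-reported interface metrics untouched.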
|
modal_boltz_app.py
ADDED
|
@@ -0,0 +1,270 @@
|
|
| 1 |
+
"""Modal app: Boltz-2 structure prediction for BioDesignBench Phase B.
|
| 2 |
+
|
| 3 |
+
This is the GPU-side companion to `eval_boltz.py`. The HF Space leaderboard
|
| 4 |
+
runs on cpu-basic, so it cannot host Boltz directly; instead it POSTs design
|
| 5 |
+
sequences to this Modal app, which spins up an A10G on demand, runs
|
| 6 |
+
`boltz predict`, and returns confidence metrics.
|
| 7 |
+
|
| 8 |
+
Setup (one-time, on a machine with `pip install modal`):
|
| 9 |
+
|
| 10 |
+
modal token new # if you don't have a token yet
|
| 11 |
+
cd biodesignbench-leaderboard
|
| 12 |
+
modal deploy modal_boltz_app.py
|
| 13 |
+
|
| 14 |
+
After deploy Modal prints a URL like
|
| 15 |
+
https://<workspace>--bdb-boltz-predict.modal.run
|
| 16 |
+
|
| 17 |
+
Add that URL plus a shared secret to the HF Space secrets:
|
| 18 |
+
MODAL_BOLTZ_URL = https://<workspace>--bdb-boltz-predict.modal.run
|
| 19 |
+
MODAL_BOLTZ_TOKEN = <random 32-byte hex>
|
| 20 |
+
|
| 21 |
+
Cost: A10G is billed per-second, container auto-stops after
|
| 22 |
+
`scaledown_window` seconds (300 here). With one submission per month and
|
| 23 |
+
~76 tasks * ~30s = ~38min GPU per submission, expected spend is
|
| 24 |
+
well within Modal's free tier.
|
| 25 |
+
"""
|
| 26 |
+
|
| 27 |
+
from __future__ import annotations
|
| 28 |
+
|
| 29 |
+
import os
|
| 30 |
+
|
| 31 |
+
import modal
|
| 32 |
+
|
| 33 |
+
APP_NAME = "bdb-boltz"
|
| 34 |
+
ENDPOINT_LABEL = "bdb-boltz-predict"
|
| 35 |
+
|
| 36 |
+
app = modal.App(APP_NAME)
|
| 37 |
+
|
| 38 |
+
# Persistent volume for Boltz-2 model weights (~6GB, downloaded on first call)
|
| 39 |
+
weights_volume = modal.Volume.from_name(
|
| 40 |
+
"bdb-boltz-weights", create_if_missing=True
|
| 41 |
+
)
|
| 42 |
+
|
| 43 |
+
# Boltz GPU image. Boltz-2 is published on PyPI as `boltz` and pulls a
|
| 44 |
+
# CUDA-12 torch wheel automatically.
|
| 45 |
+
gpu_image = (
|
| 46 |
+
modal.Image.from_registry(
|
| 47 |
+
"nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04",
|
| 48 |
+
add_python="3.11",
|
| 49 |
+
)
|
| 50 |
+
.apt_install("git", "wget", "build-essential")
|
| 51 |
+
# Boltz-2 (>=2.2) uses NVIDIA cuequivariance for the triangular-multiply
|
| 52 |
+
# kernel and requires CUDA 12.5+. We pin a torch whose bundled
|
| 53 |
+
# nvidia-cublas-cu12 satisfies cuequivariance's >=12.5 constraint.
|
| 54 |
+
.pip_install(
|
| 55 |
+
# Match dev's known-working stack: torch 2.10 ships nvidia-cublas-cu12
|
| 56 |
+
# 12.8 which satisfies cuequivariance>=12.5 requirement.
|
| 57 |
+
"torch==2.10.0",
|
| 58 |
+
"boltz==2.2.1",
|
| 59 |
+
"cuequivariance==0.9.0",
|
| 60 |
+
"cuequivariance-torch==0.9.0",
|
| 61 |
+
"cuequivariance-ops-cu12==0.9.0",
|
| 62 |
+
"cuequivariance-ops-torch-cu12==0.9.0",
|
| 63 |
+
"fastapi[standard]",
|
| 64 |
+
"pyyaml",
|
| 65 |
+
"numpy",
|
| 66 |
+
)
|
| 67 |
+
.env(
|
| 68 |
+
{
|
| 69 |
+
"BOLTZ_CACHE": "/weights",
|
| 70 |
+
"TORCH_HOME": "/weights/torch",
|
| 71 |
+
"HF_HOME": "/weights/hf",
|
| 72 |
+
}
|
| 73 |
+
)
|
| 74 |
+
)
|
| 75 |
+
|
| 76 |
+
|
| 77 |
+
# ---------------------------------------------------------------------------
|
| 78 |
+
# Internal: write YAMLs, run boltz predict, parse outputs
|
| 79 |
+
# ---------------------------------------------------------------------------
|
| 80 |
+
|
| 81 |
+
|
| 82 |
+
def _write_yaml(item: dict) -> str:
|
| 83 |
+
"""Render one prediction item to a Boltz YAML string.
|
| 84 |
+
|
| 85 |
+
item shape:
|
| 86 |
+
{"name": "task_001",
|
| 87 |
+
"kind": "monomer" | "complex",
|
| 88 |
+
"sequences": ["MKKL...", ...]} # 1 for monomer, 2 for complex
|
| 89 |
+
"""
|
| 90 |
+
seqs = item.get("sequences") or []
|
| 91 |
+
chain_ids = ["A", "B", "C", "D", "E"]
|
| 92 |
+
lines = ["sequences:"]
|
| 93 |
+
for i, seq in enumerate(seqs):
|
| 94 |
+
cid = chain_ids[i] if i < len(chain_ids) else f"X{i}"
|
| 95 |
+
lines.append(" - protein:")
|
| 96 |
+
lines.append(f" id: {cid}")
|
| 97 |
+
lines.append(f" sequence: {seq}")
|
| 98 |
+
return "\n".join(lines) + "\n"
|
| 99 |
+
|
| 100 |
+
|
| 101 |
+
def _parse_confidence(pred_dir) -> dict:
|
| 102 |
+
"""Parse a Boltz prediction directory into a flat metric dict."""
|
| 103 |
+
import json
|
| 104 |
+
from pathlib import Path
|
| 105 |
+
|
| 106 |
+
import numpy as np
|
| 107 |
+
|
| 108 |
+
out = {
|
| 109 |
+
"pLDDT": 0.0, "pTM": 0.0, "ipTM": 0.0, "i_pAE": 0.0,
|
| 110 |
+
"success": False,
|
| 111 |
+
}
|
| 112 |
+
pred_dir = Path(pred_dir)
|
| 113 |
+
|
| 114 |
+
conf_files = list(pred_dir.rglob("confidence*.json"))
|
| 115 |
+
if conf_files:
|
| 116 |
+
try:
|
| 117 |
+
with open(conf_files[0]) as f:
|
| 118 |
+
c = json.load(f)
|
| 119 |
+
out["pLDDT"] = round(float(c.get("complex_plddt", 0.0)) * 100, 2)
|
| 120 |
+
out["pTM"] = round(float(c.get("ptm", 0.0)), 4)
|
| 121 |
+
out["ipTM"] = round(float(c.get("iptm", 0.0)), 4)
|
| 122 |
+
out["i_pAE"] = round(float(c.get("complex_ipae", 0.0)), 2)
|
| 123 |
+
out["success"] = True
|
| 124 |
+
except Exception:
|
| 125 |
+
pass
|
| 126 |
+
|
| 127 |
+
if not out["success"]:
|
| 128 |
+
# Fall back to per-residue plddt npz if confidence.json is missing
|
| 129 |
+
plddt_files = list(pred_dir.rglob("plddt*.npz"))
|
| 130 |
+
if plddt_files:
|
| 131 |
+
try:
|
| 132 |
+
arr = np.load(plddt_files[0])["plddt"]
|
| 133 |
+
out["pLDDT"] = round(float(arr.mean()) * 100, 2)
|
| 134 |
+
out["success"] = True
|
| 135 |
+
except Exception:
|
| 136 |
+
pass
|
| 137 |
+
|
| 138 |
+
return out
|
| 139 |
+
|
| 140 |
+
|
| 141 |
+
# ---------------------------------------------------------------------------
|
| 142 |
+
# GPU entry point — single web endpoint handling both monomer and complex
|
| 143 |
+
# ---------------------------------------------------------------------------
|
| 144 |
+
|
| 145 |
+
|
| 146 |
+
@app.function(
|
| 147 |
+
image=gpu_image,
|
| 148 |
+
gpu="A10G",
|
| 149 |
+
volumes={"/weights": weights_volume},
|
| 150 |
+
timeout=1800,
|
| 151 |
+
scaledown_window=300,
|
| 152 |
+
secrets=[modal.Secret.from_name("bdb-boltz-shared", required_keys=["TOKEN"])],
|
| 153 |
+
)
|
| 154 |
+
@modal.fastapi_endpoint(method="POST", label=ENDPOINT_LABEL)
|
| 155 |
+
def predict(payload: dict) -> dict:
|
| 156 |
+
"""Run Boltz-2 on a list of prediction items.
|
| 157 |
+
|
| 158 |
+
Body shape:
|
| 159 |
+
{"token": "<shared secret>",
|
| 160 |
+
"items": [{"name": "...", "kind": "monomer"|"complex",
|
| 161 |
+
"sequences": [...]}, ...]}
|
| 162 |
+
|
| 163 |
+
The list is assembled into a single ``boltz predict`` invocation so
|
| 164 |
+
the model loads only once per call (amortizes ~30s cold start).
|
| 165 |
+
|
| 166 |
+
Returns a dict mapping each item's `name` to a metric dict:
|
| 167 |
+
{"pLDDT", "pTM", "ipTM", "i_pAE", "success"}
|
| 168 |
+
"""
|
| 169 |
+
import shutil
|
| 170 |
+
import subprocess
|
| 171 |
+
import tempfile
|
| 172 |
+
from pathlib import Path
|
| 173 |
+
|
| 174 |
+
expected_token = os.environ.get("TOKEN", "")
|
| 175 |
+
if expected_token and (payload.get("token") or "") != expected_token:
|
| 176 |
+
return {"error": "Unauthorized -- bad MODAL_BOLTZ_TOKEN"}
|
| 177 |
+
|
| 178 |
+
items = payload.get("items") or []
|
| 179 |
+
if not items:
|
| 180 |
+
return {"results": {}}
|
| 181 |
+
|
| 182 |
+
work = Path(tempfile.mkdtemp(prefix="bdb_boltz_"))
|
| 183 |
+
in_dir = work / "inputs"
|
| 184 |
+
out_dir = work / "out"
|
| 185 |
+
in_dir.mkdir()
|
| 186 |
+
out_dir.mkdir()
|
| 187 |
+
|
| 188 |
+
name_to_yaml: dict[str, str] = {}
|
| 189 |
+
for i, item in enumerate(items):
|
| 190 |
+
name = str(item.get("name") or f"item_{i:04d}")
|
| 191 |
+
safe = "".join(c if c.isalnum() else "_" for c in name)[:60]
|
| 192 |
+
yaml_name = f"{i:04d}_{safe}"
|
| 193 |
+
(in_dir / f"{yaml_name}.yaml").write_text(_write_yaml(item))
|
| 194 |
+
name_to_yaml[name] = yaml_name
|
| 195 |
+
|
| 196 |
+
cmd = [
|
| 197 |
+
"boltz", "predict",
|
| 198 |
+
str(in_dir),
|
| 199 |
+
"--out_dir", str(out_dir),
|
| 200 |
+
"--cache", "/weights/boltz_cache",
|
| 201 |
+
"--diffusion_samples", "1",
|
| 202 |
+
"--output_format", "pdb",
|
| 203 |
+
"--use_msa_server",
|
| 204 |
+
]
|
| 205 |
+
|
| 206 |
+
proc = subprocess.run(
|
| 207 |
+
cmd, capture_output=True, text=True, timeout=1700, cwd=str(work),
|
| 208 |
+
)
|
| 209 |
+
|
| 210 |
+
# Persist downloaded model weights to the shared volume
|
| 211 |
+
try:
|
| 212 |
+
weights_volume.commit()
|
| 213 |
+
except Exception:
|
| 214 |
+
pass
|
| 215 |
+
|
| 216 |
+
if proc.returncode != 0:
|
| 217 |
+
shutil.rmtree(str(work), ignore_errors=True)
|
| 218 |
+
return {
|
| 219 |
+
"error": "boltz predict failed",
|
| 220 |
+
"stderr": proc.stderr[-2000:],
|
| 221 |
+
"stdout": proc.stdout[-2000:],
|
| 222 |
+
}
|
| 223 |
+
|
| 224 |
+
# boltz writes outputs to out/boltz_results_inputs/predictions/<name>/
|
| 225 |
+
predictions_root = None
|
| 226 |
+
for p in out_dir.rglob("predictions"):
|
| 227 |
+
if p.is_dir():
|
| 228 |
+
predictions_root = p
|
| 229 |
+
break
|
| 230 |
+
|
| 231 |
+
results: dict[str, dict] = {}
|
| 232 |
+
if predictions_root is not None:
|
| 233 |
+
for name, yaml_name in name_to_yaml.items():
|
| 234 |
+
pred_dirs = [
|
| 235 |
+
d for d in predictions_root.iterdir()
|
| 236 |
+
if d.is_dir() and d.name.startswith(yaml_name)
|
| 237 |
+
]
|
| 238 |
+
if pred_dirs:
|
| 239 |
+
results[name] = _parse_confidence(pred_dirs[0])
|
| 240 |
+
else:
|
| 241 |
+
results[name] = {
|
| 242 |
+
"pLDDT": 0.0, "pTM": 0.0, "ipTM": 0.0, "i_pAE": 0.0,
|
| 243 |
+
"success": False, "error": "prediction missing",
|
| 244 |
+
}
|
| 245 |
+
|
| 246 |
+
shutil.rmtree(str(work), ignore_errors=True)
|
| 247 |
+
return {"results": results}
|
| 248 |
+
|
| 249 |
+
|
| 250 |
+
# ---------------------------------------------------------------------------
|
| 251 |
+
# CLI smoke test: modal run modal_boltz_app.py
|
| 252 |
+
# ---------------------------------------------------------------------------
|
| 253 |
+
|
| 254 |
+
|
| 255 |
+
@app.local_entrypoint()
|
| 256 |
+
def main():
|
| 257 |
+
"""Quick sanity check — a short ubiquitin-like sequence."""
|
| 258 |
+
import json
|
| 259 |
+
|
| 260 |
+
items = [
|
| 261 |
+
{
|
| 262 |
+
"name": "monomer_demo",
|
| 263 |
+
"kind": "monomer",
|
| 264 |
+
"sequences": [
|
| 265 |
+
"MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
|
| 266 |
+
],
|
| 267 |
+
},
|
| 268 |
+
]
|
| 269 |
+
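# `predict` takes a single payload dict; this assumes the shared token is
# exported locally as MODAL_BOLTZ_TOKEN.
out = predict.remote({"token": os.environ.get("MODAL_BOLTZ_TOKEN", ""), "items": items})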
|
| 270 |
+
print(json.dumps(out, indent=2))
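
To make the input-file step concrete, the YAML rendering and file naming used by `predict` can be sketched standalone. The function names `render_boltz_yaml` and `yaml_stem` are hypothetical, and the YAML indentation shown is an assumption, since leading whitespace is not preserved in this view of the diff.

```python
def render_boltz_yaml(sequences: list[str]) -> str:
    """Render protein chains as a Boltz-style YAML document (indentation assumed)."""
    chain_ids = "ABCDE"
    lines = ["sequences:"]
    for i, seq in enumerate(sequences):
        # Fall back to a synthetic chain id beyond the first five chains.
        cid = chain_ids[i] if i < len(chain_ids) else f"X{i}"
        lines.append("  - protein:")
        lines.append(f"      id: {cid}")
        lines.append(f"      sequence: {seq}")
    return "\n".join(lines) + "\n"


def yaml_stem(index: int, name: str) -> str:
    """Build a filesystem-safe file stem with a zero-padded index prefix."""
    safe = "".join(c if c.isalnum() else "_" for c in name)[:60]
    return f"{index:04d}_{safe}"


print(yaml_stem(3, "task-001/complex"))
print(render_boltz_yaml(["MKK", "GSS"]), end="")
```

The zero-padded index keeps stems unique even when two item names sanitize to the same string, which is what allows prediction directories to be matched back to items by prefix.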
|
requirements.txt
CHANGED
|
@@ -10,12 +10,7 @@ anthropic>=0.75
|
|
| 10 |
openai>=1.40
|
| 11 |
google-genai>=0.3
|
| 12 |
|
| 13 |
-
# Phase B
|
| 14 |
-
#
|
| 15 |
-
#
|
| 16 |
-
#
|
| 17 |
-
# Space hardware to a GPU tier (zero-a10g recommended) — otherwise pip
|
| 18 |
-
# will pull ~2GB of CUDA wheels onto a CPU image and the build fails.
|
| 19 |
-
spaces>=0.30
|
| 20 |
-
# torch>=2.2 # ZeroGPU only — uncomment after hardware flip
|
| 21 |
-
# boltz>=0.4 # ZeroGPU only — uncomment after hardware flip
|
|
|
|
| 10 |
openai>=1.40
|
| 11 |
google-genai>=0.3
|
| 12 |
|
| 13 |
+
# Phase B uses a Modal-hosted Boltz sidecar (modal_boltz_app.py), so
|
| 14 |
+
# torch / boltz are NOT installed in the Space image; the Space only
|
| 15 |
+
# acts as an HTTP client of the Modal endpoint. See
|
| 16 |
+
# biodesignbench-leaderboard/README.md for deployment notes.
|