---
library_name: transformers
tags: []
---


# NOTE
The GitHub repository with the implementation and requirements can be found [here](https://github.com/Synthyra/FastPLMs.git).

# Boltz2 AutoModel (Inference-only)
This is a barebones Hugging Face `AutoModel`-compatible implementation of Boltz2, focused on fast inference workflows.

The implementation is located in `fastplms/boltz/` and exposes:
- `Boltz2Config`
- `Boltz2Model`
- `predict_structure(amino_acid_sequence, ...)`
- `save_as_cif(structure_output, output_path, ...)`

## Design goals
- Inference-only (no training hooks, no Lightning trainer usage).
- Lightweight runtime around `torch` + `transformers` (plus `numpy`).
- AutoModel remote-code compatibility via `trust_remote_code=True`.
- Confidence outputs included in prediction outputs (`plddt`, `ptm`, `iptm`, and derived confidence score when available).

## Runtime note
This implementation is self-contained inside `fastplms/boltz/` and does not require
the original cloned `boltz` package at runtime.

## Use with transformers

### Load from an exported directory
```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "Synthyra/Boltz2",
    trust_remote_code=True,
    dtype=torch.float32,
).eval()
```

### Predict structure from sequence
```python
out = model.predict_structure(
    amino_acid_sequence="MSTNPKPQRKTKRNTNRRPQDVKFPGG",
    recycling_steps=3,
    num_sampling_steps=200,
    diffusion_samples=1,
)

print(out.sample_atom_coords.shape)
print(None if out.plddt is None else out.plddt.shape)
```

### Save CIF
```python
model.save_as_cif(out, "prediction.cif")
```

## Convert Boltz checkpoint to HF export
Use:

```bash
py -m fastplms.boltz.get_weights --checkpoint_path fastplms/boltz/weights/boltz2_conf.ckpt --output_dir boltz2_automodel_export
```

The export directory contains:
- `config.json`
- `pytorch_model.bin`
- `modeling_boltz2.py`
- `minimal_featurizer.py`
- `minimal_structures.py`
- `cif_writer.py`
- `vb_*.py` (self-contained vendored Boltz2 inference modules/constants)
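
A quick sanity check after conversion is to confirm that the export directory holds the files listed above. The sketch below is illustrative only: `missing_export_files` is a hypothetical helper, not part of FastPLMs, and it assumes that at least one `vb_*.py` module should be present without assuming which ones.

```python
from pathlib import Path

# File names taken from the list above; vb_*.py covers the vendored modules.
REQUIRED_FILES = [
    "config.json",
    "pytorch_model.bin",
    "modeling_boltz2.py",
    "minimal_featurizer.py",
    "minimal_structures.py",
    "cif_writer.py",
]
VENDORED_PATTERN = "vb_*.py"


def missing_export_files(export_dir):
    """Return the expected entries absent from export_dir (hypothetical helper)."""
    root = Path(export_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # Assumption: at least one vendored module exists; the exact set is not checked.
    if not any(root.glob(VENDORED_PATTERN)):
        missing.append(VENDORED_PATTERN)
    return missing
```

Calling `missing_export_files("boltz2_automodel_export")` returns an empty list when every expected entry is found, and the names of the absent entries otherwise.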

## Output object fields
`predict_structure(...)` returns `Boltz2StructureOutput` with:
- `sample_atom_coords`
- `atom_pad_mask`
- `plddt`
- `complex_plddt`
- `ptm`
- `iptm`
- `confidence_score` (derived when available)
- `raw_output`
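
Because some confidence fields may be `None` (the usage example above guards `plddt` for that reason), downstream code may want to check which ones are populated before using them. The following is a minimal sketch: `available_confidence_fields` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in with made-up values, not a real `Boltz2StructureOutput`.

```python
from types import SimpleNamespace

# Field names taken from the list above.
CONFIDENCE_FIELDS = ("plddt", "complex_plddt", "ptm", "iptm", "confidence_score")


def available_confidence_fields(output):
    """List the confidence fields on output that are not None (hypothetical helper)."""
    return [name for name in CONFIDENCE_FIELDS if getattr(output, name, None) is not None]


# Stand-in for illustration only; a real run would pass the result of
# model.predict_structure(...). The values here are invented.
fake_output = SimpleNamespace(
    plddt=[0.91, 0.87],
    complex_plddt=0.89,
    ptm=0.75,
    iptm=None,
    confidence_score=None,
)
print(available_confidence_fields(fake_output))
```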

## Limitations
- Current featurization path is protein-only and minimal.
- This implementation is meant for practical inference and export workflows, not full Boltz training parity.
- Test-time training is not supported for Boltz2 in FastPLMs. TTT is currently limited to sequence PLMs plus ESMFold and ESMFold2 PLM backbones.
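
Since the featurization path is protein-only, it can help to validate input sequences before calling `predict_structure`. The sketch below assumes that only the 20 standard one-letter amino acid codes are accepted. That is an assumption; how the bundled featurizer treats ambiguous or non-standard codes such as `X` or `U` is not stated here. `clean_protein_sequence` is a hypothetical helper.

```python
# Assumption: only the 20 standard one-letter codes are accepted.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def clean_protein_sequence(sequence):
    """Strip whitespace, upper-case, and reject non-standard residue codes."""
    cleaned = "".join(sequence.split()).upper()
    if not cleaned:
        raise ValueError("sequence is empty")
    unexpected = sorted(set(cleaned) - STANDARD_AMINO_ACIDS)
    if unexpected:
        raise ValueError(f"unsupported residue codes: {', '.join(unexpected)}")
    return cleaned
```

The cleaned string can then be passed as `amino_acid_sequence`.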

## Docker-first compliance testing

Build the container at repo root:

```bash
docker build -t fastplms-test -f Dockerfile .
```

Launch a test shell:

```bash
docker run --rm --gpus all -it -v ${PWD}:/workspace fastplms-test bash
```

Inside the container, run the Boltz2 compliance test against the pip-installed `boltz` package:

```bash
python -m testing.run_boltz2_compliance --device cuda --dtype float32 --seed 42 --num-sequences 3 --recycling-steps 3 --num-sampling-steps 200 --diffusion-samples 1 --pass-coord-metric aligned --write-cif-artifacts
```

Artifacts are written to `testing/results/<timestamp>/boltz2_compliance/` by default:
- `metrics.json`
- `metrics.csv`
- `summary.txt`
- `structures/seq_<idx>/ours_seq<idx>.cif`
- `structures/seq_<idx>/ref_seq<idx>.cif`

Coordinate metrics include both raw and rigid-aligned variants:
- `coord_mae`, `coord_rmse`, `coord_max_abs` (raw, frame-dependent deltas)
- `coord_mae_aligned`, `coord_rmse_aligned`, `coord_max_abs_aligned` (after Kabsch rigid alignment)
- `pairwise_dist_mae` (frame-invariant pairwise-distance delta)
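
To see why a frame-invariant metric is useful, consider two copies of the same structure that differ only by a rotation and a translation. The pure-Python sketch below compares a raw coordinate RMSE with a pairwise-distance MAE on a toy example. The function names and exact normalizations are assumptions for illustration; the compliance script may define its metrics differently.

```python
import math
from itertools import combinations


def raw_rmse(a, b):
    """Per-atom root-mean-square deviation in the raw frame (no alignment)."""
    squared = [sum((x - y) ** 2 for x, y in zip(p, q)) for p, q in zip(a, b)]
    return math.sqrt(sum(squared) / len(squared))


def pairwise_distance_mae(a, b):
    """Mean absolute difference between matching intra-structure atom-pair distances."""
    diffs = [
        abs(math.dist(a[i], a[j]) - math.dist(b[i], b[j]))
        for i, j in combinations(range(len(a)), 2)
    ]
    return sum(diffs) / len(diffs)


# Toy 3-atom structure, and a copy rotated 90 degrees about z, then shifted.
ours = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
ref = [(-y + 10.0, x + 5.0, z) for x, y, z in ours]

print(raw_rmse(ours, ref))               # large, because the frames differ
print(pairwise_distance_mae(ours, ref))  # near zero, internal geometry is identical
```

The raw metric penalizes the change of frame even though the two structures are identical, which is why the aligned and pairwise-distance variants are reported alongside it.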

Pass/fail uses `--pass-coord-metric aligned` by default. Set `--pass-coord-metric raw` to use the raw coordinate thresholds.

## Citations

```bibtex
@misc{FastPLMs,
  author={Hallee, Logan and Bichara, David and Gleghorn, Jason P.},
  title={FastPLMs: Fast, efficient, protein language model inference from Huggingface AutoModel.},
  year={2024},
  url={https://huggingface.co/Synthyra/ESMplusplus_small},
  DOI={10.57967/hf/3726},
  publisher={Hugging Face}
}
```

```bibtex
@article{passaro2025boltz2,
  title={Boltz-2: Exploring the Frontiers of Biomolecular Prediction},
  author={Passaro, Saro and Corso, Gabriele and Wohlwend, Jeremy and Reveiz, Mateo and Bordes, Florian and Wicky, Basile and Dayan, Peter and Jing, Bowen},
  journal={bioRxiv},
  year={2025}
}
```

```bibtex
@article{wohlwend2024boltz1,
  title={Boltz-1: Democratizing Biomolecular Interaction Modeling},
  author={Wohlwend, Jeremy and Corso, Gabriele and Passaro, Saro and Reveiz, Mateo and Leidal, Ken and Swanson, Wojtek and Kher, Gilmer and Lember, Tommi and Jaakkola, Tommi},
  journal={bioRxiv},
  year={2024}
}
```

```bibtex
@inproceedings{paszke2019pytorch,
  title={PyTorch: An Imperative Style, High-Performance Deep Learning Library},
  author={Paszke, Adam and Gross, Sam and Massa, Francisco and Lerer, Adam and Bradbury, James and Chanan, Gregory and Killeen, Trevor and Lin, Zeming and Gimelshein, Natalia and Antiga, Luca and Desmaison, Alban and K{\"o}pf, Andreas and Yang, Edward and DeVito, Zach and Raison, Martin and Tejani, Alykhan and Chilamkurthy, Sasank and Steiner, Benoit and Fang, Lu and Bai, Junjie and Chintala, Soumith},
  booktitle={Advances in Neural Information Processing Systems 32},
  year={2019}
}
```