# Protein Structure Prediction with Diffusers

A [diffusers](https://github.com/huggingface/diffusers) `ModularPipeline` wrapper for [RosettaFold3](https://doi.org/10.1101/2025.08.14.670328) (RF3), a diffusion-based model that predicts 3D atomic coordinates from amino acid sequences.

RF3 relies on [Foundry](https://github.com/RosettaCommons/foundry) for its underlying implementation and [AtomWorks](https://github.com/RosettaCommons/atomworks) for structure I/O. This package adds only the thin wrappers needed for diffusers integration.

## Getting Started

### Installation

```bash
pip install "rc-foundry[all]"  # quoted so shells such as zsh do not expand the brackets
pip install diffusers
```

### Running with Diffusers

```python
import torch
from diffusers import ModularPipeline

pipe = ModularPipeline.from_pretrained("dn6/RosettaFold-3", trust_remote_code=True)
pipe.load_components(device_map="cuda", torch_dtype=torch.bfloat16, trust_remote_code=True)

state = pipe(sequence="MKVLSEGDPWRK...")
print(state.output.xyz.shape)  # [D, L, 3]
```
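The `sequence` argument above is a string of one-letter residue codes. As a minimal sketch of a pre-flight check, the helper below normalizes pasted input and rejects anything outside the 20 standard amino acids. `clean_sequence` is a hypothetical helper, not part of diffusers, Foundry, or this wrapper, and the pipeline itself may well accept a wider alphabet; the restriction to 20 letters is an assumption made for illustration.

```python
# Hypothetical pre-flight check; not part of diffusers, Foundry, or the RF3 wrapper.
# Assumption: only the 20 standard one-letter residue codes are wanted.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Normalize a one-letter protein sequence and reject unexpected characters."""
    # Drop whitespace (e.g. line breaks copied from a FASTA record) and upper-case.
    sequence = "".join(raw.split()).upper()
    if not sequence:
        raise ValueError("sequence is empty")
    invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"unexpected residue codes: {', '.join(invalid)}")
    return sequence


print(clean_sequence("mkvl segd\nPWRK"))  # MKVLSEGDPWRK
```

The returned string can then be passed as `pipe(sequence=...)`.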

## Workflows

| Workflow | Trigger inputs | What runs |
|----------|---------------|-----------|
| `fold` | `sequence` | Full structure prediction (recycling trunk + diffusion) |

### Fold a Sequence

```python
state = pipe(sequence="MKVLSEGDPWRK...", output_type="cif.gz", output_path="prediction")
print(state.output.atom_array)
```

### Full Design Pipeline

RF3 is typically used as a validation step after backbone design with [RFdiffusion3](https://huggingface.co/dn6/RFDiffusion-3):

```
RFD3 (design backbone) → MPNN (design sequence) → RF3 (validate fold)
```

```python
import torch
from diffusers import AutoModel, ModularPipeline

# 1. Design a backbone + sequence
design_pipe = ModularPipeline.from_pretrained("dn6/RFDiffusion-3", trust_remote_code=True)
design_pipe.load_components(device_map="cuda", torch_dtype=torch.bfloat16, trust_remote_code=True)

mpnn = AutoModel.from_pretrained("dn6/RFDiffusion-3", subfolder="mpnn", trust_remote_code=True)
design_pipe.update_components(mpnn=mpnn)

state = design_pipe(contigs="100", temperature=0.1)
designed_sequence = state.mpnn_output.designed_sequence

# 2. Validate the fold
fold_pipe = ModularPipeline.from_pretrained("dn6/RosettaFold-3", trust_remote_code=True)
fold_pipe.load_components(device_map="cuda", torch_dtype=torch.bfloat16, trust_remote_code=True)

state = fold_pipe(sequence=designed_sequence, output_type="cif.gz", output_path="prediction")
```

## Customizing Workflows

```python
# Inspect the pipeline structure
print(pipe.blocks)

# Add a custom block
from diffusers.modular_pipelines import ModularPipelineBlocks, PipelineState
from diffusers.modular_pipelines.modular_pipeline_utils import InputParam, OutputParam

class ComputeRadiusOfGyration(ModularPipelineBlocks):
    @property
    def inputs(self):
        return [InputParam("xyz", required=True)]

    @property
    def intermediate_outputs(self):
        return [OutputParam("radius_of_gyration")]

    def __call__(self, components, state):
        block_state = self.get_block_state(state)
        xyz = block_state.xyz
        centroid = xyz.mean(dim=-2, keepdim=True)
        block_state.radius_of_gyration = ((xyz - centroid) ** 2).sum(-1).mean().sqrt()
        self.set_block_state(state, block_state)
        return components, state

pipe._blocks.sub_blocks.insert("rog", ComputeRadiusOfGyration(), index=3)
```
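The arithmetic inside `ComputeRadiusOfGyration` is the unweighted radius of gyration: the root-mean-square distance of the atoms from their centroid. The plain-Python sketch below reproduces it for a single structure so that a value can be checked by hand; `radius_of_gyration` is an illustrative name, not part of the pipeline. Note that the block above calls `.mean()` with no `dim`, so if `xyz` carries a leading batch dimension (as the `[D, L, 3]` shape comment earlier suggests), it reports one value pooled over all samples rather than one per sample.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of one structure given as [(x, y, z), ...]."""
    n = len(coords)
    # Centroid: per-axis mean over all atoms.
    centroid = [sum(atom[axis] for atom in coords) / n for axis in range(3)]
    # Mean squared distance of each atom from the centroid.
    mean_sq = sum(
        sum((atom[axis] - centroid[axis]) ** 2 for axis in range(3))
        for atom in coords
    ) / n
    return math.sqrt(mean_sq)


# Two atoms 2 units apart: each sits 1 unit from the centroid.
print(radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))  # 1.0
```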

## Output Types

| `output_type` | Additional output | Writes to disk |
|---|---|---|
| `"tensor"` | — | — |
| `"pdb"` | `pdb_string` | `.pdb` file |
| `"cif"` | `atom_array`, `atom_array_stack`, `trajectory_stack` | `.cif` via AtomWorks |
| `"cif.gz"` | Same as `"cif"` | `.cif.gz` compressed |

```python
# CIF output with AtomArray
state = pipe(sequence="MKVLSEG...", output_type="cif.gz", output_path="fold_0")
atom_array = state.output.atom_array

# Denoising trajectory
trajectory = state.output.trajectory_stack

# PDB output
state = pipe(sequence="MKVLSEG...", output_type="pdb", output_path="fold_0.pdb")
```
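A `.cif.gz` result is ordinary gzip-compressed mmCIF text, so it can be sanity-checked with the standard library alone. The sketch below counts coordinate rows, assuming the writer emits `group_PDB` (`ATOM` or `HETATM`) as the first column of each `_atom_site` row, which is conventional but not confirmed here for AtomWorks output. `count_atom_site_rows` is a hypothetical helper.

```python
import gzip


def count_atom_site_rows(path):
    """Count ATOM/HETATM rows in a gzip-compressed mmCIF file."""
    count = 0
    # "rt" decompresses on the fly and yields text lines.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            # Assumption: group_PDB is the first column of each atom_site row.
            if line.startswith(("ATOM", "HETATM")):
                count += 1
    return count
```

Call it with the path of the file the pipeline actually wrote. Whether `output_path="fold_0"` yields `fold_0.cif.gz` or another name is not stated here, so check the directory first.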

## Model Architecture

RF3 is a diffusion model with the same EDM noise schedule as RFdiffusion3 (200 steps), but conditioned on sequence/MSA/template representations from a large recycling trunk:

| Component | Subfolder | Description |
|-----------|-----------|-------------|
| `transformer` | `transformer/` | `RF3TransformerModel` (366M params) — FeatureInitializer + Recycler (48 pairformer blocks) + DiffusionModule (24 transformer blocks) + DistogramHead |
| `scheduler` | `scheduler/` | `RF3Scheduler` (EDM schedule, gamma_0=0.8) |
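For readers unfamiliar with EDM schedules, the sketch below computes the noise levels from Karras et al. (2022) for a 200-step run. Only the step count comes from the text above; `sigma_min`, `sigma_max`, and `rho` are the EDM paper defaults and are assumptions here, since the values `RF3Scheduler` actually uses are not stated. The `gamma_0` churn term is not modeled.

```python
def edm_sigmas(num_steps=200, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Karras et al. (2022) EDM noise levels, from sigma_max down to sigma_min.

    Defaults other than num_steps are the EDM paper's, not RF3's confirmed values.
    """
    inv_rho_max = sigma_max ** (1.0 / rho)
    inv_rho_min = sigma_min ** (1.0 / rho)
    # Interpolate linearly in sigma^(1/rho) space, then raise back to the rho-th power.
    return [
        (inv_rho_max + i / (num_steps - 1) * (inv_rho_min - inv_rho_max)) ** rho
        for i in range(num_steps)
    ]


sigmas = edm_sigmas()
print(len(sigmas), sigmas[0], sigmas[-1])
```

Interpolating in `sigma ** (1 / rho)` space concentrates steps at low noise levels, where fine structural detail is resolved.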

## Citation

```bibtex
@article{corley2025accelerating,
    author = {Corley, Nathaniel and Mathis, Simon and Krishna, Rohith and Bauer, Magnus S and Thompson, Tuscan R and Ahern, Woody and Kazman, Maxwell W and Brent, Rafael I and Didi, Kieran and Kubaney, Andrew and others},
    title = {Accelerating biomolecular modeling with AtomWorks and RF3},
    journal = {bioRxiv},
    year = {2025},
    doi = {10.1101/2025.08.14.670328},
}
```