---
license: mit
library_name: pytorch
pipeline_tag: feature-extraction
tags:
  - protein
  - structural-biology
  - representation-learning
  - 3d-cnn
  - foldvision
---

# FoldVision Encoder

## Model Summary

FoldVision is a protein 3D-CNN encoder that maps a voxelized protein structure to a fixed-size embedding (`1024` dimensions).

Primary task:
- **Protein feature extraction** from 3D structure.

Typical downstream tasks (with finetuning heads):
- Protein-only regression/classification.
- **Protein-small molecule interaction (PSI)** prediction, when the encoder is combined with a SMILES encoder.

GitHub code: [foldvision_github](https://github.com/AlexanderKroll/foldvision)

## Model Details

- Model name: `AlexanderKroll/foldvision-encoder`
- Architecture: 3D CNN encoder with GroupNorm blocks and global pooling.
- Framework: PyTorch
- Input channels: 5 atom-type channels (`C`, `N`, `S`, `O`, `P`)
- Output: `(B, 1024)` embedding
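
To make the architecture description concrete, the following is a minimal sketch of a 3D CNN with GroupNorm blocks and global pooling. It is not the released FoldVision architecture: the class names (`ConvGNBlock`, `ToyVoxelEncoder`), layer widths, strides, and group count are all assumptions chosen for illustration. Only the 5 input channels and the `(B, 1024)` output follow the details listed above.

```python
import torch
from torch import nn


class ConvGNBlock(nn.Module):
    """Conv3d -> GroupNorm -> ReLU with stride-2 downsampling (illustrative)."""

    def __init__(self, in_ch: int, out_ch: int, groups: int = 8):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, stride=2, padding=1)
        self.norm = nn.GroupNorm(groups, out_ch)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.norm(self.conv(x)))


class ToyVoxelEncoder(nn.Module):
    """Hypothetical stand-in, NOT the released FoldVision encoder."""

    def __init__(self, in_channels: int = 5, embed_dim: int = 1024):
        super().__init__()
        self.blocks = nn.Sequential(
            ConvGNBlock(in_channels, 32),
            ConvGNBlock(32, 64),
            ConvGNBlock(64, 128),
        )
        self.pool = nn.AdaptiveAvgPool3d(1)  # global pooling over Z, Y, X
        self.proj = nn.Linear(128, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.pool(self.blocks(x)).flatten(1)  # (B, 128)
        return self.proj(h)  # (B, embed_dim)


toy = ToyVoxelEncoder().eval()
with torch.no_grad():
    z = toy(torch.zeros(2, 5, 16, 24, 32))  # arbitrary grid size
print(z.shape)
```

Because the pooling layer collapses all spatial dimensions, a network of this shape returns the same embedding size regardless of the `Z`, `Y`, `X` extent of the input grid.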


## Input and Preprocessing

This model expects FoldVision voxel tensors generated from PDB structures.

Recommended preprocessing pipeline:
1. Convert `.pdb` files to sparse point lists (`numpy_3D_point_lists/*.npz`).
2. Use `bounding_boxes.npy` together with the dataloader to construct dense voxel tensors at runtime.
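
The step from a sparse point list to a dense tensor can be sketched as below. This is an illustration under stated assumptions, not the repository's dataloader: the function `voxelize`, the `voxel_size` parameter, the binary occupancy encoding, and the in-memory input format are hypothetical, and the layout of the `.npz` files and `bounding_boxes.npy` is not reproduced here. Only the channel order (`C`, `N`, `S`, `O`, `P`) and the `(5, Z, Y, X)` layout come from this card.

```python
import numpy as np

CHANNELS = ("C", "N", "S", "O", "P")  # channel order from the model card


def voxelize(coords, elements, grid_shape, voxel_size=1.0):
    """Scatter atoms into a dense (5, Z, Y, X) occupancy grid.

    coords: (N, 3) x, y, z positions; elements: N element symbols.
    voxel_size and the binary occupancy encoding are assumptions.
    """
    grid = np.zeros((len(CHANNELS), *grid_shape), dtype=np.float32)
    coords = np.asarray(coords, dtype=np.float64)
    # Shift so the minimum corner of the structure sits at the grid origin.
    idx = np.floor((coords - coords.min(axis=0)) / voxel_size).astype(int)
    for (x, y, z), element in zip(idx, elements):
        if element not in CHANNELS:
            continue  # skip atom types without a channel (e.g. H)
        if z < grid_shape[0] and y < grid_shape[1] and x < grid_shape[2]:
            grid[CHANNELS.index(element), z, y, x] = 1.0
    return grid


coords = [[0.0, 0.0, 0.0], [1.5, 0.2, 0.0], [2.7, 3.1, 1.0]]
elements = ["N", "C", "O"]
dense = voxelize(coords, elements, grid_shape=(4, 4, 4))
print(dense.shape)
```

Storing only the point lists on disk and building the dense grid at load time keeps the preprocessed dataset small, since most voxels in a protein grid are empty.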

Repository scripts:
- `scripts/preprocess_pdb_dir.py`
- `scripts/embed_proteins.py`
- `scripts/train.py`
- `scripts/train_PSI.py`
- `scripts/evaluate.py`
- `scripts/evaluate_PSI.py`

## Usage

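```python
import torch
from foldvision import FoldVisionEncoder

# Load the pretrained encoder and switch to inference mode.
model = FoldVisionEncoder.from_pretrained("AlexanderKroll/foldvision-encoder")
model.eval()

# x: voxel tensor of shape (B, 5, Z, Y, X), e.g. from the FoldVision dataloader
# with torch.no_grad():
#     z = model(x)  # (B, 1024)
```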

## Multi-Run Embeddings and Predictions

FoldVision pipelines support repeated runs with random 3D rotations (test-time augmentation).

- Embeddings:
  - per-run: keep each run-specific embedding,
  - aggregated: use mean embedding for a stable representation.
- Predictions:
  - per-run predictions can be used to inspect spread/uncertainty,
  - averaged predictions are recommended for reporting.
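
The per-run versus aggregated distinction above can be sketched as follows. This is a simplified illustration, not the pipeline's implementation: `random_rotation`, `multi_run`, and `toy_embed` are hypothetical names, and `toy_embed` merely stands in for voxelization plus the encoder forward pass. Rotating about the centroid before embedding is also an assumption.

```python
import numpy as np


def random_rotation(rng):
    """Draw a random proper 3D rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # remove the sign ambiguity of QR
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis so det = +1 (no reflection)
    return q


def multi_run(coords, embed_fn, n_runs=8, seed=0):
    """Embed n_runs randomly rotated copies; return per-run and mean embeddings."""
    rng = np.random.default_rng(seed)
    center = coords.mean(axis=0)
    runs = []
    for _ in range(n_runs):
        rotated = (coords - center) @ random_rotation(rng).T
        runs.append(embed_fn(rotated))
    per_run = np.stack(runs)              # (n_runs, D)
    return per_run, per_run.mean(axis=0)  # aggregated: (D,)


def toy_embed(coords):
    """Hypothetical stand-in for voxelization plus the encoder forward pass."""
    return np.concatenate([coords.mean(axis=0), coords.std(axis=0)])


coords = np.random.default_rng(1).normal(size=(50, 3))
per_run, mean_embedding = multi_run(coords, toy_embed)
spread = per_run.std(axis=0)  # per-dimension spread across runs
print(per_run.shape, mean_embedding.shape)
```

The standard deviation across runs gives a rough measure of how sensitive the output is to orientation, while the mean is the value to report.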


## Citation

If you use this model, cite:

1. **FoldVision bioRxiv manuscript**:

```bibtex
@article{foldvision_biorxiv,
  title   = {FoldVision: A compute-efficient atom-level 3D protein encoder},
  author  = {Kroll, Alexander and Yadav, Shantanu and Lercher, Martin J.},
  journal = {bioRxiv},
  year    = {2026},
  doi     = {10.64898/2026.01.23.701326},
  url     = {https://doi.org/10.64898/2026.01.23.701326}
}
```

2. **FoldVision GitHub repository**:

```bibtex
@misc{foldvision_github,
  title        = {FoldVision code repository},
  author       = {Kroll, Alexander},
  year         = {2026},
  howpublished = {\url{https://github.com/AlexanderKroll/foldvision}}
}
```

## Model Card Contact

For issues or questions, use the GitHub issue tracker in the FoldVision repository.