---
license: mit
library_name: pytorch
pipeline_tag: feature-extraction
tags:
- protein
- structural-biology
- representation-learning
- 3d-cnn
- foldvision
---
# FoldVision Encoder
## Model Summary
FoldVision is a protein 3D-CNN encoder that maps a voxelized protein structure to a fixed-size embedding (`1024` dimensions).
Primary task:
- **Protein feature extraction** from 3D structure.
Typical downstream tasks (with finetuning heads):
- Protein-only regression/classification.
- PSI (**protein-small molecule interactions**) prediction when combined with a SMILES encoder.
GitHub code: [foldvision_github](https://github.com/AlexanderKroll/foldvision)
## Model Details
- Model name: `AlexanderKroll/foldvision-encoder`
- Architecture: 3D CNN encoder with GroupNorm blocks and global pooling.
- Framework: PyTorch
- Input channels: 5 atom-type channels (`C`, `N`, `S`, `O`, `P`)
- Output: `(B, 1024)` embedding
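As a rough illustration of how a 3D CNN with GroupNorm blocks and global pooling can turn a variable-size voxel grid into a fixed `(B, 1024)` embedding, here is a minimal sketch. It is not the released FoldVision architecture: the class name `TinyVoxelEncoder`, the number of blocks, the channel widths, the group count, and the final linear projection are all assumptions made for this example.

```python
import torch
from torch import nn


class TinyVoxelEncoder(nn.Module):
    """Illustrative stand-in, NOT the released FoldVision architecture."""

    def __init__(self, in_channels: int = 5, embed_dim: int = 1024):
        super().__init__()
        widths = [32, 64, 128]  # assumed channel widths, chosen for the example
        blocks = []
        prev = in_channels
        for width in widths:
            blocks += [
                nn.Conv3d(prev, width, kernel_size=3, stride=2, padding=1),
                nn.GroupNorm(num_groups=8, num_channels=width),
                nn.ReLU(inplace=True),
            ]
            prev = width
        self.features = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveAvgPool3d(1)  # global average pooling over Z, Y, X
        self.proj = nn.Linear(prev, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x)         # (B, C, Z', Y', X')
        h = self.pool(h).flatten(1)  # (B, C), independent of the grid size
        return self.proj(h)          # (B, embed_dim)


encoder = TinyVoxelEncoder().eval()
with torch.no_grad():
    small = encoder(torch.zeros(2, 5, 16, 16, 16))
    large = encoder(torch.zeros(2, 5, 24, 20, 32))
print(small.shape, large.shape)
```

Because the pooling layer collapses the spatial axes, both grids yield an embedding of the same size, which is what lets a single encoder handle proteins with different bounding boxes.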
## Input and Preprocessing
This model expects FoldVision voxel tensors generated from PDB structures.
Recommended preprocessing pipeline:
1. Convert `.pdb` files to sparse point lists (`numpy_3D_point_lists/*.npz`).
2. Use `bounding_boxes.npy` + dataloader to construct dense tensors at runtime.
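The second step, turning a sparse point list into a dense tensor, can be sketched as follows. This model card does not document the `.npz` layout or the `bounding_boxes.npy` format, so the sketch assumes a hypothetical layout: an `(N, 3)` coordinate array, an `(N,)` array of channel indices, a box given by its minimum corner plus a grid shape, and a voxel size of 1.0. The function `voxelize` is illustrative and not part of the FoldVision package; use `scripts/preprocess_pdb_dir.py` and the repository dataloader for real data.

```python
import numpy as np

ATOM_CHANNELS = ("C", "N", "S", "O", "P")  # channel order from the model card


def voxelize(coords, channels, origin, grid_shape, voxel_size=1.0):
    """Scatter atoms into a dense (5, Z, Y, X) count grid.

    coords: (N, 3) float array of x, y, z positions (assumed layout).
    channels: (N,) int array indexing ATOM_CHANNELS (assumed layout).
    origin: (3,) minimum x, y, z corner of the bounding box.
    grid_shape: (Z, Y, X) number of voxels per axis.
    """
    grid = np.zeros((len(ATOM_CHANNELS), *grid_shape), dtype=np.float32)
    idx_xyz = np.floor((coords - origin) / voxel_size).astype(int)
    idx_zyx = idx_xyz[:, ::-1]  # reorder to (z, y, x) to match the tensor axes
    # Drop atoms that fall outside the bounding box.
    inside = np.all((idx_zyx >= 0) & (idx_zyx < np.array(grid_shape)), axis=1)
    z, y, x = idx_zyx[inside].T
    np.add.at(grid, (channels[inside], z, y, x), 1.0)  # count atoms per voxel
    return grid


coords = np.array([[0.2, 0.3, 0.4], [1.5, 0.1, 2.9], [9.0, 9.0, 9.0]])
channels = np.array([0, 3, 1])  # C, O, N
grid = voxelize(coords, channels, origin=np.zeros(3), grid_shape=(4, 4, 4))
batch = grid[None]  # add a batch axis: (1, 5, 4, 4, 4)
```

Storing only the point lists on disk and building the dense grid at load time keeps the dataset small, since most voxels in a protein bounding box are empty.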
Repository scripts:
- `scripts/preprocess_pdb_dir.py`
- `scripts/embed_proteins.py`
- `scripts/train.py`
- `scripts/train_PSI.py`
- `scripts/evaluate.py`
- `scripts/evaluate_PSI.py`
## Usage
```python
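import torch
from foldvision import FoldVisionEncoder

model = FoldVisionEncoder.from_pretrained("AlexanderKroll/foldvision-encoder")
model.eval()  # inference mode: disable training-time behavior

# x: float tensor of shape (B, 5, Z, Y, X), one channel per atom type
# with torch.no_grad():
#     z = model(x)  # (B, 1024)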
```
## Multi-Run Embeddings and Predictions
FoldVision pipelines support repeated runs with random 3D rotations (test-time augmentation).
- Embeddings:
  - per-run: keep each run-specific embedding,
  - aggregated: use mean embedding for a stable representation.
- Predictions:
  - per-run predictions can be used to inspect spread/uncertainty,
  - averaged predictions are recommended for reporting.
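The multi-run scheme above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code used by the FoldVision scripts: `embed_fn` is a hypothetical callable standing in for "voxelize the rotated structure and run the encoder", `toy_embed` is a placeholder so the example runs without the model, and rotations are drawn via a QR decomposition, which may differ from how the repository samples them.

```python
import numpy as np


def random_rotation(rng):
    """Draw a random 3D rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # fix column signs so the draw is uniform
    if np.linalg.det(q) < 0:     # flip one axis to turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    return q


def multi_run_embeddings(coords, embed_fn, n_runs=8, seed=0):
    """Embed n_runs randomly rotated copies; return per-run, mean, and std."""
    rng = np.random.default_rng(seed)
    center = coords.mean(axis=0)
    per_run = np.stack([
        # Rotate about the centroid so the structure stays in place.
        embed_fn((coords - center) @ random_rotation(rng).T + center)
        for _ in range(n_runs)
    ])
    return per_run, per_run.mean(axis=0), per_run.std(axis=0)


def toy_embed(points):
    # Hypothetical stand-in for "voxelize, then run the encoder".
    return np.concatenate([points.mean(axis=0), points.std(axis=0)])


coords = np.random.default_rng(1).normal(size=(50, 3))
per_run, mean_emb, spread = multi_run_embeddings(coords, toy_embed)
```

Here `per_run` holds one embedding per rotation, `mean_emb` is the aggregated representation, and `spread` gives a per-dimension standard deviation. The same pattern applies to predictions: average the per-run outputs for reporting and inspect their standard deviation as a rough indication of sensitivity to orientation.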
## Citation
If you use this model, cite:
1. **FoldVision bioRxiv manuscript**:
```bibtex
@article{foldvision_biorxiv,
title = {FoldVision: A compute-efficient atom-level 3D protein encoder},
author = {Kroll, Alexander and Yadav, Shantanu and Lercher, Martin J.},
journal = {bioRxiv},
year = {2026},
doi = {10.64898/2026.01.23.701326},
url = {https://doi.org/10.64898/2026.01.23.701326}
}
```
2. The GitHub repository:
```bibtex
@misc{foldvision_github,
title = {FoldVision code repository},
author = {Kroll, Alexander},
year = {2026},
howpublished = {\url{https://github.com/AlexanderKroll/foldvision}}
}
```
## Model Card Contact
For issues or questions, use the GitHub issue tracker in the FoldVision repository.