---
license: mit
library_name: pytorch
pipeline_tag: feature-extraction
tags:
- protein
- structural-biology
- representation-learning
- 3d-cnn
- foldvision
---

# FoldVision Encoder

## Model Summary

FoldVision is a protein 3D-CNN encoder that maps a voxelized protein structure to a fixed-size embedding (`1024` dimensions).

Primary task:
- **Protein feature extraction** from 3D structure.

Typical downstream tasks (with finetuning heads):
- Protein-only regression/classification.
- PSI (**protein-small molecule interactions**) prediction when combined with a SMILES encoder.

GitHub code: [foldvision_github](https://github.com/AlexanderKroll/foldvision)

## Model Details

- Model name: `AlexanderKroll/foldvision-encoder`
- Architecture: 3D CNN encoder with GroupNorm blocks and global pooling.
- Framework: PyTorch
- Input channels: 5 atom-type channels (`C`, `N`, `S`, `O`, `P`)
- Output: `(B, 1024)` embedding

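To make the architecture description above concrete, here is a minimal toy sketch of a 3D CNN with GroupNorm blocks and global pooling. It is not the released FoldVision network: the class name `TinyVoxelEncoder`, the layer widths, kernel sizes, strides, group count, and the final linear projection are all assumptions chosen for illustration. Only the 5 input channels and the `(B, 1024)` output shape come from this card.

```python
import torch
from torch import nn


class TinyVoxelEncoder(nn.Module):
    """Toy stand-in (hypothetical): Conv3d + GroupNorm blocks, global pooling, projection."""

    def __init__(self, in_channels=5, widths=(16, 32), embed_dim=1024, groups=8):
        super().__init__()
        layers = []
        prev = in_channels
        for width in widths:
            layers += [
                nn.Conv3d(prev, width, kernel_size=3, stride=2, padding=1),
                nn.GroupNorm(groups, width),  # normalizes over channel groups, not the batch
                nn.ReLU(inplace=True),
            ]
            prev = width
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool3d(1)  # global average pooling over (Z, Y, X)
        self.proj = nn.Linear(prev, embed_dim)

    def forward(self, x):
        h = self.pool(self.features(x)).flatten(1)  # (B, C)
        return self.proj(h)  # (B, embed_dim)


encoder = TinyVoxelEncoder().eval()
with torch.no_grad():
    z = encoder(torch.zeros(2, 5, 16, 24, 20))  # grid sizes are arbitrary here
```

Because the pooling layer collapses whatever spatial extent remains, a network of this shape accepts grids of different sizes and still returns a fixed-size embedding.
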
## Input and Preprocessing

This model expects FoldVision voxel tensors generated from PDB structures.

Recommended preprocessing pipeline:
1. Convert `.pdb` files to sparse point lists (`numpy_3D_point_lists/*.npz`).
2. Use `bounding_boxes.npy` + dataloader to construct dense tensors at runtime.

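The second step above, turning a sparse point list into a dense tensor, can be sketched as follows. This card does not describe the on-disk layout of the `.npz` files or of `bounding_boxes.npy`, so everything here is an assumption for illustration: the function name `voxelize`, its parameters, the `(x, y, z)` coordinate order, the channel order, and the choice to store atom counts per voxel. The repository's dataloader is the authoritative implementation.

```python
import numpy as np

# Assumed channel order, taken from the order listed in this card.
ATOM_CHANNELS = ("C", "N", "S", "O", "P")


def voxelize(coords, channels, box_min, grid_shape, voxel_size=1.0):
    """Scatter atoms into a dense (5, Z, Y, X) grid of per-voxel atom counts.

    coords:     (N, 3) float array of (x, y, z) positions (assumed layout)
    channels:   (N,) int array of indices into ATOM_CHANNELS
    box_min:    (3,) lower corner of the bounding box, in (x, y, z) order
    grid_shape: (Z, Y, X) number of voxels along each axis
    """
    grid = np.zeros((len(ATOM_CHANNELS), *grid_shape), dtype=np.float32)
    idx = np.floor((coords - box_min) / voxel_size).astype(np.int64)
    x, y, z = idx[:, 0], idx[:, 1], idx[:, 2]
    nz, ny, nx = grid_shape
    # Drop atoms that fall outside the bounding box.
    inside = (x >= 0) & (x < nx) & (y >= 0) & (y < ny) & (z >= 0) & (z < nz)
    # np.add.at accumulates correctly when several atoms share one voxel.
    np.add.at(grid, (channels[inside], z[inside], y[inside], x[inside]), 1.0)
    return grid


coords = np.array([[0.2, 0.1, 0.0], [1.5, 0.5, 2.9], [1.6, 0.4, 2.8], [9.0, 9.0, 9.0]])
channels = np.array([0, 1, 1, 3])  # C, N, N, O
grid = voxelize(coords, channels, box_min=np.zeros(3), grid_shape=(4, 4, 4))
```

Keeping the point lists sparse on disk and densifying only at load time avoids storing mostly empty grids.
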
Repository scripts:
- `scripts/preprocess_pdb_dir.py`
- `scripts/embed_proteins.py`
- `scripts/train.py`
- `scripts/train_PSI.py`
- `scripts/evaluate.py`
- `scripts/evaluate_PSI.py`

## Usage

```python
import torch
from foldvision import FoldVisionEncoder

model = FoldVisionEncoder.from_pretrained("AlexanderKroll/foldvision-encoder")
model.eval()  # evaluation mode for inference

# x: voxel tensor of shape (B, 5, Z, Y, X), one channel per atom type
# with torch.no_grad():
#     z = model(x)  # (B, 1024)
```

## Multi-Run Embeddings and Predictions

FoldVision pipelines support repeated runs with random 3D rotations (test-time augmentation).

- Embeddings:
  - per-run: keep each run-specific embedding,
  - aggregated: use the mean embedding across runs for a stable representation.
- Predictions:
  - per-run predictions can be used to inspect spread/uncertainty,
  - averaged predictions are recommended for reporting.

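The multi-run idea above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the pipeline's implementation: the helper names `random_rotation_matrix`, `embed_with_rotations`, and `toy_embed` are hypothetical, the rotation is applied to atom coordinates before voxelization (the repository may rotate at a different stage), and `toy_embed` stands in for the real "voxelize and encode" step.

```python
import numpy as np


def random_rotation_matrix(rng):
    """Draw a random proper 3D rotation via QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # sign fix commonly used to make the draw uniform
    if np.linalg.det(q) < 0:     # a reflection has determinant -1; flip one axis
        q[:, 0] = -q[:, 0]
    return q


def embed_with_rotations(coords, embed_fn, n_runs=8, seed=0):
    """Return per-run embeddings of shape (n_runs, D) and their mean of shape (D,)."""
    rng = np.random.default_rng(seed)
    center = coords.mean(axis=0)
    per_run = np.stack([
        # Rotate about the centroid so the structure stays in place.
        embed_fn((coords - center) @ random_rotation_matrix(rng).T + center)
        for _ in range(n_runs)
    ])
    return per_run, per_run.mean(axis=0)


def toy_embed(coords):
    """Hypothetical stand-in for voxelize + encode; deliberately not rotation invariant."""
    return np.concatenate([coords.min(axis=0), coords.max(axis=0)])


coords = np.random.default_rng(1).normal(size=(50, 3))
per_run, mean_embedding = embed_with_rotations(coords, toy_embed, n_runs=4)
spread = per_run.std(axis=0)  # per-dimension spread across runs
```

The same pattern applies to predictions: keep the per-run values to inspect their spread, and report the average.
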
## Citation

If you use this model, cite:

1. **FoldVision bioRxiv manuscript**:

```bibtex
@article{foldvision_biorxiv,
  title = {FoldVision: A compute-efficient atom-level 3D protein encoder},
  author = {Kroll, Alexander and Yadav, Shantanu and Lercher, Martin J.},
  journal = {bioRxiv},
  year = {2026},
  doi = {10.64898/2026.01.23.701326},
  url = {https://doi.org/10.64898/2026.01.23.701326}
}
```

2. The GitHub repository:

```bibtex
@misc{foldvision_github,
  title = {FoldVision code repository},
  author = {Kroll, Alexander},
  year = {2026},
  howpublished = {\url{https://github.com/AlexanderKroll/foldvision}}
}
```

## Model Card Contact

For issues or questions, use the GitHub issue tracker in the FoldVision repository.