AlexanderKroll
/

foldvision-encoder

Feature Extraction

structural-biology

representation-learning


	---
	license: mit
	library_name: pytorch
	pipeline_tag: feature-extraction
	tags:
	- protein
	- structural-biology
	- representation-learning
	- 3d-cnn
	- foldvision
	---

	# FoldVision Encoder

	## Model Summary

	FoldVision is a protein 3D-CNN encoder that maps a voxelized protein structure to a fixed-size embedding (`1024` dimensions).

	Primary task:
	- Protein feature extraction from 3D structure.

	Typical downstream tasks (with finetuning heads):
	- Protein-only regression/classification.
	- Protein-small molecule interaction (PSI) prediction when combined with a SMILES encoder.

	GitHub code: [foldvision_github](https://github.com/AlexanderKroll/foldvision)

	## Model Details

	- Model name: `AlexanderKroll/foldvision-encoder`
	- Architecture: 3D CNN encoder with GroupNorm blocks and global pooling.
	- Framework: PyTorch
	- Input channels: 5 atom-type channels (`C`, `N`, `S`, `O`, `P`)
	- Output: `(B, 1024)` embedding


	## Input and Preprocessing

	This model expects FoldVision voxel tensors generated from PDB structures.

	Recommended preprocessing pipeline:
	1. Convert `.pdb` files to sparse point lists (`numpy_3D_point_lists/*.npz`).
	2. Use `bounding_boxes.npy` together with the dataloader to construct dense voxel tensors at runtime.
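
To illustrate the sparse-to-dense step, here is a minimal NumPy sketch. It is not the repository's dataloader: the function `voxelize`, its parameters, the 1 Å voxel size, the binary occupancy values, and the channel index assignment are all assumptions made for illustration. The actual layout of the `.npz` point lists and of `bounding_boxes.npy` is defined by `scripts/preprocess_pdb_dir.py`.

```python
import numpy as np

# Channel order follows the listing in this card (C, N, S, O, P);
# the index assignment itself is an assumption.
ATOM_CHANNELS = {"C": 0, "N": 1, "S": 2, "O": 3, "P": 4}


def voxelize(coords, elements, grid_shape, origin, voxel_size=1.0):
    """Hypothetical helper: scatter atoms into a dense (5, Z, Y, X) grid.

    coords     : (N, 3) array-like of x, y, z coordinates
    elements   : length-N sequence of element symbols
    grid_shape : (Z, Y, X) number of voxels per axis
    origin     : x, y, z coordinate of the grid corner
    """
    depth, height, width = grid_shape
    grid = np.zeros((len(ATOM_CHANNELS), depth, height, width), dtype=np.float32)
    offsets = (np.asarray(coords, dtype=float) - np.asarray(origin, dtype=float)) / voxel_size
    indices = np.floor(offsets).astype(int)  # integer voxel index per atom (x, y, z)
    for (ix, iy, iz), element in zip(indices, elements):
        channel = ATOM_CHANNELS.get(element)
        if channel is None:
            continue  # atom types without a channel (e.g. hydrogen) are skipped
        if 0 <= ix < width and 0 <= iy < height and 0 <= iz < depth:
            grid[channel, iz, iy, ix] = 1.0  # binary occupancy (assumed)
    return grid


# Toy example: two heavy atoms and one hydrogen in a 4 x 4 x 4 grid.
grid = voxelize(
    coords=[[0.5, 1.5, 2.5], [3.2, 0.1, 0.9], [1.0, 1.0, 1.0]],
    elements=["C", "O", "H"],
    grid_shape=(4, 4, 4),
    origin=(0.0, 0.0, 0.0),
)
batch = grid[np.newaxis]  # add the batch axis: (B, 5, Z, Y, X)
```

Keeping structures as sparse point lists on disk and densifying them only at load time avoids storing mostly empty 3D grids.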

	Repository scripts:
	- `scripts/preprocess_pdb_dir.py`
	- `scripts/embed_proteins.py`
	- `scripts/train.py`
	- `scripts/train_PSI.py`
	- `scripts/evaluate.py`
	- `scripts/evaluate_PSI.py`

	## Usage

	```python
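	from foldvision import FoldVisionEncoder

	# Load the pretrained encoder and switch it to inference mode.
	model = FoldVisionEncoder.from_pretrained("AlexanderKroll/foldvision-encoder")
	model.eval()

	# x: voxel tensor of shape (B, 5, Z, Y, X), one channel per atom type (C, N, S, O, P)
	# z = model(x)  # embedding of shape (B, 1024)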
	```

	## Multi-Run Embeddings and Predictions

	FoldVision pipelines support repeated runs with random 3D rotations (test-time augmentation).

	- Embeddings:
	  - per-run: keep each run-specific embedding,
	  - aggregated: use mean embedding for a stable representation.
	- Predictions:
	  - per-run predictions can be used to inspect spread/uncertainty,
	  - averaged predictions are recommended for reporting.
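
The multi-run idea can be sketched in a few lines of NumPy. This is an illustrative sketch, not the pipeline implemented in `scripts/embed_proteins.py`: the helpers `random_rotation_matrix` and `multi_run_embeddings` are hypothetical, rotations are applied to atom coordinates around the centroid (an assumption), and `embed_fn` stands in for the full voxelize-and-encode step.

```python
import numpy as np


def random_rotation_matrix(rng):
    """Draw a random 3x3 rotation matrix (orthogonal, det = +1) via QR."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # make the factorization sign-consistent
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to turn a reflection into a rotation
    return q


def multi_run_embeddings(coords, embed_fn, n_runs=5, seed=0):
    """Embed randomly rotated copies of a structure and aggregate the runs.

    embed_fn is a placeholder for "voxelize, then call the encoder".
    Returns (per_run, mean, std) with per_run of shape (n_runs, D).
    """
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    center = coords.mean(axis=0)
    runs = []
    for _ in range(n_runs):
        rot = random_rotation_matrix(rng)
        rotated = (coords - center) @ rot.T + center  # rotate about the centroid
        runs.append(embed_fn(rotated))
    per_run = np.stack(runs)
    return per_run, per_run.mean(axis=0), per_run.std(axis=0)


# Toy stand-in for the encoder: a rotation-invariant one-dimensional "embedding".
def toy_embed(xyz):
    return np.array([np.linalg.norm(xyz - xyz.mean(axis=0), axis=1).sum()])


points = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 2.0, 0.5]])
per_run, mean_emb, std_emb = multi_run_embeddings(points, toy_embed, n_runs=4)
```

The per-run standard deviation gives a rough view of how sensitive an embedding or prediction is to orientation, while the mean is the value to report.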


	## Citation

	If you use this model, cite:

	1. FoldVision bioRxiv manuscript:

	```bibtex
	@article{foldvision_biorxiv,
	title = {FoldVision: A compute-efficient atom-level 3D protein encoder},
	author = {Kroll, Alexander and Yadav, Shantanu and Lercher, Martin J.},
	journal = {bioRxiv},
	year = {2026},
	doi = {10.64898/2026.01.23.701326},
	url = {https://doi.org/10.64898/2026.01.23.701326}
	}
	```

	2. The GitHub repository:

	```bibtex
	@misc{foldvision_github,
	title = {FoldVision code repository},
	author = {Kroll, Alexander},
	year = {2026},
	howpublished = {\url{https://github.com/AlexanderKroll/foldvision}}
	}
	```

	## Model Card Contact

	For issues or questions, use the GitHub issue tracker in the FoldVision repository.