---
library_name: timm
license: mit
tags:
- pathology
- histopathology
- feature-extraction
- computer-vision
- vit
- dinov2
- pytorch
- foundation-model
pipeline_tag: feature-extraction
model-index:
- name: GPFM
  results: []
---

# GPFM: Generalizable Pathology Foundation Model

GPFM is a pathology foundation model based on ViT-L/14 (DINOv2 configuration) for extracting general-purpose visual features from histopathology tiles/patches. It supports downstream whole-slide-image (WSI) tasks, including MIL classification, prognosis and survival analysis, and ROI-level tasks. The architecture corresponds to `vit_large_patch14_dinov2.lvd142m` in `timm`, using DINOv2's key hyperparameters (`img_size=224`, `init_values=1e-5`).

Paper and resources:

- Paper: https://arxiv.org/abs/2407.18449
- Project: https://github.com/birkhoffkiki/GPFM

## Model Overview

- Architecture: ViT-L/14 (DINOv2 configuration)
- Input: RGB images; recommended original tile size ~512×512 at 40× magnification, resized to 224×224 during preprocessing
- Normalization: ImageNet mean and std (see preprocessing below)
- Feature dimension: 1024 (`model(x)` outputs `[N, 1024]`)
- Use cases: general pathology tile feature extraction and transfer to downstream tasks

## Quick Start (Minimal Example with timm + huggingface_hub)

```python
import torch
from PIL import Image
import timm
from huggingface_hub import hf_hub_download
from torchvision import transforms

# 1) Download the weights
ckpt_path = hf_hub_download(repo_id="majiabo/GPFM", filename="GPFM.pth")

# 2) Build the ViT-L/14 (DINOv2 config) model
model = timm.create_model(
    'vit_large_patch14_dinov2.lvd142m',
    pretrained=False,
    img_size=224,
    init_values=1e-5,
)
state_dict = torch.load(ckpt_path, map_location='cpu')
model.load_state_dict(state_dict, strict=True)
model.eval()

# 3) Preprocessing (consistent with the GPFM project)
mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)
transform = transforms.Compose([
    transforms.Resize((224, 224), interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std),
])

# 4) Extract features
img = Image.open('your_tile_512x512.jpg').convert('RGB')
x = transform(img).unsqueeze(0)  # [1, 3, 224, 224]
with torch.no_grad():
    feat = model(x)  # [1, 1024]
print(feat.shape)
```
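
The extracted embeddings can be consumed directly by downstream code. As one illustration, a cosine-similarity retrieval sketch over a bank of tile features; the feature bank, its size, and the random query here are stand-ins rather than part of the GPFM pipeline:

```python
import torch
import torch.nn.functional as F

# Stand-in tensors: in practice these would come from model(x) as above.
bank = torch.randn(1000, 1024)   # feature bank of 1000 extracted tiles
query = torch.randn(1, 1024)     # feature of one query tile

# L2-normalize so the dot product equals cosine similarity.
bank_n = F.normalize(bank, dim=1)
query_n = F.normalize(query, dim=1)

sims = query_n @ bank_n.T        # [1, 1000] cosine similarities
top5 = sims.topk(5, dim=1)       # indices of the 5 most similar tiles
print(top5.indices.shape)        # torch.Size([1, 5])
```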

Dependencies:

```bash
pip install torch torchvision timm huggingface_hub pillow
```
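
For WSI-level tasks such as MIL classification, per-tile features are typically aggregated into a single slide embedding. A minimal attention-pooling sketch over a bag of 1024-dimensional tile features; the `AttnMILPool` module, its hidden size, and the class count are illustrative assumptions, not the GPFM training code:

```python
import torch
import torch.nn as nn

class AttnMILPool(nn.Module):
    """Attention-based MIL pooling over a bag of tile features."""
    def __init__(self, in_dim=1024, hidden=256, n_classes=2):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),
        )
        self.head = nn.Linear(in_dim, n_classes)

    def forward(self, feats):                             # feats: [N, 1024]
        weights = torch.softmax(self.attn(feats), dim=0)  # [N, 1], sums to 1
        slide = (weights * feats).sum(dim=0)              # [1024] slide embedding
        return self.head(slide), weights                  # logits [n_classes]

feats = torch.randn(128, 1024)   # stand-in for GPFM features of one slide's tiles
pool = AttnMILPool()
logits, weights = pool(feats)
print(logits.shape, weights.shape)  # torch.Size([2]) torch.Size([128, 1])
```

The attention weights double as a crude tile-importance map, which is one reason this family of aggregators is common in computational pathology.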

## Citation

```bibtex
@article{ma2025generalizable,
  title={A generalizable pathology foundation model using a unified knowledge distillation pretraining framework},
  author={Ma, Jiabo and Guo, Zhengrui and Zhou, Fengtao and Wang, Yihui and Xu, Yingxue and Li, Jinbang and Yan, Fang and Cai, Yu and Zhu, Zhengjie and Jin, Cheng and others},
  journal={Nature Biomedical Engineering},
  pages={1--20},
  year={2025},
  publisher={Nature Publishing Group UK London}
}
```