---
library_name: timm
license: mit
tags:
- pathology
- histopathology
- feature-extraction
- computer-vision
- vit
- dinov2
- pytorch
- foundation-model
pipeline_tag: feature-extraction
model-index:
- name: GPFM
  results: []
---

# GPFM: Generalizable Pathology Foundation Model

GPFM is a pathology foundation model built on ViT-L/14 in the DINOv2 configuration. It extracts general-purpose visual features from histopathology tiles (patches) and supports downstream whole-slide image (WSI) tasks, including multiple-instance learning (MIL) classification, prognosis, survival analysis, and region-of-interest (ROI) tasks. The architecture corresponds to `vit_large_patch14_dinov2.lvd142m` in `timm`, instantiated with `img_size=224` and `init_values=1e-5`.

Paper and Resources:
- Paper: https://arxiv.org/abs/2407.18449
- Project: https://github.com/birkhoffkiki/GPFM

## Model Overview
- Architecture: ViT-L/14 (DINOv2 configuration)
- Input: RGB images; the recommended source tile size is about 512×512 at 40× magnification, resized to 224×224 during preprocessing
- Normalization: ImageNet mean and std (see preprocessing below)
- Feature dimension: 1024 (`model(x)` outputs `[N, 1024]`)
- Use cases: General pathology tile feature extraction and downstream task transfer
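
As a quick sanity check on the geometry implied by the list above, the sketch below works out how many tokens a 224×224 input yields with 14×14 patches and how strongly a 512×512 source tile is downscaled. The helper name `vit_token_geometry` is hypothetical, and the single prefix (class) token is an assumption based on the DINOv2 ViT variant without register tokens.

```python
def vit_token_geometry(img_size=224, patch_size=14, tile_size=512, num_prefix_tokens=1):
    """Return patch-grid and resize arithmetic for a ViT input (illustrative helper)."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    grid = img_size // patch_size        # patches per side
    num_patches = grid * grid            # patch tokens per image
    return {
        "grid": grid,
        "num_patches": num_patches,
        # Assumption: one class token and no register tokens.
        "num_tokens": num_patches + num_prefix_tokens,
        # Linear downscale factor applied to the source tile.
        "scale": img_size / tile_size,
    }


geometry = vit_token_geometry()
print(geometry)
```

With the defaults this gives a 16×16 patch grid. Regardless of the token count, `model(x)` returns one 1024-dimensional vector per tile, as listed above.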

## Quick Start (Minimal Example with timm + huggingface_hub)

```python
import torch
from PIL import Image
import timm
from huggingface_hub import hf_hub_download
from torchvision import transforms

# 1) Download weights
ckpt_path = hf_hub_download(repo_id="majiabo/GPFM", filename="GPFM.pth")

# 2) Build ViT-L/14 (DINOv2 config) model
model = timm.create_model(
    'vit_large_patch14_dinov2.lvd142m',
    pretrained=False,
    img_size=224,
    init_values=1.0e-05,
)
state_dict = torch.load(ckpt_path, map_location='cpu')
model.load_state_dict(state_dict, strict=True)
model.eval()

# 3) Preprocessing (consistent with GPFM project)
mean = (0.485, 0.456, 0.406)
std  = (0.229, 0.224, 0.225)
transform = transforms.Compose([
    transforms.Resize((224, 224), interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std),
])

# 4) Extract features
img = Image.open('your_tile_512x512.jpg').convert('RGB')
x = transform(img).unsqueeze(0)   # [1, 3, 224, 224]
with torch.no_grad():
    feat = model(x)               # [1, 1024]
print(feat.shape)
```
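
A slide usually yields many tiles, so features are typically extracted in mini-batches rather than one tile at a time. The following is a minimal sketch of that loop, not code from the GPFM project: the function name `extract_features` and its parameters are hypothetical, and a tiny stand-in model is used so the snippet runs without downloading weights.

```python
import torch
from torch import nn


@torch.no_grad()
def extract_features(model: nn.Module, tiles: torch.Tensor, batch_size: int = 32,
                     device: str = "cpu") -> torch.Tensor:
    """Run `model` over preprocessed tiles [N, 3, H, W] in mini-batches.

    Returns a CPU tensor of shape [N, D], where D is the model's output dimension.
    """
    model = model.to(device).eval()
    outputs = []
    for start in range(0, tiles.shape[0], batch_size):
        batch = tiles[start:start + batch_size].to(device)
        outputs.append(model(batch).cpu())
    return torch.cat(outputs, dim=0)


# Stand-in model so the sketch runs on its own (this is NOT GPFM).
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16))
toy_tiles = torch.randn(10, 3, 8, 8)
toy_feats = extract_features(toy_model, toy_tiles, batch_size=4)
print(toy_feats.shape)
```

In practice, `model` would be the GPFM model built above and `tiles` a stack of tensors produced by `transform`, in which case the result should have shape `[N, 1024]`.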

Dependencies:

```bash
pip install torch torchvision timm huggingface_hub pillow
```
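
For slide-level tasks such as MIL classification, per-tile features are commonly aggregated into a single slide representation. The sketch below shows one simple option, attention pooling in the style of attention-based MIL. It is an illustration only and not necessarily the aggregation used in the GPFM paper; the class name `AttentionMILHead`, the hidden size, and the number of classes are assumptions.

```python
import torch
from torch import nn


class AttentionMILHead(nn.Module):
    """Attention pooling over a bag of tile features, followed by a linear classifier."""

    def __init__(self, feat_dim: int = 1024, hidden_dim: int = 256, num_classes: int = 2):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, bag: torch.Tensor):
        # bag: [N, feat_dim], the features of all N tiles from one slide
        weights = torch.softmax(self.attention(bag), dim=0)  # [N, 1], sums to 1 over tiles
        slide_feat = (weights * bag).sum(dim=0)               # [feat_dim]
        return self.classifier(slide_feat), weights.squeeze(-1)


head = AttentionMILHead()
bag = torch.randn(50, 1024)   # random stand-in for 50 tile features of one slide
logits, weights = head(bag)
print(logits.shape, weights.shape)
```

The returned attention weights indicate how much each tile contributes to the slide-level prediction, which can be useful for inspecting which regions the head relies on.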


## Citation

```bibtex
@article{ma2025generalizable,
  title={A generalizable pathology foundation model using a unified knowledge distillation pretraining framework},
  author={Ma, Jiabo and Guo, Zhengrui and Zhou, Fengtao and Wang, Yihui and Xu, Yingxue and Li, Jinbang and Yan, Fang and Cai, Yu and Zhu, Zhengjie and Jin, Cheng and others},
  journal={Nature Biomedical Engineering},
  pages={1--20},
  year={2025},
  publisher={Nature Publishing Group UK London}
}
```