---
library_name: timm
license: mit
tags:
- pathology
- histopathology
- feature-extraction
- computer-vision
- vit
- dinov2
- pytorch
- foundation-model
pipeline_tag: feature-extraction
model-index:
- name: GPFM
  results: []
---

# GPFM: Generalizable Pathology Foundation Model

GPFM is a pathology foundation model based on ViT-L/14 (DINOv2 configuration) for extracting general-purpose visual features from histopathology tiles/patches. It supports downstream whole-slide-image (WSI) tasks, including MIL classification, prognosis and survival analysis, and ROI-level tasks. The architecture corresponds to `vit_large_patch14_dinov2.lvd142m` in `timm`, using DINOv2's key hyperparameters (`img_size=224`, `init_values=1e-5`).

Paper and resources:

- Paper: https://arxiv.org/abs/2407.18449
- Project: https://github.com/birkhoffkiki/GPFM

## Model Overview

- Architecture: ViT-L/14 (DINOv2 configuration)
- Input: RGB images; recommended original tile size ~512×512 at 40× magnification, resized to 224×224 during preprocessing
- Normalization: ImageNet mean and std (see preprocessing below)
- Feature dimension: 1024 (`model(x)` outputs `[N, 1024]`)
- Use cases: general pathology tile feature extraction and transfer to downstream tasks

## Quick Start (Minimal Example with timm + huggingface_hub)

```python
import torch
from PIL import Image
import timm
from huggingface_hub import hf_hub_download
from torchvision import transforms

# 1) Download the weights
ckpt_path = hf_hub_download(repo_id="majiabo/GPFM", filename="GPFM.pth")

# 2) Build the ViT-L/14 (DINOv2 config) model
model = timm.create_model(
    'vit_large_patch14_dinov2.lvd142m',
    pretrained=False,
    img_size=224,
    init_values=1e-5,
)
state_dict = torch.load(ckpt_path, map_location='cpu')
model.load_state_dict(state_dict, strict=True)
model.eval()

# 3) Preprocessing (consistent with the GPFM project)
mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)
transform = transforms.Compose([
    transforms.Resize((224, 224), interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std),
])

# 4) Extract features
img = Image.open('your_tile_512x512.jpg').convert('RGB')
x = transform(img).unsqueeze(0)  # [1, 3, 224, 224]
with torch.no_grad():
    feat = model(x)  # [1, 1024]
print(feat.shape)
```
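
The extracted embeddings can be consumed directly by downstream code. As one illustration, a cosine-similarity retrieval sketch over a bank of tile features; the feature bank, its size, and the random query here are stand-ins rather than part of the GPFM pipeline:

```python
import torch
import torch.nn.functional as F

# Stand-in tensors: in practice these would come from model(x) as above.
bank = torch.randn(1000, 1024)   # feature bank of 1000 extracted tiles
query = torch.randn(1, 1024)     # feature of one query tile

# L2-normalize so the dot product equals cosine similarity.
bank_n = F.normalize(bank, dim=1)
query_n = F.normalize(query, dim=1)

sims = query_n @ bank_n.T        # [1, 1000] cosine similarities
top5 = sims.topk(5, dim=1)       # indices of the 5 most similar tiles
print(top5.indices.shape)        # torch.Size([1, 5])
```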

Dependencies:

```bash
pip install torch torchvision timm huggingface_hub pillow
```
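
For WSI-level tasks such as MIL classification, per-tile features are typically aggregated into a single slide embedding. A minimal attention-pooling sketch over a bag of 1024-dimensional tile features; the `AttnMILPool` module, its hidden size, and the class count are illustrative assumptions, not the GPFM training code:

```python
import torch
import torch.nn as nn

class AttnMILPool(nn.Module):
    """Attention-based MIL pooling over a bag of tile features."""
    def __init__(self, in_dim=1024, hidden=256, n_classes=2):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),
        )
        self.head = nn.Linear(in_dim, n_classes)

    def forward(self, feats):                             # feats: [N, 1024]
        weights = torch.softmax(self.attn(feats), dim=0)  # [N, 1], sums to 1
        slide = (weights * feats).sum(dim=0)              # [1024] slide embedding
        return self.head(slide), weights                  # logits [n_classes]

feats = torch.randn(128, 1024)   # stand-in for GPFM features of one slide's tiles
pool = AttnMILPool()
logits, weights = pool(feats)
print(logits.shape, weights.shape)  # torch.Size([2]) torch.Size([128, 1])
```

The attention weights double as a crude tile-importance map, which is one reason this family of aggregators is common in computational pathology.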

## Citation

```bibtex
@article{ma2025generalizable,
  title={A generalizable pathology foundation model using a unified knowledge distillation pretraining framework},
  author={Ma, Jiabo and Guo, Zhengrui and Zhou, Fengtao and Wang, Yihui and Xu, Yingxue and Li, Jinbang and Yan, Fang and Cai, Yu and Zhu, Zhengjie and Jin, Cheng and others},
  journal={Nature Biomedical Engineering},
  pages={1--20},
  year={2025},
  publisher={Nature Publishing Group UK London}
}
```