---
license: apache-2.0
library_name: timm
tags:
  - histopathology
  - pathology
  - dino
  - vision-transformer
  - prostate
  - feature-extraction
pipeline_tag: image-feature-extraction
---

# Prost40M

**Prost40M** is a prostatectomy-specific foundation model pretrained with DINO on a large corpus of H&E prostatectomy slides.
It is designed as a strong feature extractor for computational pathology tasks where subtle prostate-specific morphology matters.

## Model At a Glance

| Field | Value |
| --- | --- |
| Model name | Prost40M |
| Backbone architecture | `vit_small` |
| Input size | `224 x 224` |
| Patch size | `14` |
| Embedding dimension | `384` |
| Released weights | Teacher backbone encoder |
| Domain | H&E prostatectomy histopathology |

## Quickstart

```python
import torch
import timm
from PIL import Image
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform

model = timm.create_model("hf-hub:waticlems/Prost40M", pretrained=True)
model.eval()

transform = create_transform(**resolve_data_config(model.pretrained_cfg, model=model))

img = Image.open("tile.png").convert("RGB")
x = transform(img).unsqueeze(0)

with torch.inference_mode():
    embedding = model(x)  # shape: [1, 384]

print(embedding.shape)
```

## Motivation

Large pathology foundation models are typically trained on broad, multi-organ data.
Their generic features transfer well across many settings, but can be less sensitive to fine-grained morphology of a specific organ.
Prost40M was developed to evaluate the value of organ-specific pretraining in prostate histopathology.

## Training Data

- Approx. 40 million image tiles at `0.50` microns per pixel
- 1888 H&E-stained prostatectomy slides
  - 449 slides from 403 patients in the TCGA-PRAD cohort
  - 1439 slides from 508 patients in the LEOPARD cohort

## Intended Use

- Tile-level feature extraction for downstream prostate histopathology tasks

## Limitations

- Performance can degrade under domain shift (scanner, stain protocol, center)
- Learned representations reflect dataset composition and preprocessing choices

## License

Apache-2.0

## Citation

If you use **Prost40M**, cite:

- _citation to be added soon_
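
## Worked Example: Tile Geometry

The input size, patch size, and training resolution listed in this card determine how many patches the backbone sees per tile and how much tissue each one covers. The snippet below is a minimal sketch of that arithmetic, not part of the released code. The helper name `tile_geometry` is hypothetical, and the physical sizes assume tiles are fed to the model at the native `0.50` microns per pixel with no rescaling by the preprocessing transform.

```python
def tile_geometry(input_size: int = 224, patch_size: int = 14, mpp: float = 0.50) -> dict:
    """Derive the ViT patch grid and the physical footprint of one tile.

    Hypothetical helper for illustration. Defaults come from the model card;
    physical sizes assume the tile is not rescaled before inference.
    """
    if input_size % patch_size != 0:
        raise ValueError("input_size must be divisible by patch_size")

    patches_per_side = input_size // patch_size
    return {
        "patches_per_side": patches_per_side,        # patches along one edge
        "num_patch_tokens": patches_per_side ** 2,   # patches per tile
        "tile_side_um": input_size * mpp,            # tile edge in microns
        "patch_side_um": patch_size * mpp,           # patch edge in microns
    }


geometry = tile_geometry()
print(geometry)
# {'patches_per_side': 16, 'num_patch_tokens': 256, 'tile_side_um': 112.0, 'patch_side_um': 7.0}
```

Under these assumptions, each `224 x 224` tile spans roughly 112 microns per side and is split into a 16 x 16 grid of patches of about 7 microns each. If your slides are scanned at a different resolution, pass the actual `mpp` value to see how the physical footprint changes.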