waticlems committed · verified
Commit 3decc6a · 1 Parent(s): 9075df7

Add Prost40M model weights and metadata

Files changed (6)
  1. LICENSE +3 -0
  2. README.md +71 -0
  3. config.json +43 -0
  4. model.safetensors +3 -0
  5. payload_manifest.json +34 -0
  6. preprocessor_config.json +21 -0
LICENSE ADDED
@@ -0,0 +1,3 @@
+ Apache License
+ Version 2.0, January 2004
+ https://www.apache.org/licenses/LICENSE-2.0
README.md ADDED
@@ -0,0 +1,71 @@
+ # Prost40M
+
+ **Prost40M** is a prostatectomy-specific foundation model pretrained with DINO on a large corpus of H&E prostatectomy slides.
+ It is designed as a strong feature extractor for computational pathology tasks where subtle prostate-specific morphology matters.
+
+ ## Model at a Glance
+
+ | Field | Value |
+ | --- | --- |
+ | Model name | Prost40M |
+ | Backbone architecture | `vit_small` |
+ | Input size | `224 x 224` |
+ | Patch size | `14` |
+ | Embedding dimension | `384` |
+ | Released weights | Teacher backbone encoder |
+ | Domain | H&E prostatectomy histopathology |
+
+ ## Quickstart
+
+ ```python
+ import torch
+ import timm
+ from PIL import Image
+ from timm.data import resolve_data_config
+ from timm.data.transforms_factory import create_transform
+
+ model = timm.create_model("hf-hub:waticlems/Prost40M", pretrained=True)
+ model.eval()
+
+ transform = create_transform(**resolve_data_config(model.pretrained_cfg, model=model))
+
+ img = Image.open("tile.png").convert("RGB")
+ x = transform(img).unsqueeze(0)
+ with torch.inference_mode():
+     embedding = model(x)  # shape: [1, 384]
+ print(embedding.shape)
+ ```
+
+ ## Motivation
+
+ Large pathology foundation models are typically trained on broad, multi-organ
+ data. Their generic features transfer well across many settings, but can be less
+ sensitive to the fine-grained morphology of a specific organ. Prost40M was developed
+ to evaluate the value of organ-specific pretraining in prostate histopathology.
+
+ ## Training Data
+
+ - Approx. 40 million image tiles at `0.50` microns per pixel
+ - 1888 H&E-stained prostatectomy slides
+   - 449 slides from 403 patients in the TCGA-PRAD cohort
+   - 1439 slides from 508 patients in the LEOPARD cohort
+
+ ## Intended Use
+
+ - Tile-level feature extraction for downstream prostate histopathology tasks
+
+ ## Limitations
+
+ - Performance can degrade under domain shift (scanner, stain protocol, center)
+ - Learned representations reflect dataset composition and preprocessing choices
+
+ ## License
+
+ Apache-2.0
+
+ ## Citation
+
+ If you use **Prost40M**, please cite:
+
+ - _citation to be added soon_
config.json ADDED
@@ -0,0 +1,43 @@
+ {
+   "architecture": "vit_small_patch16_224",
+   "model_args": {
+     "patch_size": 14,
+     "img_size": 224,
+     "num_classes": 0
+   },
+   "num_classes": 0,
+   "pretrained_cfg": {
+     "architecture": "vit_small_patch16_224",
+     "custom_load": false,
+     "input_size": [
+       3,
+       224,
+       224
+     ],
+     "fixed_input_size": true,
+     "interpolation": "bicubic",
+     "crop_pct": 1.0,
+     "mean": [
+       0.485,
+       0.456,
+       0.406
+     ],
+     "std": [
+       0.229,
+       0.224,
+       0.225
+     ],
+     "first_conv": "patch_embed.proj",
+     "classifier": "head",
+     "num_features": 384,
+     "license": "apache-2.0",
+     "tags": [
+       "histopathology",
+       "self-supervised-learning",
+       "dino",
+       "vision-transformer",
+       "prostate",
+       "he-stain"
+     ]
+   }
+ }
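The config pairs a `224` px input with `14` px patches and a `384`-dim embedding. A quick pure-Python sanity check of the geometry these values imply (the variable names below are ours, not from the repo):

```python
# Sanity-check the geometry implied by config.json: a 224 px input cut
# into 14 px patches gives a 16 x 16 token grid; ViT prepends one CLS
# token, and vit_small embeds every token into 384 dimensions.
img_size, patch_size, embed_dim = 224, 14, 384

grid = img_size // patch_size   # patches per side
num_patches = grid * grid       # total patch tokens
num_tokens = num_patches + 1    # + CLS token

print(grid, num_patches, num_tokens, embed_dim)  # -> 16 256 257 384
```

This also explains why the `[1, 384]` output in the Quickstart is a single pooled vector rather than a per-token sequence: with `num_classes: 0`, timm returns the backbone's pooled feature of width `num_features = 384`.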
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8dd486b73a30adbc6d8dc6419876d01ce19bf897ac3db1e86afa65323a0ab7f4
+ size 86492048
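The weights file is stored as a Git LFS pointer: the repo records only the `oid` (a SHA-256 digest) and the byte `size`, and the binary is fetched separately. After downloading, a local copy can be checked against those two values with the standard library alone; a minimal sketch (the function name `verify_download` is ours):

```python
import hashlib
import os

def verify_download(path, expected_sha256, expected_size):
    """Check a downloaded file against a Git LFS pointer's oid and size."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Stream in 1 MiB chunks so large weight files are not read at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

For `model.safetensors` the expected values are the `sha256:...` hash and `86492048` bytes shown in the pointer above.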
payload_manifest.json ADDED
@@ -0,0 +1,34 @@
+ {
+   "repo_id": "waticlems/Prost40M",
+   "artifact_dir": "/data/pathology/projects/clement/models/hf/Prost40M",
+   "payload_dir": "/data/pathology/projects/clement/models/hf/Prost40M/payload",
+   "license": "apache-2.0",
+   "private": false,
+   "built_at_utc": "2026-02-20T15:09:01.002806+00:00",
+   "files": [
+     "LICENSE",
+     "README.md",
+     "config.json",
+     "model.safetensors",
+     "payload_manifest.json",
+     "preprocessor_config.json"
+   ],
+   "timm_validation": {
+     "model_name": "vit_small_patch16_224",
+     "model_args": {
+       "patch_size": 14,
+       "img_size": 224,
+       "num_classes": 0
+     },
+     "input_shape": [
+       1,
+       3,
+       224,
+       224
+     ],
+     "output_shape": [
+       1,
+       384
+     ]
+   }
+ }
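The manifest's `files` array enumerates every artifact the upload should contain, which makes completeness easy to verify after a clone or download. A minimal sketch of how such a manifest might be consumed (the helper name `missing_files` is ours):

```python
import json
import os

def missing_files(manifest_path):
    """Return manifest-listed files absent from the manifest's own directory."""
    with open(manifest_path) as f:
        manifest = json.load(f)
    base = os.path.dirname(os.path.abspath(manifest_path))
    return [name for name in manifest["files"]
            if not os.path.exists(os.path.join(base, name))]
```

An empty return value means the local copy matches the recorded file list (it does not check contents; pair it with the hash check for `model.safetensors`).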
preprocessor_config.json ADDED
@@ -0,0 +1,21 @@
+ {
+   "do_resize": false,
+   "do_center_crop": true,
+   "crop_size": {
+     "height": 224,
+     "width": 224
+   },
+   "do_rescale": false,
+   "do_normalize": true,
+   "image_mean": [
+     0.485,
+     0.456,
+     0.406
+   ],
+   "image_std": [
+     0.229,
+     0.224,
+     0.225
+   ],
+   "do_convert_rgb": true
+ }
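In words: convert to RGB, skip resizing and rescaling, center-crop to `224 x 224`, then normalize with the ImageNet mean and std listed above. A minimal pure-Python sketch of that arithmetic, under the assumption that pixel values are already in `[0, 1]` since `do_rescale` is false (in practice, `create_transform` in the Quickstart builds the equivalent pipeline):

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def center_crop(rows, size):
    """Center-crop an image given as a list of rows to size x size."""
    h, w = len(rows), len(rows[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in rows[top:top + size]]

def normalize(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Per-channel normalization of one RGB pixel already in [0, 1]."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, std))
```

A pixel exactly at the channel means normalizes to `(0.0, 0.0, 0.0)`; real pipelines do the same operation vectorized over the whole tensor.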