# 🧠 Self-Supervised Vision Transformers for Prostate Cancer Gleason Grading
**Summary.** This repository provides pretrained self-supervised Vision Transformer (ViT) models (DINO-v1/v2) for automated Gleason grade classification from prostate cancer whole-slide images (WSIs). Beyond high accuracy, the models emphasize reproducibility and potential clinical decision-support value by addressing inter-observer variability and workload in pathology.
## 🩺 Clinical Motivation & Impact
Manual Gleason grading is time-consuming and subject to inter-observer variability, which can affect diagnostic consistency and treatment planning. By learning label-efficient, stain-robust representations with self-supervised learning (SSL), these models aim to:
- improve grading reproducibility,
- reduce pathologist workload,
- and facilitate timely, consistent decision-making as part of a decision-support pipeline.
> **Note:** These models are intended for research and development of clinical decision-support systems; they are not cleared for direct clinical use.
## Model Description
- Framework: DINO (self-distillation without labels)
- Backbones evaluated: ViT-B/16 (DINO-v1), ViT-L/14 (DINO-v2)
- Downstream heads: MLP (best), k-NN, CNN heads (e.g., DenseNet)
- Training objective: SSL pretraining on histology tiles, followed by supervised fine-tuning for Gleason classes
- Library: PyTorch / Hugging Face Transformers
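The "frozen SSL features + lightweight head" pattern above can be sketched in plain PyTorch. Here 768 is the ViT-B/16 CLS-embedding size and 9 the number of Gleason classes; the hidden width and dropout rate are illustrative assumptions, not the trained configuration:

```python
import torch
import torch.nn as nn

# Lightweight MLP head over frozen DINO ViT-B/16 CLS features (768-dim).
# Hidden width and dropout are illustrative, not the paper's values.
head = nn.Sequential(
    nn.Linear(768, 256),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(256, 9),   # nine Gleason classes: 3+3 ... 5+5
)
head.eval()

feats = torch.randn(4, 768)  # a batch of 4 precomputed tile embeddings (dummy)
with torch.no_grad():
    logits = head(feats)
print(logits.shape)  # torch.Size([4, 9])
```

In this setup the backbone is run once to cache tile embeddings, and only the small head is trained, which is what makes the approach label-efficient.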
## 🧪 Data & Training
- Dataset: TCGA-PRAD
- Cases / slides / patches: 403 patients, 449 diagnostic slides, 81,126 tiles (224×224 px)
- Split: patient-level (no patient leakage); 80% train / 20% test
- Preprocessing: standard WSI tiling; color/stain variability present in TCGA
- Goal: multi-class Gleason grade classification
**Classes:** 3+3, 3+4, 3+5, 4+3, 4+4, 4+5, 5+3, 5+4, 5+5
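The patient-level split can be sketched with scikit-learn's `GroupShuffleSplit`, which guarantees no patient contributes tiles to both partitions. The toy records below are illustrative, not TCGA-PRAD data:

```python
from sklearn.model_selection import GroupShuffleSplit

# Toy tile records: (tile_id, patient_id, label). In practice these come
# from the WSI tiling step; values here are purely illustrative.
tiles = [(f"tile_{i}", f"patient_{i % 5}", "3+4") for i in range(20)]
groups = [patient for _, patient, _ in tiles]

# Split by patient (group), not by tile, so embeddings of the same
# patient never leak across train/test.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(tiles, groups=groups))

train_patients = {groups[i] for i in train_idx}
test_patients = {groups[i] for i in test_idx}
assert train_patients.isdisjoint(test_patients)  # no patient leakage
print(len(train_idx), "train tiles /", len(test_idx), "test tiles")
```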
## Results
Best overall configuration: DINO-v1 ViT-B/16 + MLP
| Model | Classifier | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) |
|---|---|---|---|---|---|
| DINO-v1 (ViT-B/16) | MLP | 90.40 | 90.40 | 90.39 | 90.36 |
| DINO-v1 (ViT-B/16) | k-NN | 89.36 | 89.19 | 89.38 | 89.16 |
| DINO-v2 (ViT-L/14) | MLP | 83.31 | 83.16 | 83.31 | 83.17 |
| CNN (DenseNet) | FC head | 87.86 | 87.17 | 87.86 | 87.20 |
- ROC-AUC: 0.991 (MLP)
- Agreement: Cohen's κ = 0.89 (almost perfect agreement on the Landis–Koch scale)
- Feature analysis: SSL features were robust to stain variation and histologic heterogeneity; features 11 and 13 were the most discriminative, while feature 19 carried little discriminative power.

Interpretation. SSL-derived ViT features outperformed supervised baselines and were more robust to known sources of variability in computational pathology, supporting reproducible grading and potential clinical workflow integration after appropriate external validation.
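As a sketch of how the tabulated metrics are typically computed, the snippet below scores toy tile-level predictions with scikit-learn. The roughly 90% agreement rate is simulated and unrelated to the reported results:

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score

# Toy tile-level predictions over the nine Gleason classes.
classes = ["3+3", "3+4", "3+5", "4+3", "4+4", "4+5", "5+3", "5+4", "5+5"]
rng = np.random.default_rng(0)
y_true = rng.integers(0, len(classes), size=200)
# Keep ~90% of labels, corrupt the rest with random classes.
y_pred = np.where(rng.random(200) < 0.9, y_true,
                  rng.integers(0, len(classes), size=200))

acc = accuracy_score(y_true, y_pred)
macro_f1 = f1_score(y_true, y_pred, average="macro")
kappa = cohen_kappa_score(y_true, y_pred)
print(f"accuracy={acc:.3f}  macro-F1={macro_f1:.3f}  kappa={kappa:.3f}")
```

Cohen's κ corrects raw agreement for chance, which is why it is reported alongside accuracy for a 9-class problem.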
## Key Contributions
- A unified multi-SSL pipeline (DINO-v1/v2, iBOT, token registration) for Gleason grading.
- Evidence that SSL features improve robustness/generalization over supervised baselines.
- Feature-level statistical validation (e.g., ANOVA, discriminant power).
- Pretrained weights released for reproducibility and benchmarking.
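The feature-level statistical validation mentioned above can be illustrated with a per-dimension one-way ANOVA (SciPy's `f_oneway`): for each feature, test whether its mean differs across classes, then rank features by F-statistic. The data below are synthetic, with only feature 0 made class-dependent:

```python
import numpy as np
from scipy.stats import f_oneway

# Synthetic setup: 3 classes x 30 tiles x 16-dim features.
# Feature 0 is made class-dependent; all others are pure noise.
rng = np.random.default_rng(0)
n_classes, n_per, dim = 3, 30, 16
feats = rng.normal(size=(n_classes, n_per, dim))
feats[:, :, 0] += np.arange(n_classes)[:, None]  # shift feature 0 per class

# One-way ANOVA per feature dimension: does its mean differ across classes?
f_stats = [f_oneway(*[feats[c, :, d] for c in range(n_classes)]).statistic
           for d in range(dim)]
best = int(np.argmax(f_stats))
print("most discriminative feature:", best)  # feature 0 by construction
```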
## 🩸 Intended Use
Research use only. Suitable for:
- prototyping automated pathology tools,
- benchmarking histopathology classifiers,
- exploring self-supervised learning in medical imaging,
- building decision-support pipelines (with additional validation).
⚠️ **Not for clinical use.** External, multi-center validation and regulatory clearance are required before any deployment that affects patient care.
## ⚖️ Limitations & Ethical Considerations
- Domain shift: Trained on TCGA-PRAD; performance may vary with scanner type, staining protocol, lab workflow, or demographics.
- Generalization: Requires multi-institutional external validation.
- Fairness & bias: Assess subgroup performance before deployment.
- Human-in-the-loop: Models should augment, not replace, expert pathology review.
## How to Use
```python
from transformers import AutoImageProcessor, ViTForImageClassification
from PIL import Image
import torch

model_id = "buseyaren/self-supervised-prostate-cancer"
processor = AutoImageProcessor.from_pretrained(model_id)
model = ViTForImageClassification.from_pretrained(model_id)
model.eval()

# Classify a single 224x224 H&E tile.
img = Image.open("example_tile.png").convert("RGB")
inputs = processor(images=img, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1)
pred_id = probs.argmax(dim=-1).item()
pred_score = probs[0, pred_id].item()

id2label = getattr(model.config, "id2label", {}) or {}
print("Predicted class:", id2label.get(pred_id, pred_id), f"(p={pred_score:.3f})")
```
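For whole slides, per-tile predictions usually need to be aggregated into a slide-level call. A minimal sketch that averages tile softmax probabilities follows; the aggregation rule is an assumption, as this card does not specify one:

```python
import torch

# Toy tile logits for one slide: 5 tiles x 9 Gleason classes (dummy values).
tile_logits = torch.randn(5, 9)

# Average tile-level softmax probabilities, then take the argmax as the
# slide-level prediction. (Illustrative rule; other schemes, e.g. majority
# vote or attention pooling, are also common.)
tile_probs = tile_logits.softmax(dim=-1)
slide_probs = tile_probs.mean(dim=0)
slide_pred = int(slide_probs.argmax())
print("slide prediction:", slide_pred, "p =", float(slide_probs[slide_pred]))
```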
## Model Tree

- Base model: [facebook/dino-vitb16](https://huggingface.co/facebook/dino-vitb16)