MASS Base Checkpoint

This repository hosts mass_base.pth, the base checkpoint for MASS: Learning Generalizable 3D Medical Image Representations from Mask-Guided Self-Supervision.

MASS is a mask-guided self-supervised learning framework for 3D medical images. The released checkpoint was trained on the data used in our paper, using the Iris in-context segmentation architecture. Pretraining relies on automatically generated class-agnostic masks and uses no expert ground-truth annotations.

What This Checkpoint Is For

mass_base.pth can be used with the official MASS codebase for:

training-free in-context segmentation with reference image-mask examples;
initialization for downstream segmentation finetuning;
frozen-encoder or finetuned-encoder classification experiments.

This is a PyTorch checkpoint for the MASS/Iris architecture, not a standalone Transformers model. Please use it with the code release:

GitHub: https://github.com/Stanford-AIMI/MASS
Project page: https://yhygao.github.io/MASS_page/
Paper: https://arxiv.org/abs/2603.13660

Download

Using the Hugging Face CLI:

hf download StanfordAIMI/MASS mass_base.pth --local-dir checkpoints

Using Python:

from huggingface_hub import hf_hub_download

checkpoint_path = hf_hub_download("StanfordAIMI/MASS", "mass_base.pth")
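
Once downloaded, the file can be inspected before it is passed to the MASS scripts. The sketch below is only an illustration: `describe_checkpoint` and `load_and_describe` are hypothetical helper names, not part of the MASS code release, and the internal layout of mass_base.pth is not documented here, so the code simply lists whatever top-level entries it finds.

```python
from collections.abc import Mapping


def describe_checkpoint(obj, max_items=10):
    """Summarize the top level of a loaded checkpoint as (key, description) pairs."""
    if not isinstance(obj, Mapping):
        return [("<root>", type(obj).__name__)]
    rows = []
    for key, value in list(obj.items())[:max_items]:
        shape = getattr(value, "shape", None)
        if shape is not None:
            # Tensors and arrays: report the type and shape.
            desc = f"{type(value).__name__}{tuple(shape)}"
        elif isinstance(value, Mapping):
            # Nested dictionaries, e.g. a state dict stored under some key.
            desc = f"mapping with {len(value)} entries"
        else:
            desc = type(value).__name__
        rows.append((str(key), desc))
    return rows


def load_and_describe(path):
    """Load a checkpoint on CPU and summarize it (requires PyTorch)."""
    import torch  # imported lazily so describe_checkpoint works without PyTorch

    checkpoint = torch.load(path, map_location="cpu")
    return describe_checkpoint(checkpoint)


# Example usage, assuming checkpoint_path comes from hf_hub_download:
# for key, desc in load_and_describe(checkpoint_path):
#     print(key, desc)
```

Recent PyTorch releases default to `weights_only=True` in `torch.load`. If the checkpoint stores non-tensor objects, loading may require `weights_only=False`, which should only be used for files you trust.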

Raw NIfTI In-Context Inference

python inference.py \
  --checkpoint checkpoints/mass_base.pth \
  --test-image /path/to/test_image.nii.gz \
  --reference-image /path/to/reference_image.nii.gz \
  --reference-mask /path/to/reference_mask.nii.gz \
  --output outputs/test_image_seg.nii.gz \
  --gpu 0 \
  --use-ema \
  --modality ct \
  --orientation RAS \
  --target-spacing 1.5 1.5 1.5 \
  --window-size 128 128 128 \
  --overlap 0.5
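
The `--target-spacing`, `--window-size`, and `--overlap` options suggest resampling followed by sliding-window inference. How inference.py implements this is not described here. The sketch below assumes a common scheme (stride = window size × (1 − overlap), with the last window clamped to the volume border) purely to give a feel for how many windows a volume produces. All function names are hypothetical.

```python
import math


def resampled_shape(shape, spacing, target_spacing):
    """Voxel grid size after resampling, keeping the physical extent fixed."""
    return tuple(
        max(1, round(n * s / t)) for n, s, t in zip(shape, spacing, target_spacing)
    )


def window_starts(size, window, overlap):
    """Start indices of overlapping windows covering [0, size) along one axis."""
    if size <= window:
        return [0]  # a single window; the volume would need padding
    stride = max(1, round(window * (1.0 - overlap)))
    count = math.ceil((size - window) / stride) + 1
    # Clamp the last start so the final window ends exactly at the border.
    return [min(i * stride, size - window) for i in range(count)]


def count_windows(shape, window_size, overlap):
    """Total number of windows for a 3D volume."""
    total = 1
    for size, window in zip(shape, window_size):
        total *= len(window_starts(size, window, overlap))
    return total


shape = resampled_shape((512, 512, 300), (0.75, 0.75, 1.0), (1.5, 1.5, 1.5))
print(shape, count_windows(shape, (128, 128, 128), 0.5))
```

Under these assumptions, a 512 × 512 × 300 CT volume at 0.75 × 0.75 × 1.0 mm resamples to 256 × 256 × 200 voxels and yields 3 × 3 × 3 = 27 windows.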

Please make sure the input NIfTI metadata is complete and reliable, especially orientation and spacing. mass_base.pth was trained after standardizing images to RAS orientation, so using --orientation RAS is recommended.
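
A quick sanity check of the affine matrix can catch missing or implausible metadata before inference. The following is a minimal sketch using only the standard library. It takes the 4 × 4 affine as nested lists, derives the spacing from the column norms, and approximates the axis codes from the dominant direction of each column. The function names and the 10 mm spacing threshold are assumptions made for this example. In practice, nibabel's `aff2axcodes` and `header.get_zooms()` provide the same information for a loaded image.

```python
import math

# (negative, positive) direction labels for world axes x, y, z in RAS+ convention.
AXIS_LABELS = (("L", "R"), ("P", "A"), ("I", "S"))
WORLD_AXIS = {"L": 0, "R": 0, "P": 1, "A": 1, "I": 2, "S": 2}


def spacing_from_affine(affine):
    """Voxel spacing: Euclidean norm of each of the first three affine columns."""
    return tuple(
        math.sqrt(sum(affine[r][c] ** 2 for r in range(3))) for c in range(3)
    )


def axcodes_from_affine(affine):
    """Approximate axis codes from the dominant world direction of each voxel axis."""
    codes = []
    for c in range(3):
        column = [affine[r][c] for r in range(3)]
        r = max(range(3), key=lambda i: abs(column[i]))
        codes.append(AXIS_LABELS[r][1] if column[r] > 0 else AXIS_LABELS[r][0])
    return "".join(codes)


def check_metadata(affine, max_spacing_mm=10.0):
    """Return a list of human-readable warnings; an empty list means nothing suspicious."""
    issues = []
    for axis, s in enumerate(spacing_from_affine(affine)):
        if not math.isfinite(s) or s <= 0:
            issues.append(f"axis {axis}: invalid spacing {s}")
        elif s > max_spacing_mm:
            issues.append(f"axis {axis}: unusually large spacing {s:.2f} mm")
    codes = axcodes_from_affine(affine)
    if len({WORLD_AXIS[c] for c in codes}) != 3:
        issues.append(f"axis codes {codes} do not cover all three world axes")
    return issues


lps_affine = [[-0.7, 0, 0, 0], [0, -0.7, 0, 0], [0, 0, 2.5, 0], [0, 0, 0, 1]]
print(axcodes_from_affine(lps_affine), check_metadata(lps_affine))
```

An image stored in a non-RAS orientation is not flagged here, since the inference command reorients via `--orientation RAS`. The check only targets metadata that looks missing or degenerate.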

Downstream Segmentation Finetuning

python train.py \
  --config config/downstream/segmentation_finetune_example.yaml \
  --gpu 0 \
  --name segmentation_finetune_example \
  --override \
    finetuning.pretrained_checkpoint=checkpoints/mass_base.pth \
    data.train.data_root=/path/to/mass_h5 \
    data.val.data_root=/path/to/mass_h5 \
    data.train.datasets='[example_segmentation]' \
    data.val.datasets='[example_segmentation]'

Classification Linear Probing

python train.py \
  --config config/downstream/classification_linear_probe_example.yaml \
  --gpu 0 \
  --name classification_linear_probe_example \
  --override \
    classification.encoder.pretrained_checkpoint=checkpoints/mass_base.pth \
    classification.num_classes=2 \
    data.train.data_root=/path/to/classification_data \
    data.val.data_root=/path/to/classification_data \
    data.train.datasets='[example_classification]' \
    data.val.datasets='[example_classification]'
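
The `--override` arguments in the commands above are dotted `key=value` pairs. How train.py parses them is not shown here. The sketch below is a hypothetical illustration of one common interpretation, in which each dotted key addresses a field in a nested configuration. `parse_value` and `apply_overrides` are names made up for this example and are not taken from the MASS codebase.

```python
def parse_value(text):
    """Parse a bracketed list or an integer; leave everything else as a string."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        inner = text[1:-1].strip()
        return [parse_value(part) for part in inner.split(",")] if inner else []
    try:
        return int(text)
    except ValueError:
        return text


def apply_overrides(config, overrides):
    """Write each 'a.b.c=value' override into the nested dict config."""
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for key in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(key, {})
        node[leaf] = parse_value(raw_value)
    return config


config = apply_overrides(
    {},
    [
        "classification.encoder.pretrained_checkpoint=checkpoints/mass_base.pth",
        "classification.num_classes=2",
        "data.train.datasets=[example_classification]",
    ],
)
print(config)
```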

Training Details

Architecture: Iris in-context segmentation architecture.
Pretraining objective: MASS mask-guided self-supervised learning.
Supervision during pretraining: automatically generated class-agnostic masks.
Expert annotations during pretraining: none.
Modalities: 3D CT, MRI, and PET volumes used in the MASS paper.

The MASS objective is compatible with other in-context segmentation architectures. The official codebase includes preprocessing and pretraining utilities for training MASS on your own data.

Limitations

This checkpoint is intended for research use.
It is not a medical device and should not be used for clinical decision-making.
Raw NIfTI inference depends on reliable image metadata and preprocessing choices. Cases with missing or incorrect spacing/orientation metadata should be inspected carefully.
Task-specific finetuning or validation is recommended before using the model on a new dataset or anatomy.

Citation

@article{gao2026learning,
  title={Learning Generalizable 3D Medical Image Representations from Mask-Guided Self-Supervision},
  author={Gao, Yunhe and Zhang, Yabin and Wang, Chong and Liu, Jiaming and Varma, Maya and Delbrouck, Jean-Benoit and Chaudhari, Akshay and Langlotz, Curtis},
  journal={arXiv preprint arXiv:2603.13660},
  year={2026}
}

