CoralBay: A Self-Supervised CT Foundation Model


Quick Highlights

  • True 3D CT: Moves beyond 2D slice models to understand full volumetric anatomy and spatial relationships.
  • Data Efficient: Top-tier classification and segmentation performance from just 11K unlabeled CT scans.
  • One Model, Many Tasks: From fine-grained analysis to global understanding and downstream applications.
  • Open & Ready for Impact: Open source, robust across CT tasks, and built to accelerate the future of medical AI.

Quick Start

import urllib.request
import torch

from monai import transforms, inferers
from transformers import AutoModel

# 1. Download sample CT scan
url = "https://github.com/neurolabusc/niivue-images/raw/refs/heads/main/CT_Abdo.nii.gz"
urllib.request.urlretrieve(url, "CT_Abdo.nii.gz")

# 2. Preprocess volume
preprocess = transforms.Compose([
    transforms.LoadImage(image_only=True),
    transforms.EnsureChannelFirst(),

    transforms.Spacing(
        pixdim=(1.5, 1.5, 1.5),
        mode="bilinear"
    ),

    # dev-only crop for speed
    transforms.CenterSpatialCrop(roi_size=192),

    # intensity window in Hounsfield units (HU); any (a_min, a_max) window will do
    transforms.ScaleIntensityRange(
        a_min=-1000, a_max=1000,
        b_min=0.0, b_max=1.0,
        clip=True,
    ),
])

x = preprocess("CT_Abdo.nii.gz").float().unsqueeze(0)  # (1, 1, D, H, W)

# 3. Load model
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModel.from_pretrained(
    "kaiko-ai-user/coralbay",
    trust_remote_code=True,
    out_indices=None,   # None returns the aggregated feature vector
    # out_indices=6,    # use 6 instead for the multi-scale feature pyramid
).eval().to(device)

# 4. Attach sliding-window inference
model.encoder._inferer = inferers.SlidingWindowInferer(
    roi_size=(96, 96, 96),
    sw_batch_size=2,
    overlap=0.0,        # fastest setting; use 0.75 for better results
    mode="gaussian",
    sw_device=device,   # run windows on GPU
    device="cpu",       # stitch results on CPU
)

# 5. Run inference
with torch.no_grad():
    features = model(x.to(device))

# 6. Output
# for `out_indices=6`:
#   features is a multi-scale feature pyramid:
#       [0] (1, 192, 96, 96, 96)
#       [1] (1, 192, 48, 48, 48)
#       [2] (1, 384, 24, 24, 24)
#       [3] (1, 768, 12, 12, 12)
#       [4] (1, 1536, 6, 6, 6)
#       [5] (1, 3072, 3, 3, 3)
# for `out_indices=None`:
#   features is a tensor of shape (1, 4608)
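
The intensity step in the preprocessing pipeline maps a Hounsfield-unit window onto [0, 1] and clips everything outside it. The plain-Python sketch below illustrates that per-voxel mapping for the window used above. The function name `scale_intensity_range` is hypothetical and written only for illustration; it is not the MONAI implementation, although it is meant to follow the same linear formula.

```python
def scale_intensity_range(value, a_min, a_max, b_min=0.0, b_max=1.0, clip=True):
    """Linearly map [a_min, a_max] onto [b_min, b_max], optionally clipping."""
    scaled = (value - a_min) / (a_max - a_min) * (b_max - b_min) + b_min
    if clip:
        # Values outside the window saturate at the output bounds.
        scaled = min(max(scaled, b_min), b_max)
    return scaled


# Window from the Quick Start: -1000 HU to 1000 HU.
for hu in (-1000, 0, 1000, 3000):
    print(hu, scale_intensity_range(hu, a_min=-1000, a_max=1000))
```

With `clip=True`, a voxel at 3000 HU saturates at 1.0, so a narrower window trades dynamic range for contrast inside the window.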

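The `overlap` setting in step 4 trades speed for quality because it controls how many windows the model has to process. The sketch below estimates that count for the 192-voxel crop and 96-voxel windows used above. The helper names are hypothetical, and the placement rule (stride of `roi_size * (1 - overlap)`, rounded down) is an assumption about how the sliding-window inferer tiles the volume, so treat the numbers as an estimate rather than a guarantee.

```python
import math


def windows_per_axis(image_size, roi_size, overlap):
    """Estimate how many windows are needed along one axis."""
    if image_size <= roi_size:
        return 1  # a single (possibly padded) window covers the axis
    interval = max(int(roi_size * (1 - overlap)), 1)  # assumed stride between windows
    return math.ceil((image_size - roi_size) / interval) + 1


def total_windows(volume_shape, roi_shape, overlap):
    """Multiply the per-axis window counts over all spatial axes."""
    return math.prod(
        windows_per_axis(size, roi, overlap)
        for size, roi in zip(volume_shape, roi_shape)
    )


volume, roi = (192, 192, 192), (96, 96, 96)
for overlap in (0.0, 0.75):
    n = total_windows(volume, roi, overlap)
    batches = math.ceil(n / 2)  # sw_batch_size=2, as in the Quick Start
    print(f"overlap={overlap}: {n} windows, {batches} forward passes")
```

Under these assumptions, moving from `overlap=0.0` to `overlap=0.75` raises the window count from 8 to 125 for this crop, which is why the Quick Start defaults to no overlap.
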
Quantitative Performance

Classification tasks are scored with multiclass accuracy or binary AUROC, and segmentation tasks with the Dice score. All results are evaluated with the eva framework.

Radiology Leaderboard
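
For readers less familiar with the metrics named in this section, the sketch below computes a Dice score and a multiclass accuracy on toy data. It is a minimal plain-Python illustration with hypothetical helper names. How the eva framework averages these metrics over classes or cases is not stated here, so this should not be read as its implementation.

```python
def dice_score(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two flat binary masks (0/1 values)."""
    intersection = sum(p and t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def multiclass_accuracy(pred_labels, true_labels):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(p == t for p, t in zip(pred_labels, true_labels))
    return correct / len(true_labels)


# Toy segmentation masks, flattened to 1D.
print(dice_score([1, 1, 0, 0], [1, 0, 1, 0]))

# Toy class predictions for four scans.
print(multiclass_accuracy([0, 1, 2, 2], [0, 1, 1, 2]))
```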

Citation

If you use this model, please cite it as follows:

@misc{gatopoulos2026coralbayselfsupervisedctfoundation,
      title={CoralBay: A Self-Supervised CT Foundation Model}, 
      author={Ioannis Gatopoulos and Nicolas Känzig and Sebastian Otálora and Fei Tang},
      year={2026},
      eprint={2606.03888},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2606.03888}, 
}

Model size: 0.8B params (Safetensors; tensor types I64, F32)