NKI-AI/tissue-bg-all-stains

A small tissue/background segmentation model for histopathology whole-slide images (WSIs). It is trained to be stain-agnostic ("all stains"), so the same checkpoint can produce tissue masks across H&E and a range of IHC stainings.

The model is consumed by the dcis-biomarkers pipeline (and other downstream workflows) as a one-shot preprocessing step: it produces the foreground masks that gate tile-based feature extraction so the encoder only sees real tissue rather than empty glass, ink, or scanner artefacts.

Status: This artefact ships ahead of a standalone public release of the inference engine (aifocore). The runtime code path needed to use the model is currently vendored inside dcis-biomarkers (the pure-Python aifocore.inference subtree). A standalone aifocore PyPI package will follow.

Model details


Property	Value
Architecture	MONAI U-Net (5-level encoder/decoder, 3×3 conv + ADN blocks)
Parameters	~7.94 M
Format	TorchScript JIT, packaged as a `.pack` (zip of `model.pt` + `model_config.xml`)
Inputs	RGB tiles, 1024×1024 px at 12.0 µm/px, no per-channel normalisation (raw `tile/255.0` is expected by the bundled inference engine)
Tile overlap	128 px on both axes, "crop" merge across overlaps
Outputs	2 logits per pixel → argmax → class index map
Classes	`0` = Background (blue, `#0000ff`); `1` = Tissue (gray, `#808080`)
Asset size	~32 MB (single `.pack`)
Built	2025-12-17
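
The tiling parameters in the table can be illustrated along a single axis. This is a minimal sketch, not the bundled engine's code. It assumes that the 128 px overlap is the total shared width between neighbouring tiles, that the last tile is shifted back to end at the image border, and that a "crop" merge discards overlap margins instead of blending them. The helper names `tile_origins` and `crop_intervals` are hypothetical.

```python
def tile_origins(length: int, tile: int = 1024, overlap: int = 128) -> list[int]:
    """Start offsets of tiles covering [0, length) with a fixed overlap."""
    if length <= tile:
        # A short axis gets a single tile; padding is left to the caller.
        return [0]
    stride = tile - overlap  # 896 px with the values from the table
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # assumption: last tile ends at the border
    return origins


def crop_intervals(origins: list[int], tile: int = 1024) -> list[tuple[int, int]]:
    """Half-open [lo, hi) span that each tile contributes under a crop merge."""
    # Cut every overlap at its midpoint so the kept spans partition the axis.
    cuts = [(a + tile + b) // 2 for a, b in zip(origins, origins[1:])]
    los = [origins[0]] + cuts
    his = cuts + [origins[-1] + tile]
    return list(zip(los, his))


origins = tile_origins(2000)
print(origins)                  # [0, 896, 976]
print(crop_intervals(origins))  # [(0, 960), (960, 1448), (1448, 2000)]
```

Applying the same computation to both axes gives the 2D tile grid. The real engine may treat borders differently, for example by padding instead of shifting the last tile.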

The relatively coarse inference MPP (12 µm/px) is deliberate: tissue/background is a low-frequency decision that doesn't benefit from cellular-scale detail, and running at coarse resolution keeps mask generation a small fraction of the cost of downstream feature extraction.
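
The cost argument can be made concrete with a little arithmetic. The sketch below uses a hypothetical native scan resolution of 0.5 µm/px purely for illustration; only the 12 µm/px and 1024 px values come from the table above.

```python
def pixel_reduction(native_mpp: float, inference_mpp: float = 12.0) -> float:
    """Factor by which the pixel count shrinks when resampling to inference_mpp."""
    linear = inference_mpp / native_mpp  # downsampling factor per axis
    return linear ** 2                   # area scales with the square


def tile_footprint_mm(tile_px: int = 1024, mpp: float = 12.0) -> float:
    """Physical edge length, in millimetres, covered by one inference tile."""
    return tile_px * mpp / 1000.0


# Hypothetical slide scanned at 0.5 µm/px:
print(pixel_reduction(0.5))   # 576.0, i.e. 576 times fewer pixels to process
print(tile_footprint_mm())    # about 12.29 mm of glass per tile edge
```

At this resolution a handful of tiles typically covers a whole slide, which is why mask generation stays cheap relative to feature extraction at native resolution.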

Intended use

Primary use: generate a binary tissue mask for every WSI in a cohort, to be consumed by tile-level pipelines that need to restrict sampling to foreground regions.
Out-of-scope: anything that needs sub-tissue semantic structure (epithelium vs. stroma, tumour vs. non-tumour, ROI classification, etc.). This model only separates tissue from non-tissue.
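
As an illustration of the primary use, a downstream pipeline might keep only those tiles whose mask window is mostly tissue. This is a sketch under stated assumptions, not the dcis-biomarkers implementation: the mask is assumed to be already loaded as a 2D NumPy array of class indices (1 = Tissue), the function names are hypothetical, and the 0.5 threshold is an arbitrary example value.

```python
import numpy as np


def tissue_fraction(mask: np.ndarray, y: int, x: int, size: int) -> float:
    """Fraction of pixels labelled Tissue (class 1) in a square mask window."""
    window = mask[y:y + size, x:x + size]
    return float((window == 1).mean()) if window.size else 0.0


def select_tiles(mask: np.ndarray, size: int, min_fraction: float = 0.5):
    """Top-left (y, x) corners, in mask pixels, of non-overlapping tissue tiles."""
    height, width = mask.shape
    return [
        (y, x)
        for y in range(0, height - size + 1, size)
        for x in range(0, width - size + 1, size)
        if tissue_fraction(mask, y, x, size) >= min_fraction
    ]


# Toy 4x4 mask with tissue only in the top-left quadrant.
demo = np.zeros((4, 4), dtype=np.uint8)
demo[:2, :2] = 1
print(select_tiles(demo, size=2))  # [(0, 0)]
```

The coordinates are in mask pixel space. Mapping them to slide coordinates requires scaling by the ratio between the mask resolution and the resolution at which tiles are extracted.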

How to use

The model is consumed via the inference engine bundled with dcis-biomarkers:

# 1. Pull the .pack from this repository.
python scripts/download_segmentation_assets.py \
  --repo-id NKI-AI/tissue-bg-all-stains \
  --output-dir <models_dir>

# 2. Generate masks for every WSI under a tree.
python scripts/generate_tissue_masks.py \
  --input-dir <wsi_root> \
  --output-dir <masks_root> \
  --model-path <models_dir>/tissue_background.pack

Output layout: <masks_root>/<rel>/<stem>.tiff (pyramidal multi-class TIFF) + <stem>.thumbnail.png (sidecar overlay for visual QC).
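
To locate the outputs for a given slide, the layout above can be reproduced with `pathlib`. The sketch assumes that `<rel>` is the slide's directory relative to `--input-dir` and that `<stem>` is the file name without its final extension; both are readings of the layout string, not confirmed behaviour of the script. The helper name `mask_paths` is hypothetical.

```python
from pathlib import Path


def mask_paths(wsi: Path, wsi_root: Path, masks_root: Path) -> tuple[Path, Path]:
    """Expected (mask TIFF, thumbnail PNG) paths for one slide."""
    rel_dir = wsi.relative_to(wsi_root).parent  # assumption: <rel> mirrors the input tree
    out_dir = masks_root / rel_dir
    return out_dir / f"{wsi.stem}.tiff", out_dir / f"{wsi.stem}.thumbnail.png"


mask, thumb = mask_paths(
    Path("/data/wsi/cohort_a/slide_01.svs"),
    Path("/data/wsi"),
    Path("/data/masks"),
)
print(mask)   # /data/masks/cohort_a/slide_01.tiff
print(thumb)  # /data/masks/cohort_a/slide_01.thumbnail.png
```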

Under the hood:

import torch
from aifocore.model_loader import ModelPackage
from aifocore.segmentation import ProcessImage

# Load the TorchScript model and its XML config from the .pack archive.
model_package = ModelPackage.from_zip(
    "tissue_background_all_stains.pack",
    device=torch.device("cuda"),
)

# Run inference on the slide, then write the mask TIFF and the QC thumbnail.
ProcessImage(
    model_package=model_package,
    image_file="slide.svs",
    output_file="mask.tiff",
    create_thumbnail=True,
).infer_and_save()
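
Because the `.pack` is a plain zip archive, its contents can be inspected with the standard library alone, without the inference engine. The sketch below assumes that `model.pt` and `model_config.xml` sit at the archive root; the XML schema is not documented here, so only the root element is read. The helper name `describe_pack` is hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile


def describe_pack(path: str) -> dict:
    """List archive members and summarise the root element of the XML config."""
    with zipfile.ZipFile(path) as pack:
        names = pack.namelist()
        # Assumption: the config is stored at the archive root under this name.
        root = ET.fromstring(pack.read("model_config.xml"))
    return {
        "members": sorted(names),
        "config_root": root.tag,
        "config_attrs": dict(root.attrib),
    }
```

Calling `describe_pack` on the downloaded `.pack` file is a quick way to confirm that the download is intact before running the full pipeline.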

Training data

Evaluation

Limitations and biases

"All stains" covers the stain mix in the training set (see Training data once filled in); applying the model to stains or preparations far outside that distribution may degrade quality. Visual QC of the thumbnail is recommended on any new cohort before relying on the masks downstream.
The model is purely RGB-pixel-driven and has no awareness of scanner-level metadata. Pen marks, coverslip artefacts, and out-of-focus regions are treated as ordinary input pixels and may be classified as tissue.
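
Visual QC can be prioritised with a simple automated pre-filter that flags slides whose overall tissue fraction looks implausible. This is only a sketch: the function name and both thresholds are hypothetical, the per-slide fractions are assumed to have been computed from the masks beforehand, and the check does not replace looking at the thumbnails.

```python
def flag_for_review(
    fractions: dict[str, float],
    low: float = 0.01,
    high: float = 0.95,
) -> list[str]:
    """Names of slides whose tissue fraction falls outside [low, high]."""
    return sorted(
        name for name, fraction in fractions.items()
        if fraction < low or fraction > high
    )


# Toy example: one nearly empty mask and one nearly full mask are flagged.
print(flag_for_review({"a": 0.30, "b": 0.0, "c": 0.99}))  # ['b', 'c']
```

A mask that is almost entirely tissue can indicate that pen marks or out-of-focus glass were classified as foreground, while an almost empty mask can indicate a stain outside the training distribution.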

Citation

Model authors and maintainers

Model author: Jonas Teuwen (model training and packaging)
Maintainer: NKI-AI (Netherlands Cancer Institute — AI for Oncology group)

License

Apache 2.0 — see the LICENSE of the consuming repository.
