MaskFormer (ResNet-50)

Cheng et al., 2021 — Per-Pixel Classification is Not All You Need for Semantic Segmentation (arXiv:2107.06278)

Lucid port of facebook/maskformer-resnet50-ade, converted to Lucid-native safetensors.

Available weights

Tag	mIoU	Params	GFLOPs	Size	Source
`ADE20K` (default)	44.5	41.3M	—	157.85 MB	facebook
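For readers unfamiliar with the metric, mIoU is the per-class intersection-over-union averaged over classes, reported above as a percentage. The following is a minimal NumPy sketch of that formula on a single toy prediction. The function name `mean_iou` and the `ignore_index=255` convention are illustrative assumptions, not part of the Lucid API, and a benchmark score would accumulate the confusion matrix over the whole validation set before averaging.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    # Drop pixels labelled with the (assumed) ignore index.
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # confusion[i, j] = number of pixels with ground truth i predicted as j
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes**2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection

    # Skip classes that never occur, to avoid dividing by zero.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


# Toy 2x3 label maps with three classes; one pixel is ignored.
pred = np.array([[0, 0, 1], [1, 2, 2]])
target = np.array([[0, 1, 1], [1, 2, 255]])
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}")
```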

Usage

import lucid.models as models
from lucid.models.weights import MaskFormerResNet50Weights

# default tag
model = models.maskformer_resnet50(pretrained=True)

# explicit tag (enum or string)
model = models.maskformer_resnet50(weights=MaskFormerResNet50Weights.ADE20K)
model = models.maskformer_resnet50(pretrained="ADE20K")

# preprocessing travels with the weights
weights = MaskFormerResNet50Weights.ADE20K
preprocess = weights.transforms()
# `image` is an input image supplied by the caller; [None] adds the batch axis
out = model(preprocess(image)[None])
# SemanticSegmentationOutput: per-pixel class logits (B, C, H, W)
seg = out.logits.argmax(axis=1)  # (B, H, W) class indices
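
The per-pixel class map above comes from a mask-classification model, so it may help to see how per-query outputs can be merged into a (C, H, W) score map. The sketch below follows the semantic inference rule described in the MaskFormer paper (class probabilities without the "no object" column, multiplied by sigmoid mask probabilities and summed over queries). It uses plain NumPy on random arrays; the names `semantic_inference`, `class_logits`, and `mask_logits` are illustrative, and whether Lucid computes `out.logits` in exactly this way is an assumption.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def semantic_inference(class_logits, mask_logits):
    """Merge per-query predictions into per-pixel class scores.

    class_logits: (Q, C + 1), last column is the "no object" class.
    mask_logits:  (Q, H, W), one binary mask logit map per query.
    Returns an array of shape (C, H, W).
    """
    class_probs = softmax(class_logits, axis=-1)[:, :-1]  # drop "no object"
    mask_probs = sigmoid(mask_logits)
    # Sum over queries of p_q(c) * m_q[h, w].
    return np.einsum("qc,qhw->chw", class_probs, mask_probs)


# Random stand-ins for model outputs: 5 queries, 3 classes, 4x6 pixels.
rng = np.random.default_rng(0)
Q, C, H, W = 5, 3, 4, 6
scores = semantic_inference(
    rng.normal(size=(Q, C + 1)), rng.normal(size=(Q, H, W))
)
seg_map = scores.argmax(axis=0)  # (H, W) class indices
print(scores.shape, seg_map.shape)
```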

Conversion

Converted from facebook/maskformer-resnet50-ade via `python -m tools.convert_weights maskformer_resnet50 --tag ADE20K`. The key mapping and the numerical parity of the outputs were verified against the source checkpoint.
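
A numerical parity check of this kind usually compares the outputs of the source and ported models on the same input. The sketch below shows one way to do that with NumPy. The helper `check_parity` and its tolerances are hypothetical and are not taken from `tools.convert_weights`; the arrays are random stand-ins for real model outputs.

```python
import numpy as np


def check_parity(reference, ported, atol=1e-4, rtol=1e-4):
    """Return (within_tolerance, max_abs_diff) for two output arrays."""
    reference = np.asarray(reference, dtype=np.float64)
    ported = np.asarray(ported, dtype=np.float64)
    if reference.shape != ported.shape:
        raise ValueError(
            f"shape mismatch: {reference.shape} vs {ported.shape}"
        )
    max_abs_diff = float(np.abs(reference - ported).max())
    within = np.allclose(ported, reference, atol=atol, rtol=rtol)
    return within, max_abs_diff


# Stand-in logits of shape (B, C, H, W); the "ported" copy has a tiny offset.
rng = np.random.default_rng(0)
ref_logits = rng.normal(size=(1, 150, 8, 8)).astype(np.float32)
ported_logits = ref_logits + 1e-6

ok, diff = check_parity(ref_logits, ported_logits)
print(f"parity ok: {ok}, max abs diff: {diff:.2e}")
```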

License

Tagged `other`; the license is inherited from the original facebook/maskformer-resnet50-ade weights.

Citation

@inproceedings{cheng2021maskformer,
  title={Per-Pixel Classification is Not All You Need for Semantic Segmentation},
  author={Cheng, Bowen and Schwing, Alexander G. and Kirillov, Alexander},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2021}
}



Evaluation results

mIoU on ADE20K: 44.5 (self-reported)