Cheng et al., 2021 — Per-Pixel Classification is Not All You Need for Semantic Segmentation (arXiv:2107.06278)
Lucid port of `facebook/maskformer-resnet50-ade`, converted to Lucid-native safetensors.
| Tag | mIoU | Params | GFLOPs | Size | Source |
|---|---|---|---|---|---|
| ADE20K (default) | 44.5 | 41.3M | — | 157.85 MB | |
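For readers unfamiliar with the mIoU column, the metric can be sketched in NumPy as below. This is an illustrative sketch, not the evaluation code behind the reported number: the helper name `mean_iou` and the ignore label `255` are assumptions, and benchmark evaluation normally accumulates the confusion matrix over the whole validation set rather than per image.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes present in the prediction or the target.

    `pred` and `target` are integer arrays of identical shape.
    `ignore_index` is an assumed label for pixels excluded from scoring.
    """
    pred = np.asarray(pred).astype(np.int64)
    target = np.asarray(target).astype(np.int64)
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both pred and target
    return float((intersection[present] / union[present]).mean())
```

Multiplying the result by 100 gives a percentage on the same scale as the table.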
```python
import lucid.models as models
from lucid.models.weights import MaskFormerResNet50Weights

# default tag
model = models.maskformer_resnet50(pretrained=True)

# explicit tag (enum or string)
model = models.maskformer_resnet50(weights=MaskFormerResNet50Weights.ADE20K)
model = models.maskformer_resnet50(pretrained="ADE20K")

# preprocessing travels with the weights
weights = MaskFormerResNet50Weights.ADE20K
preprocess = weights.transforms()

# `image` is a single input image supplied by the caller (loading not shown);
# [None] adds the batch dimension
out = model(preprocess(image)[None])

# SemanticSegmentationOutput: per-pixel class logits (B, C, H, W)
seg = out.logits.argmax(axis=1)  # (B, H, W) class indices
```
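As a small follow-up to the `argmax` step, the sketch below summarizes a predicted map by the fraction of pixels assigned to each class. It uses a random NumPy array as a stand-in for `out.logits`, so it runs without the model. The helper `class_coverage` is hypothetical, and both the 150-class count for ADE20K and the conversion of `out.logits` to a NumPy array are assumptions about this port.

```python
import numpy as np

NUM_CLASSES = 150  # ADE20K semantic classes; assumed to match the C dimension

def class_coverage(seg, num_classes=NUM_CLASSES):
    """Return {class_index: fraction of pixels} for classes present in one (H, W) map."""
    counts = np.bincount(seg.ravel(), minlength=num_classes)
    return {int(c): float(counts[c]) / seg.size for c in np.flatnonzero(counts)}

# Random stand-in for out.logits with shape (B, C, H, W).
rng = np.random.default_rng(0)
logits = rng.standard_normal((1, NUM_CLASSES, 4, 6))

seg = logits.argmax(axis=1)        # (B, H, W) class indices
coverage = class_coverage(seg[0])  # fractions over the first image in the batch
```

The fractions in `coverage` sum to 1, which makes it a quick sanity check that every pixel received exactly one label.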
Converted from `facebook/maskformer-resnet50-ade` via `python -m tools.convert_weights maskformer_resnet50 --tag ADE20K`. Key mapping and numerical parity were verified against the source.
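The two checks named above can be sketched as follows. This is not the code used by `tools.convert_weights`. The helpers `check_key_mapping` and `parity_report`, the `rename` callable, the example key names, and the tolerances are all hypothetical and chosen only for illustration.

```python
import numpy as np

def check_key_mapping(source_keys, target_keys, rename):
    """Compare renamed source keys against the target model's keys.

    Returns (missing, unexpected): target keys that no source key maps to,
    and mapped source keys that the target does not contain.
    """
    mapped = {rename(k) for k in source_keys}
    target = set(target_keys)
    return sorted(target - mapped), sorted(mapped - target)

def parity_report(reference, candidate, atol=1e-4, rtol=1e-4):
    """Summarize how closely `candidate` outputs match `reference` outputs."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    return {
        "max_abs_diff": float(np.max(np.abs(reference - candidate))),
        "allclose": bool(np.allclose(candidate, reference, atol=atol, rtol=rtol)),
    }
```

In practice both models would be fed the same preprocessed input, and the report would be computed on their output logits.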
License: other, inherited from the original weights.
@inproceedings{cheng2021maskformer,
title={Per-Pixel Classification is Not All You Need for Semantic Segmentation},
author={Cheng, Bowen and Schwing, Alexander G. and Kirillov, Alexander},
booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
year={2021}
}