# Mamba-Segmentation

**Controlled Visual State-Space Backbone Benchmark with Domain-Shift & Boundary Analysis for Remote-Sensing Segmentation**

Accepted at IGARSS 2026.

One pipeline. One decoder. One loss. One schedule. Five backbone families. The only variable is the encoder, so the results finally mean something.
## What Is This?
Remote-sensing segmentation papers routinely change the backbone, the decoder, the loss, and the training schedule all at once. The numbers then tell you who tuned harder, not which backbone is better.

This repo fixes that: one shared pipeline where only the backbone is swapped, so differences in the numbers are attributable to the encoder.
| Component | Status |
|---|---|
| Encoder backbone | 🔀 Swapped per experiment — the ONLY variable |
| Decoder | 🔒 Fixed (lightweight U-Net, 256ch, MambaBlock2d) |
| Loss | 🔒 Fixed (Lovász-Softmax + Focal + Boundary) |
| Training schedule | 🔒 Fixed (50k iters, AdamW, poly LR decay) |
| Augmentations | 🔒 Fixed (random crop, flip, colour jitter) |
| Input resolution | 🔒 Fixed (512×512) |
| Feature interface | 🔒 Fixed ({F1–F4} at strides {4, 8, 16, 32}) |
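The fixed feature interface is the contract every backbone must satisfy: four feature maps {F1–F4} at strides {4, 8, 16, 32} relative to the 512×512 input, which is what lets the decoder stay byte-for-byte identical across families. A minimal sketch of that contract (the class name, stage design, and channel widths here are illustrative, not the repo's actual code):

```python
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Illustrative stand-in for any backbone in the benchmark: it must
    return four feature maps {F1..F4} at strides {4, 8, 16, 32}."""

    def __init__(self, dims=(64, 128, 256, 512)):
        super().__init__()
        self.stages = nn.ModuleList()
        in_ch = 3
        for i, out_ch in enumerate(dims):
            stride = 4 if i == 0 else 2  # stride 4, then 2x per stage -> 4/8/16/32
            self.stages.append(nn.Conv2d(in_ch, out_ch, kernel_size=stride, stride=stride))
            in_ch = out_ch

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # [F1 @ s4, F2 @ s8, F3 @ s16, F4 @ s32]

x = torch.randn(1, 3, 512, 512)
feats = ToyEncoder()(x)
print([tuple(f.shape[-2:]) for f in feats])  # [(128, 128), (64, 64), (32, 32), (16, 16)]
```

Any encoder emitting this pyramid plugs into the shared decoder unchanged; nothing else about the pipeline moves.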
## Checkpoints in This Repository
All checkpoints are best.pth files (highest validation mIoU during training) stored with their original directory structure.
### LoveDA Experiments — Comparison_Experiments/
#### MambaVision (NVIDIA hybrid Mamba-Transformer)
| Checkpoint path | Training split |
|---|---|
| Comparison_Experiments/mambavision_tiny_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/mambavision_tiny_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/mambavision_tiny_urbantrain_512/checkpoints/best.pth | Urban→Rural |
| Comparison_Experiments/mambavision_tiny2_512/checkpoints/best.pth | All→All (v2) |
| Comparison_Experiments/mambavision_tiny2_ruraltrain_512/checkpoints/best.pth | Rural→Urban (v2) |
| Comparison_Experiments/mambavision_tiny2_urbantrain_512/checkpoints/best.pth | Urban→Rural (v2) |
| Comparison_Experiments/mambavision_small_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/mambavision_small_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/mambavision_small_urbantrain_512/checkpoints/best.pth | Urban→Rural |
| Comparison_Experiments/mambavision_base_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/mambavision_base_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/mambavision_base_urbantrain_512/checkpoints/best.pth | Urban→Rural |
| Comparison_Experiments/mambavision_large_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/mambavision_large_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/mambavision_large_urbantrain_512/checkpoints/best.pth | Urban→Rural |
| Comparison_Experiments/mambavision_large2_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/mambavision_large2_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/mambavision_large2_urbantrain_512/checkpoints/best.pth | Urban→Rural |
#### VMamba (cross-scan 2D selective SSM)
| Checkpoint path | Training split |
|---|---|
| Comparison_Experiments/Vmamb_tiny_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/vmamba_tiny_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/vmamba_tiny_urbantrain_512/checkpoints/best.pth | Urban→Rural |
| Comparison_Experiments/Vmamb_small_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/Vmamb_small_512_2/checkpoints/best.pth | All→All (run 2) |
| Comparison_Experiments/Vmamb_small_512_3/checkpoints/best.pth | All→All (run 3) |
| Comparison_Experiments/vmamba_small_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/vmamba_small_urbantrain_512/checkpoints/best.pth | Urban→Rural |
| Comparison_Experiments/Vmamb_base_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/vmamba_base_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/vmamba_base_urbantrain_512/checkpoints/best.pth | Urban→Rural |
#### VisionMamba / Vim (bidirectional Mamba)
| Checkpoint path | Training split |
|---|---|
| Comparison_Experiments/VisionMamba_tiny_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/visionmamba_tiny_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/visionmamba_tiny_urbantrain_512/checkpoints/best.pth | Urban→Rural |
| Comparison_Experiments/VisionMamba_small_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/visionmamba_small_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/visionmamba_small_urbantrain_512/checkpoints/best.pth | Urban→Rural |
| Comparison_Experiments/VisionMamba_base_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/visionmamba_base_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/visionmamba_base_urbantrain_512/checkpoints/best.pth | Urban→Rural |
#### Spatial-Mamba (spatially-aware SSM)
| Checkpoint path | Training split |
|---|---|
| Comparison_Experiments/spatialmamba_tiny_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/spatialmamba_tiny_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/spatialmamba_tiny_urbantrain_512/checkpoints/best.pth | Urban→Rural |
| Comparison_Experiments/spatialmamba_small_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/spatialmamba_small_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/spatialmamba_small_urbantrain_512/checkpoints/best.pth | Urban→Rural |
| Comparison_Experiments/spatialmamba_base_512/checkpoints/best.pth | All→All |
| Comparison_Experiments/spatialmamba_base_ruraltrain_512/checkpoints/best.pth | Rural→Urban |
| Comparison_Experiments/spatialmamba_base_urbantrain_512/checkpoints/best.pth | Urban→Rural |
#### CNN & Transformer Baselines
| Checkpoint path | Model |
|---|---|
| Comparison_Experiments/cnn_deeplabv3p_r50_512/checkpoints/best.pth | DeepLabv3+ ResNet-50, All→All |
| Comparison_Experiments/cnn_deeplabv3p_resnet50_ruraltrain_512/checkpoints/best.pth | DeepLabv3+ ResNet-50, Rural→Urban |
| Comparison_Experiments/cnn_deeplabv3p_resnet50_urbantrain_512/checkpoints/best.pth | DeepLabv3+ ResNet-50, Urban→Rural |
| Comparison_Experiments/cnn_unet_r50_512/checkpoints/best.pth | U-Net ResNet-50, All→All |
| Comparison_Experiments/transformer_unetformer_r18_512/checkpoints/best.pth | UNetFormer ResNet-18, All→All |
| Comparison_Experiments/transformerunetformer_resnet18_ruraltrain_512/checkpoints/best.pth | UNetFormer ResNet-18, Rural→Urban |
| Comparison_Experiments/transformerunetformer_resnet18_urbantrain_512/checkpoints/best.pth | UNetFormer ResNet-18, Urban→Rural |
### ISPRS Potsdam Experiments — Comparison_Experiments_ICPRS_potsdam/
| Checkpoint path | Model |
|---|---|
| Comparison_Experiments_ICPRS_potsdam/mambavision_tiny_512/checkpoints/best.pth | MambaVision-Tiny |
| Comparison_Experiments_ICPRS_potsdam/mambavision_tiny2_512/checkpoints/best.pth | MambaVision-Tiny2 |
| Comparison_Experiments_ICPRS_potsdam/mambavision_small_512/checkpoints/best.pth | MambaVision-Small |
| Comparison_Experiments_ICPRS_potsdam/mambavision_base_512/checkpoints/best.pth | MambaVision-Base |
| Comparison_Experiments_ICPRS_potsdam/mambavision_large_512/checkpoints/best.pth | MambaVision-Large |
| Comparison_Experiments_ICPRS_potsdam/mambavision_large2_512/checkpoints/best.pth | MambaVision-Large2 |
| Comparison_Experiments_ICPRS_potsdam/vmamba_tiny_512/checkpoints/best.pth | VMamba-Tiny |
| Comparison_Experiments_ICPRS_potsdam/vmamba_small_512/checkpoints/best.pth | VMamba-Small |
| Comparison_Experiments_ICPRS_potsdam/vmamba_base_512/checkpoints/best.pth | VMamba-Base |
| Comparison_Experiments_ICPRS_potsdam/spatialmamba_tiny_512/checkpoints/best.pth | Spatial-Mamba-Tiny |
| Comparison_Experiments_ICPRS_potsdam/spatialmamba_small_512/checkpoints/best.pth | Spatial-Mamba-Small |
| Comparison_Experiments_ICPRS_potsdam/spatialmamba_base_512/checkpoints/best.pth | Spatial-Mamba-Base |
| Comparison_Experiments_ICPRS_potsdam/cnn_deeplabv3p_r50_512/checkpoints/best.pth | DeepLabv3+ ResNet-50 |
| Comparison_Experiments_ICPRS_potsdam/transformer_unetformer_r18_512/checkpoints/best.pth | UNetFormer ResNet-18 |
### ImageNet Backbone Weights — weights/imagenet/
| File | Description |
|---|---|
| weights/imagenet/resnet50-11ad3fa6.pth | ResNet-50 ImageNet-1K pretrained |
| weights/imagenet/resnet18-f37072fd.pth | ResNet-18 ImageNet-1K pretrained |
## Results Summary
Every row shares the same decoder, loss, optimizer, schedule, and data splits. The only variable is the encoder.
### LoveDA
| Backbone | mIoU (All→All) | mIoU (U→R) | mIoU (R→U) |
|---|---|---|---|
| DeepLabv3+ ResNet-50 (CNN) | 43.01 | 30.36 | 39.98 |
| UNetFormer ResNet-18 (Transformer) | 48.61 | 34.56 | 44.84 |
| VMamba-Small 🥇 | 55.66 | 40.62 | 53.52 |
| MambaVision-Large | 55.25 | 38.53 | 54.01 |
| Spatial-Mamba-Base | 48.03 | 35.23 | 46.55 |
### ISPRS Potsdam
| Backbone | mIoU |
|---|---|
| DeepLabv3+ ResNet-50 | 75.09 |
| UNetFormer ResNet-18 | 74.99 |
| VMamba-Small 🥇 | 77.59 |
| MambaVision-Large | 77.07 |
| Spatial-Mamba-Base | 70.00 |
Key findings:
- SSMs outperform CNNs and Transformers by a significant margin under identical conditions (+7–12 mIoU on LoveDA).
- Scaling the encoder past VMamba-Small yields diminishing returns under a fixed decoder.
- Domain transfer is asymmetric across all backbone families (Rural→Urban consistently outperforms Urban→Rural by 10–15 points) — a data distribution property, not a model property.
- Boundary accuracy collapses under domain shift while interior accuracy holds — every backbone, every family.
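The boundary finding can be probed with a simple boundary-band accuracy: restrict evaluation to a thin band around ground-truth label changes, so interior pixels cannot mask boundary errors. A hedged NumPy sketch (the paper's exact boundary metric may differ; this is a common shift-based variant with illustrative function names):

```python
import numpy as np

def boundary_band(seg, width=1):
    """Mark pixels within `width` of a 4-neighbour label change, via array shifts."""
    edge = np.zeros(seg.shape, dtype=bool)
    edge[:-1, :] |= seg[:-1, :] != seg[1:, :]
    edge[1:, :]  |= seg[:-1, :] != seg[1:, :]
    edge[:, :-1] |= seg[:, :-1] != seg[:, 1:]
    edge[:, 1:]  |= seg[:, :-1] != seg[:, 1:]
    for _ in range(width - 1):          # crude dilation by repeated shifting
        grown = edge.copy()
        grown[:-1, :] |= edge[1:, :]
        grown[1:, :]  |= edge[:-1, :]
        grown[:, :-1] |= edge[:, 1:]
        grown[:, 1:]  |= edge[:, :-1]
        edge = grown
    return edge

def boundary_accuracy(pred, gt, width=1):
    """Pixel accuracy computed only inside the GT boundary band."""
    band = boundary_band(gt, width)
    return np.sum((pred == gt) & band) / max(band.sum(), 1)

gt   = np.zeros((8, 8), dtype=int); gt[:, 4:] = 1
pred = np.zeros((8, 8), dtype=int); pred[:, 5:] = 1   # boundary misplaced by one pixel
print(boundary_accuracy(pred, gt))  # 0.5: half the band pixels are mislabelled
```

Comparing this quantity in-domain versus cross-domain, alongside whole-image accuracy, is what surfaces the interior-holds/boundary-collapses pattern.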
## How to Load a Checkpoint
```python
import torch

# Example: load the MambaVision-Base best checkpoint for LoveDA All→All
ckpt = torch.load(
    "Comparison_Experiments/mambavision_base_512/checkpoints/best.pth",
    map_location="cpu",
)

# Checkpoint keys: 'model', 'optimizer', 'scheduler', 'iter', 'best_score'
model_state = ckpt["model"]
```
To build the full model and run inference, clone the code repository and follow the setup instructions:
```bash
git clone https://github.com/dineth18/Mamba-Segmentation
cd Mamba-Segmentation/MambaVision  # or VMamba/, spatial-mamba/, etc.
pip install -r requirements.txt

# Set your dataset paths (no need to edit config files)
export LOVEDA_ROOT=/path/to/LoveDA
export POTSDAM_ROOT=/path/to/ISPRS_Potsdam

python eval.py --checkpoint path/to/best.pth
```
## Citation
If this benchmark is useful for your research, please cite:
```bibtex
@article{wasalathilaka2026controlledbenchmark,
  title   = {A Controlled Benchmark of Visual State-Space Backbones with
             Domain-Shift and Boundary Analysis for Remote-Sensing Segmentation},
  author  = {Wasalathilaka, Nichula and Perea, Dineth and Samarakoon, Oshadha
             and Wijenayake, Buddhi and Godaliyadda, Roshan and Herath, Vijitha
             and Ekanayake, Parakrama},
  journal = {IGARSS 2026},
  year    = {2026}
}
```
## Acknowledgements
- VMamba — Visual State Space Model
- MambaVision — NVIDIA hybrid Mamba-Transformer
- Spatial-Mamba — Spatially-aware Mamba
- LoveDA — Land-cover domain adaptation dataset
- ISPRS Potsdam — Urban semantic labeling benchmark
Built at the University of Peradeniya.