ISDNet - Standalone PyTorch Implementation
ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-high Resolution Segmentation
CVPR 2022 | Paper
This is a standalone PyTorch implementation of ISDNet, without MMSegmentation dependencies, adapted from the official ISDNet repository.
Features
- Pure PyTorch training/inference pipeline (no MMSegmentation dependency; mmcv is used only for `ConvModule`)
- Multi-GPU training with DistributedDataParallel
- FLAIR French land cover dataset support (15 classes)
- Modern Python packaging with uv
- Modular code structure
Model Architecture
- Backbone: ResNet-18 (via timm)
- Shallow Path: STDC-like module
- Deep Path: ASPP with dilated convolutions
- Fusion: Feature pyramid with lateral connections
- Parameters: 17.76M
- FLOPs: 21.79G @ 512x512
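The parameter count above can be checked with a short helper. This is a sketch: `demo` is a tiny stand-in module, not the actual ISDNet model.

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    """Total number of trainable parameters."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Tiny stand-in module for illustration; the full ISDNet reports ~17.76M.
demo = nn.Sequential(nn.Conv2d(3, 64, kernel_size=3), nn.Linear(64, 15))
print(f"{count_parameters(demo) / 1e6:.2f}M parameters")
```

Applying the same helper to an instantiated `ISDNet` should reproduce the 17.76M figure.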
Installation
```sh
# With uv (recommended)
uv sync

# Or with pip
pip install -e .
```
Dependencies
- Python >= 3.10
- PyTorch >= 2.0
- timm >= 0.9
- mmcv >= 2.0 (for ConvModule only)
Project Structure
```
isdnet/
├── __init__.py
├── config.py            # Training configuration
├── models/
│   ├── __init__.py
│   ├── isdnet.py        # Main ISDNet model
│   ├── modules.py       # STDC blocks, Laplacian pyramid
│   └── heads.py         # ASPP, ISDHead, RefineASPPHead
├── datasets/
│   ├── __init__.py
│   └── flair.py         # FLAIR dataset class
└── utils/
    ├── __init__.py
    └── distributed.py   # DDP utilities
train.py                 # Training script
inference.py             # Evaluation script
```
Training
Multi-GPU training on the FLAIR dataset:

```sh
uv run torchrun --nproc_per_node=4 train.py
```

Single GPU:

```sh
uv run python train.py
```
Inference
Evaluate on the test set:

```sh
uv run python inference.py --checkpoint isdnet_flair_best.pth
```

Evaluate on the validation split:

```sh
uv run python inference.py --checkpoint isdnet_flair_best.pth --split valid
```
Training configuration (in isdnet/config.py):
- Batch size: 16 per GPU (64 total)
- Learning rate: 1e-3
- Optimizer: SGD with momentum
- Scheduler: PolynomialLR
- Epochs: 80
- Crop size: 512x512
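The optimizer and scheduler settings above can be sketched as follows. This is illustrative, not the actual `train.py`: the momentum value and the polynomial power are assumptions, and `params` stands in for `model.parameters()`.

```python
import torch
from torch.optim.lr_scheduler import PolynomialLR

# Stand-in parameter list; in train.py this would be model.parameters().
params = [torch.nn.Parameter(torch.zeros(10))]

# lr, optimizer type, scheduler, and epoch count follow the configuration
# above; momentum=0.9 and power=1.0 are illustrative assumptions.
optimizer = torch.optim.SGD(params, lr=1e-3, momentum=0.9)
scheduler = PolynomialLR(optimizer, total_iters=80, power=1.0)

for epoch in range(80):
    # ... one training epoch over the FLAIR dataset ...
    scheduler.step()
```

With `power=1.0` the learning rate decays linearly from 1e-3 to 0 over the 80 epochs.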
Usage
```python
from isdnet import ISDNet, FLAIRDataset

# Create model
model = ISDNet(
    num_classes=15,
    backbone='resnet18',
    stdc_pretrain='STDCNet813M_73.91.tar',
).cuda()

# Training forward
outputs = model(images, return_loss=True)
# Returns: out, out_deep, out_aux16, out_aux8, aux_out, losses_re, losses_fa

# Inference forward
predictions = model(images, return_loss=False)
# Returns: (B, num_classes, H, W) logits
```
Results on FLAIR Dataset
| Metric | Value |
|---|---|
| Val mIoU | 59.82% |
| Test mIoU | 52.77% |
| Pixel Accuracy | 72.02% |
Per-class IoU (Test)
| Class | IoU |
|---|---|
| water | 81.6% |
| vineyard | 74.7% |
| building | 72.0% |
| deciduous | 66.4% |
| impervious | 66.5% |
| greenhouse | 61.3% |
| bare soil | 56.2% |
| coniferous | 55.1% |
| agricultural | 53.3% |
| snow | 51.9% |
| pervious | 49.0% |
| herbaceous | 46.0% |
| plowed land | 33.4% |
| brushwood | 24.0% |
| swimming_pool | 0.0% |
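The per-class IoU and mIoU figures above follow the standard confusion-matrix definition: IoU = TP / (TP + FP + FN) per class, averaged over classes. A minimal sketch (the 3-class `conf` matrix is a toy example, not FLAIR data):

```python
import numpy as np

def iou_per_class(conf: np.ndarray) -> np.ndarray:
    """Per-class IoU from a (C, C) confusion matrix (rows = ground truth)."""
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp   # predicted as class c, but wrong
    fn = conf.sum(axis=1) - tp   # pixels of class c that were missed
    denom = tp + fp + fn
    return np.where(denom > 0, tp / np.maximum(denom, 1), 0.0)

# Toy 3-class confusion matrix:
conf = np.array([[5, 1, 0],
                 [1, 4, 0],
                 [0, 0, 2]])
ious = iou_per_class(conf)
miou = ious.mean()   # mIoU ≈ 0.794 for this toy matrix
```

Classes absent from both prediction and ground truth get IoU 0 here, which is one reason a rare class like swimming_pool can score 0.0%.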
STDC Pretrained Weights
Download the STDC pretrained weights (STDCNet813M_73.91.tar) and pass the file path via the `stdc_pretrain` argument when constructing the model.
Citation
```bibtex
@inproceedings{guo2022isdnet,
  title={ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-High Resolution Segmentation},
  author={Guo, Shaohua and Liu, Liang and Gan, Zhenye and Wang, Yabiao and Zhang, Wuhao and Wang, Chengjie and Jiang, Guannan and Zhang, Wei and Yi, Ran and Ma, Lizhuang and others},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4361--4370},
  year={2022}
}
```
License
Apache-2.0