---
license: apache-2.0
datasets:
- IGNF/FLAIR-HUB
language:
- en
metrics:
- mean_iou
pipeline_tag: image-segmentation
---
# ISDNet - Standalone PyTorch Implementation

**ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-high Resolution Segmentation**

CVPR 2022 | [Paper](https://openaccess.thecvf.com/content/CVPR2022/html/Guo_ISDNet_Integrating_Shallow_and_Deep_Networks_for_Efficient_Ultra-High_Resolution_CVPR_2022_paper.html)

This is a **standalone PyTorch implementation** of ISDNet that does not depend on MMSegmentation. It is adapted from the original [ISDNet](https://github.com/cedricgsh/ISDNet) repository.

## Features

- PyTorch implementation with no MMSeg requirement for training or inference (only `ConvModule` from mmcv is used)
- Multi-GPU training with DistributedDataParallel
- FLAIR French land cover dataset support (15 classes)
- Modern Python packaging with `uv`
- Modular code structure

## Model Architecture

- **Backbone**: ResNet-18 (via timm)
- **Shallow Path**: STDC-like module
- **Deep Path**: ASPP with dilated convolutions
- **Fusion**: Feature pyramid with lateral connections
- **Parameters**: 17.76M
- **FLOPs**: 21.79G @ 512x512

## Installation

```bash
# With uv (recommended)
uv sync

# Or with pip
pip install -e .
```

### Dependencies

- Python >= 3.10
- PyTorch >= 2.0
- timm >= 0.9
- mmcv >= 2.0 (for ConvModule only)

## Project Structure

```
isdnet/
├── __init__.py
├── config.py           # Training configuration
├── models/
│   ├── __init__.py
│   ├── isdnet.py       # Main ISDNet model
│   ├── modules.py      # STDC blocks, Laplacian pyramid
│   └── heads.py        # ASPP, ISDHead, RefineASPPHead
├── datasets/
│   ├── __init__.py
│   └── flair.py        # FLAIR dataset class
└── utils/
    ├── __init__.py
    └── distributed.py  # DDP utilities
train.py                # Training script
inference.py            # Evaluation script
```

## Training

Multi-GPU training on the FLAIR dataset (4 GPUs in this example):

```bash
uv run torchrun --nproc_per_node=4 train.py
```

Single GPU:

```bash
uv run python train.py
```
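
The two launch modes above differ mainly in the environment that `torchrun` provides: it sets `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` for each worker process, while a plain `python` launch sets none of them. A minimal sketch of how a script could detect this is shown below. The helper names `read_dist_env` and `is_distributed` are hypothetical and are not taken from `isdnet/utils/distributed.py`, whose actual logic may differ.

```python
import os


def read_dist_env(env=None):
    """Return (rank, local_rank, world_size) from torchrun-style variables.

    Hypothetical helper: falls back to a single-process setup (0, 0, 1)
    when the variables are absent, as in a plain `python train.py` launch.
    """
    env = os.environ if env is None else env
    rank = int(env.get("RANK", 0))
    local_rank = int(env.get("LOCAL_RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    return rank, local_rank, world_size


def is_distributed(env=None):
    """True when more than one process takes part in training."""
    return read_dist_env(env)[2] > 1


if __name__ == "__main__":
    rank, local_rank, world_size = read_dist_env()
    print(f"rank={rank} local_rank={local_rank} world_size={world_size}")
```

Under `torchrun --nproc_per_node=4`, each of the four workers would see `world_size == 4` and a distinct `local_rank`, which is typically used to select the GPU for that process.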

## Inference

Evaluate a checkpoint on the test set, or on the validation split with `--split valid`:

```bash
uv run python inference.py --checkpoint isdnet_flair_best.pth
uv run python inference.py --checkpoint isdnet_flair_best.pth --split valid
```

Training configuration (set in `isdnet/config.py`):
- Batch size: 16 per GPU (64 total on 4 GPUs)
- Learning rate: 1e-3
- Optimizer: SGD with momentum
- Scheduler: PolynomialLR
- Epochs: 80
- Crop size: 512x512
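
The batch-size arithmetic and the polynomial decay listed above can be sketched in plain Python. This is an illustration, not the code in `isdnet/config.py`: the helper names are hypothetical, the decay `power` of 0.9 is an assumed value (the configuration above does not state it, and PyTorch's `PolynomialLR` defaults to 1.0), and stepping the schedule once per epoch is also an assumption.

```python
def effective_batch_size(per_gpu, num_gpus):
    """Total samples per optimizer step under data-parallel training."""
    return per_gpu * num_gpus


def poly_lr(base_lr, step, total_steps, power=0.9):
    """Polynomial decay: base_lr * (1 - step / total_steps) ** power.

    `power=0.9` is an assumption for illustration only.
    """
    step = min(step, total_steps)  # clamp so the rate never goes negative
    return base_lr * (1.0 - step / total_steps) ** power


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(16, 4))
    for epoch in (0, 20, 40, 60, 80):
        print(f"epoch {epoch:2d}: lr = {poly_lr(1e-3, epoch, 80):.6f}")
```

With these assumptions the learning rate starts at 1e-3 and decays to 0 at epoch 80.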

## Usage

```python
from isdnet import ISDNet, FLAIRDataset

# Create model
model = ISDNet(
    num_classes=15,
    backbone='resnet18',
    stdc_pretrain='STDCNet813M_73.91.tar'
).cuda()

# Training forward (`images` is a batch tensor of shape (B, C, H, W) on the same device as the model)
outputs = model(images, return_loss=True)
# Returns: out, out_deep, out_aux16, out_aux8, aux_out, losses_re, losses_fa

# Inference forward
predictions = model(images, return_loss=False)
# Returns: (B, num_classes, H, W) logits
```

## Results on FLAIR Dataset

| Metric | Value |
|--------|-------|
| Val mIoU | 59.82% |
| Test mIoU | 52.77% |
| Pixel Accuracy | 72.02% |
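
For reference, mIoU and pixel accuracy are commonly derived from a confusion matrix. The sketch below uses plain Python lists and hypothetical helper names. It is not the evaluation code in `inference.py`, which may treat ignored labels or absent classes differently.

```python
def confusion_matrix(preds, labels, num_classes):
    """cm[t][p] counts pixels with true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, labels):
        cm[t][p] += 1
    return cm


def iou_per_class(cm):
    """IoU_c = TP / (TP + FP + FN); NaN if the class never appears."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(len(cm)))
    return correct / total


if __name__ == "__main__":
    # Toy example with flattened predictions and labels for 3 classes
    preds = [0, 0, 1, 1, 2, 2]
    labels = [0, 1, 1, 1, 2, 0]
    cm = confusion_matrix(preds, labels, num_classes=3)
    ious = iou_per_class(cm)
    print("per-class IoU:", ious)
    print("mIoU:", sum(ious) / len(ious))
    print("pixel accuracy:", pixel_accuracy(cm))
```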

### Per-class IoU (Test)

| Class | IoU |
|-------|-----|
| water | 81.6% |
| vineyard | 74.7% |
| building | 72.0% |
| impervious | 66.5% |
| deciduous | 66.4% |
| greenhouse | 61.3% |
| bare soil | 56.2% |
| coniferous | 55.1% |
| agricultural | 53.3% |
| snow | 51.9% |
| pervious | 49.0% |
| herbaceous | 46.0% |
| plowed land | 33.4% |
| brushwood | 24.0% |
| swimming_pool | 0.0% |
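
As a quick consistency check, the unweighted mean of the per-class values above can be compared with the reported test mIoU. The snippet only re-uses the numbers from the table. The variable names are illustrative.

```python
# Per-class test IoU (%) copied from the table above
per_class_iou = {
    "water": 81.6, "vineyard": 74.7, "building": 72.0,
    "deciduous": 66.4, "impervious": 66.5, "greenhouse": 61.3,
    "bare soil": 56.2, "coniferous": 55.1, "agricultural": 53.3,
    "snow": 51.9, "pervious": 49.0, "herbaceous": 46.0,
    "plowed land": 33.4, "brushwood": 24.0, "swimming_pool": 0.0,
}

# Unweighted mean over all 15 classes
mean_iou = sum(per_class_iou.values()) / len(per_class_iou)

# Same mean with the 0.0% class left out, to show its impact
non_zero = [v for v in per_class_iou.values() if v > 0.0]
mean_iou_non_zero = sum(non_zero) / len(non_zero)

if __name__ == "__main__":
    print(f"mean over 15 classes: {mean_iou:.2f}%")
    print(f"mean over 14 non-zero classes: {mean_iou_non_zero:.2f}%")
```

The 15-class mean comes out at about 52.76%, which matches the reported 52.77% test mIoU up to rounding of the per-class values. The 0.0% `swimming_pool` class alone lowers the mean by almost four points.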

## STDC Pretrained Weights

Download STDC pretrained weights: [STDCNet813M_73.91.tar](https://drive.google.com/file/d/1FfG-qRlGy-2BsVjN2ZcKTEG9wZeo3sdW/view)

## Citation

```bibtex
@inproceedings{guo2022isdnet,
  title={ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-High Resolution Segmentation},
  author={Guo, Shaohua and Liu, Liang and Gan, Zhenye and Wang, Yabiao and Zhang, Wuhao and Wang, Chengjie and Jiang, Guannan and Zhang, Wei and Yi, Ran and Ma, Lizhuang and others},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4361--4370},
  year={2022}
}
```

## License

Apache-2.0