---
license: apache-2.0
datasets:
- IGNF/FLAIR-HUB
language:
- en
metrics:
- mean_iou
pipeline_tag: image-segmentation
---
# ISDNet - Standalone PyTorch Implementation
**ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-High Resolution Segmentation**
CVPR 2022 | [Paper](https://openaccess.thecvf.com/content/CVPR2022/html/Guo_ISDNet_Integrating_Shallow_and_Deep_Networks_for_Efficient_Ultra-High_Resolution_CVPR_2022_paper.html)
This is a **standalone PyTorch implementation** of ISDNet, without MMSegmentation dependencies, adapted from [ISDNet](https://github.com/cedricgsh/ISDNet).
## Features
- Pure PyTorch implementation (no MMSegmentation required for training or inference)
- Multi-GPU training with DistributedDataParallel
- FLAIR French land cover dataset support (15 classes)
- Modern Python packaging with `uv`
- Modular code structure
## Model Architecture
- **Backbone**: ResNet-18 (via timm)
- **Shallow Path**: STDC-like module
- **Deep Path**: ASPP with dilated convolutions
- **Fusion**: Feature pyramid with lateral connections
- **Parameters**: 17.76M
- **FLOPs**: 21.79G @ 512x512
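The parameter count above can be verified with a generic PyTorch helper (a sketch; `count_params` is an illustrative name, and ISDNet itself is not constructed here):

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in model.parameters())

# Small example: 10*5 weights + 5 biases. Applied to the full
# ISDNet module, this should report roughly 17.76M.
layer = nn.Linear(10, 5)
print(count_params(layer))  # 55
```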
## Installation
```bash
# With uv (recommended)
uv sync
# Or with pip
pip install -e .
```
### Dependencies
- Python >= 3.10
- PyTorch >= 2.0
- timm >= 0.9
- mmcv >= 2.0 (for ConvModule only)
## Project Structure
```
isdnet/
├── __init__.py
├── config.py          # Training configuration
├── models/
│   ├── __init__.py
│   ├── isdnet.py      # Main ISDNet model
│   ├── modules.py     # STDC blocks, Laplacian pyramid
│   └── heads.py       # ASPP, ISDHead, RefineASPPHead
├── datasets/
│   ├── __init__.py
│   └── flair.py       # FLAIR dataset class
└── utils/
    ├── __init__.py
    └── distributed.py # DDP utilities
train.py               # Training script
inference.py           # Evaluation script
```
## Training
Multi-GPU training on the FLAIR dataset (here with 4 GPUs):
```bash
uv run torchrun --nproc_per_node=4 train.py
```
Single GPU:
```bash
uv run python train.py
```
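Under `torchrun`, each worker process receives its identity through environment variables; with plain `python train.py` those variables are absent and the script falls back to a single process. A minimal sketch of picking them up (the helper name is illustrative, not necessarily what `train.py` does):

```python
import os

def dist_env():
    # torchrun exports RANK, WORLD_SIZE, and LOCAL_RANK for every worker;
    # default to a single-process setup when they are not set.
    rank = int(os.environ.get("RANK", 0))
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    return rank, world_size, local_rank

rank, world, local = dist_env()
print(f"worker {rank}/{world} on local GPU {local}")
```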
## Inference
Evaluate on test set:
```bash
uv run python inference.py --checkpoint isdnet_flair_best.pth
uv run python inference.py --checkpoint isdnet_flair_best.pth --split valid
```
## Configuration
Key training settings (in `isdnet/config.py`):
- Batch size: 16 per GPU (64 total)
- Learning rate: 1e-3
- Optimizer: SGD with momentum
- Scheduler: PolynomialLR
- Epochs: 80
- Crop size: 512x512
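The settings above could be mirrored as a dataclass (a sketch of one plausible shape for `isdnet/config.py`; the field names, and the momentum value, are assumptions):

```python
from dataclasses import dataclass

@dataclass
class TrainConfig:
    # Values from the configuration listed above
    batch_size_per_gpu: int = 16   # 64 total across 4 GPUs
    lr: float = 1e-3
    momentum: float = 0.9          # assumed default for SGD with momentum
    scheduler: str = "poly"        # PolynomialLR
    epochs: int = 80
    crop_size: int = 512

cfg = TrainConfig()
print(cfg.lr, cfg.epochs)
```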
## Usage
```python
from isdnet import ISDNet, FLAIRDataset

# Create model
model = ISDNet(
    num_classes=15,
    backbone='resnet18',
    stdc_pretrain='STDCNet813M_73.91.tar',
).cuda()

# Training forward
outputs = model(images, return_loss=True)
# Returns: out, out_deep, out_aux16, out_aux8, aux_out, losses_re, losses_fa

# Inference forward
predictions = model(images, return_loss=False)
# Returns: (B, num_classes, H, W) logits
```
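For ultra-high-resolution inputs that do not fit in memory at once, whole-image inference can be replaced by sliding-window tiling at the training crop size. A minimal sketch (this helper is not part of the package; it assumes any callable that maps a `(B, 3, h, w)` tensor to `(B, C, h, w)` logits):

```python
import math
import torch

@torch.no_grad()
def slide_inference(model, image, num_classes=15, crop=512, stride=341):
    """Average overlapping crop logits over a large image.

    image: (B, 3, H, W) tensor -> (B, num_classes, H, W) logits.
    """
    B, _, H, W = image.shape
    logits = image.new_zeros((B, num_classes, H, W))
    count = image.new_zeros((B, 1, H, W))
    # Number of windows per axis; the last window is clamped to the border
    h_grids = max(math.ceil((H - crop) / stride), 0) + 1
    w_grids = max(math.ceil((W - crop) / stride), 0) + 1
    for i in range(h_grids):
        for j in range(w_grids):
            y2 = min(i * stride + crop, H)
            x2 = min(j * stride + crop, W)
            y1 = max(y2 - crop, 0)
            x1 = max(x2 - crop, 0)
            logits[:, :, y1:y2, x1:x2] += model(image[:, :, y1:y2, x1:x2])
            count[:, :, y1:y2, x1:x2] += 1
    return logits / count.clamp(min=1)
```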
## Results on FLAIR Dataset
| Metric | Value |
|--------|-------|
| Val mIoU | 59.82% |
| Test mIoU | 52.77% |
| Pixel Accuracy | 72.02% |
### Per-class IoU (Test)
| Class | IoU |
|-------|-----|
| water | 81.6% |
| vineyard | 74.7% |
| building | 72.0% |
| impervious | 66.5% |
| deciduous | 66.4% |
| greenhouse | 61.3% |
| bare soil | 56.2% |
| coniferous | 55.1% |
| agricultural | 53.3% |
| snow | 51.9% |
| pervious | 49.0% |
| herbaceous | 46.0% |
| plowed land | 33.4% |
| brushwood | 24.0% |
| swimming_pool | 0.0% |
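The reported test mIoU is the unweighted mean of the per-class values above (up to per-class rounding), which can be checked directly:

```python
# Per-class test IoU values from the table above (percent)
ious = [81.6, 74.7, 72.0, 66.5, 66.4, 61.3, 56.2, 55.1,
        53.3, 51.9, 49.0, 46.0, 33.4, 24.0, 0.0]

miou = sum(ious) / len(ious)
print(f"{miou:.2f}")  # 52.76, consistent with the reported 52.77%
```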
## STDC Pretrained Weights
Download STDC pretrained weights: [STDCNet813M_73.91.tar](https://drive.google.com/file/d/1FfG-qRlGy-2BsVjN2ZcKTEG9wZeo3sdW/view)
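Despite the `.tar` extension, checkpoints like this are typically read directly with `torch.load` rather than unpacked. A save/load round-trip sketch with a dummy state dict (the `state_dict` key layout is an assumption; the real checkpoint's structure may differ):

```python
import os
import tempfile
import torch

# Build and save a dummy checkpoint the way a .tar file is usually handled
state = {"state_dict": {"conv1.weight": torch.zeros(3, 3, 3, 3)}}
path = os.path.join(tempfile.mkdtemp(), "STDCNet_dummy.tar")
torch.save(state, path)

# Reload on CPU; pass the inner dict to model.load_state_dict(...) as needed
ckpt = torch.load(path, map_location="cpu")
print(sorted(ckpt["state_dict"].keys()))
```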
## Citation
```bibtex
@inproceedings{guo2022isdnet,
title={ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-High Resolution Segmentation},
author={Guo, Shaohua and Liu, Liang and Gan, Zhenye and Wang, Yabiao and Zhang, Wuhao and Wang, Chengjie and Jiang, Guannan and Zhang, Wei and Yi, Ran and Ma, Lizhuang and others},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={4361--4370},
year={2022}
}
```
## License
Apache-2.0