---
license: apache-2.0
datasets:
- IGNF/FLAIR-HUB
language:
- en
metrics:
- mean_iou
pipeline_tag: image-segmentation
---

# ISDNet - Standalone PyTorch Implementation

**ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-high Resolution Segmentation**

CVPR 2022 | [Paper](https://openaccess.thecvf.com/content/CVPR2022/html/Guo_ISDNet_Integrating_Shallow_and_Deep_Networks_for_Efficient_Ultra-High_Resolution_CVPR_2022_paper.html)

This is a **standalone PyTorch implementation** of ISDNet, without MMSegmentation dependencies, adapted from [ISDNet](https://github.com/cedricgsh/ISDNet).

## Features

- Pure PyTorch implementation (no MMSeg required for training/inference)
- Multi-GPU training with DistributedDataParallel
- FLAIR French land cover dataset support (15 classes)
- Modern Python packaging with `uv`
- Modular code structure

## Model Architecture

- **Backbone**: ResNet-18 (via timm)
- **Shallow Path**: STDC-like module
- **Deep Path**: ASPP with dilated convolutions
- **Fusion**: Feature pyramid with lateral connections
- **Parameters**: 17.76M
- **FLOPs**: 21.79G @ 512x512

## Installation

```bash
# With uv (recommended)
uv sync

# Or with pip
pip install -e .
```

### Dependencies

- Python >= 3.10
- PyTorch >= 2.0
- timm >= 0.9
- mmcv >= 2.0 (for ConvModule only)

## Project Structure

```
isdnet/
├── __init__.py
├── config.py           # Training configuration
├── models/
│   ├── __init__.py
│   ├── isdnet.py       # Main ISDNet model
│   ├── modules.py      # STDC blocks, Laplacian pyramid
│   └── heads.py        # ASPP, ISDHead, RefineASPPHead
├── datasets/
│   ├── __init__.py
│   └── flair.py        # FLAIR dataset class
└── utils/
    ├── __init__.py
    └── distributed.py  # DDP utilities
train.py                # Training script
inference.py            # Evaluation script
```

## Training

Multi-GPU training on FLAIR dataset:

```bash
uv run torchrun --nproc_per_node=4 train.py
```

Single GPU:

```bash
uv run python train.py
```

## Inference

Evaluate on test set:

```bash
uv run python inference.py --checkpoint isdnet_flair_best.pth
uv run python inference.py --checkpoint isdnet_flair_best.pth --split valid
```

Configuration (in `isdnet/config.py`):

- Batch size: 16 per GPU (64 total)
- Learning rate: 1e-3
- Optimizer: SGD with momentum
- Scheduler: PolynomialLR
- Epochs: 80
- Crop size: 512x512

## Usage

```python
from isdnet import ISDNet, FLAIRDataset

# Create model
model = ISDNet(
    num_classes=15,
    backbone='resnet18',
    stdc_pretrain='STDCNet813M_73.91.tar'
).cuda()

# Training forward
outputs = model(images, return_loss=True)
# Returns: out, out_deep, out_aux16, out_aux8, aux_out, losses_re, losses_fa

# Inference forward
predictions = model(images, return_loss=False)
# Returns: (B, num_classes, H, W) logits
```

## Results on FLAIR Dataset

| Metric | Value |
|--------|-------|
| Val mIoU | 59.82% |
| Test mIoU | 52.77% |
| Pixel Accuracy | 72.02% |

### Per-class IoU (Test)

| Class | IoU |
|-------|-----|
| water | 81.6% |
| vineyard | 74.7% |
| building | 72.0% |
| deciduous | 66.4% |
| impervious | 66.5% |
| greenhouse | 61.3% |
| bare soil | 56.2% |
| coniferous | 55.1% |
| agricultural | 53.3% |
| snow | 51.9% |
| pervious | 49.0% |
| herbaceous | 46.0% |
| plowed land | 33.4% |
| brushwood | 24.0% |
| swimming_pool | 0.0% |

## STDC Pretrained Weights

Download STDC pretrained weights: [STDCNet813M_73.91.tar](https://drive.google.com/file/d/1FfG-qRlGy-2BsVjN2ZcKTEG9wZeo3sdW/view)

## Citation

```bibtex
@inproceedings{guo2022isdnet,
  title={ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-High Resolution Segmentation},
  author={Guo, Shaohua and Liu, Liang and Gan, Zhenye and Wang, Yabiao and Zhang, Wuhao and Wang, Chengjie and Jiang, Guannan and Zhang, Wei and Yi, Ran and Ma, Lizhuang and others},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4361--4370},
  year={2022}
}
```

## License

Apache-2.0
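
## Appendix: Checking the Reported mIoU

The test mIoU in "Results on FLAIR Dataset" can be cross-checked against the per-class IoU table. The sketch below is illustrative only and is not taken from `inference.py`: `iou_from_confusion` and `mean_iou` are hypothetical helper names, and the confusion-matrix layout (rows are ground truth, columns are predictions) is an assumption. It averages the 15 per-class values from the table, which gives about 52.76%, in line with the reported 52.77% (the small gap is presumably rounding of the per-class values to one decimal).

```python
# Per-class test IoU values (percent), copied from the per-class table in this README.
per_class_iou = {
    "water": 81.6, "vineyard": 74.7, "building": 72.0, "deciduous": 66.4,
    "impervious": 66.5, "greenhouse": 61.3, "bare soil": 56.2,
    "coniferous": 55.1, "agricultural": 53.3, "snow": 51.9, "pervious": 49.0,
    "herbaceous": 46.0, "plowed land": 33.4, "brushwood": 24.0,
    "swimming_pool": 0.0,
}


def mean_iou(ious):
    """Unweighted mean over classes: every class counts equally."""
    values = list(ious.values())
    return sum(values) / len(values)


def iou_from_confusion(conf):
    """Per-class IoU = TP / (TP + FP + FN) from a square confusion matrix.

    Assumed layout (hypothetical, not confirmed by the repository):
    conf[i][j] = number of pixels with ground truth i predicted as j.
    Classes with an empty union get 0.0 here; other conventions skip them.
    """
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                       # ground truth c, predicted as other
        fp = sum(conf[r][c] for r in range(n)) - tp  # predicted c, ground truth other
        union = tp + fp + fn
        ious.append(tp / union if union else 0.0)
    return ious


if __name__ == "__main__":
    print(f"mean of per-class IoU: {mean_iou(per_class_iou):.2f}%")
    # Tiny two-class example for the confusion-matrix form.
    print(iou_from_confusion([[3, 1], [1, 5]]))
```

Because the mean is unweighted, the single class at 0.0% (swimming_pool) lowers the average noticeably: over the other 14 classes the same calculation gives roughly 56.5%.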