language:
  - en
license: apache-2.0
tags:
  - earth-observation
  - segmentation
  - unet
  - pytorch
  - remote-sensing
  - spacenet
datasets:
  - harshinde/spacenet-rio
metrics:
  - iou
  - accuracy
  - dice
pipeline_tag: image-segmentation

SpaceNet Rio Building Detection Model

This model detects building footprints from high-resolution satellite imagery. It is a PyTorch-based U-Net model trained on the SpaceNet (Rio de Janeiro) dataset for semantic segmentation (binary: background vs. building).

Model Details

Architecture: U-Net with residual connections, 4 encoder/decoder levels, 10% spatial dropout, and a 1024-channel bottleneck.
Task: Semantic Segmentation (Building Footprint Extraction)
Input: 3-band (RGB) pan-sharpened GeoTIFFs (the input channel count is configurable, so 8-band multispectral imagery is also supported).
Output: Binary mask (0: background, 1: building).
Parameters: ~31M (Kaiming/He initialization)
Framework: PyTorch
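
The architecture described above can be sketched roughly as follows. This is a minimal illustration, not the released implementation: the class names `ResidualBlock` and `UNet`, the channel widths (64 to 512), batch normalization, max pooling, transposed-convolution upsampling, and the 1x1 skip projection are all assumptions. Only the four levels, the residual connections, the 10% spatial dropout, the 1024-channel bottleneck, and the Kaiming initialization come from the list above.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection and spatial dropout."""

    def __init__(self, in_ch, out_ch, dropout=0.1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),  # assumption: normalization is not documented
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # 1x1 projection so the skip path matches the output channel count
        self.skip = nn.Conv2d(in_ch, out_ch, 1, bias=False) if in_ch != out_ch else nn.Identity()
        self.act = nn.ReLU(inplace=True)
        self.drop = nn.Dropout2d(dropout)  # spatial dropout: zeroes whole feature maps

    def forward(self, x):
        return self.drop(self.act(self.body(x) + self.skip(x)))


class UNet(nn.Module):
    """Hypothetical 4-level residual U-Net; input height/width must be divisible by 16."""

    def __init__(self, in_channels=3, num_classes=2, widths=(64, 128, 256, 512), dropout=0.1):
        super().__init__()
        self.encoders = nn.ModuleList()
        ch = in_channels
        for w in widths:
            self.encoders.append(ResidualBlock(ch, w, dropout))
            ch = w
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = ResidualBlock(ch, ch * 2, dropout)  # 512 -> 1024 by default
        ch = ch * 2
        self.ups = nn.ModuleList()
        self.decoders = nn.ModuleList()
        for w in reversed(widths):
            self.ups.append(nn.ConvTranspose2d(ch, w, 2, stride=2))
            self.decoders.append(ResidualBlock(w * 2, w, dropout))
            ch = w
        self.head = nn.Conv2d(ch, num_classes, 1)
        # Kaiming (He) initialization for all convolution weights
        for m in self.modules():
            if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
                nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
                if m.bias is not None:
                    nn.init.zeros_(m.bias)

    def forward(self, x):
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, dec, skip in zip(self.ups, self.decoders, reversed(skips)):
            x = dec(torch.cat([up(x), skip], dim=1))
        return self.head(x)  # raw per-class logits, shape (N, num_classes, H, W)


model = UNet(in_channels=3, num_classes=2)
logits = model(torch.zeros(1, 3, 64, 64))
print(logits.shape)  # torch.Size([1, 2, 64, 64])
```

Passing `in_channels=8` to this sketch would accept 8-band input, which is one plausible way to support both RGB and multispectral imagery.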

Uses

Direct Use

This model extracts building footprint masks from satellite imagery. It is designed primarily for high-resolution RGB satellite tiles (e.g., ~50 cm/pixel).

Out-of-Scope Use

General object detection (e.g., cars, roads).
Imagery at substantially different spatial resolutions (e.g., 30 m Landsat data) without fine-tuning.

Training Details

Dataset

Trained on the SpaceNet Rio de Janeiro dataset.

Total Tiles: 6,940
Split: 7:1:2 (Train: 4,857 | Val: 693 | Test: 1,387)
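
A 7:1:2 split like the one above can be reproduced in spirit with a seeded shuffle. The function name `split_tiles`, the seed, and the rounding rule are assumptions; the actual splitting procedure is not documented here. With this rounding rule, 6,940 tiles yield 4,858 / 694 / 1,388, which differs slightly from the reported counts (these sum to 6,937), so the original split may have used a different rule or excluded a few tiles.

```python
import random


def split_tiles(tile_ids, ratios=(7, 1, 2), seed=42):
    """Shuffle tile ids and split them into train/val/test by integer ratios."""
    ids = list(tile_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    total = sum(ratios)
    n_train = len(ids) * ratios[0] // total
    n_val = len(ids) * ratios[1] // total
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # the remainder goes to the test split
    return train, val, test


train, val, test = split_tiles(range(6940))
print(len(train), len(val), len(test))  # 4858 694 1388
```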

Hyperparameters

Epochs: 100 (with early stopping patience of 15)
Batch Size: 16 (Train) / 4 (Val)
Learning Rate: 0.001 with 5 warmup epochs
Weight Decay: 0.0001
Loss Function: Combined Dice Loss (weight 1.0) + Cross-Entropy Loss (weight 1.0)
Image Crops: 400x400 (Train) / 480x480 (Val)
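
The loss and learning-rate settings above could be implemented along these lines. This is a sketch under assumptions: the names `DiceCELoss` and `lr_at_epoch` are hypothetical, the Dice term is a soft Dice averaged over both classes, and the warmup is assumed to be linear. The optimizer and any decay after warmup are not documented, so they are left out.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DiceCELoss(nn.Module):
    """Weighted sum of soft Dice loss and cross-entropy for multi-class logits."""

    def __init__(self, dice_weight=1.0, ce_weight=1.0, eps=1e-6):
        super().__init__()
        self.dice_weight = dice_weight
        self.ce_weight = ce_weight
        self.eps = eps

    def forward(self, logits, target):
        # logits: (N, C, H, W) raw scores; target: (N, H, W) integer class ids
        ce = F.cross_entropy(logits, target)
        probs = logits.softmax(dim=1)
        one_hot = F.one_hot(target, num_classes=logits.shape[1]).permute(0, 3, 1, 2).float()
        dims = (0, 2, 3)  # sum over batch and spatial dims, keep the class dim
        intersection = (probs * one_hot).sum(dims)
        denom = probs.sum(dims) + one_hot.sum(dims)
        dice = (2 * intersection + self.eps) / (denom + self.eps)
        return self.dice_weight * (1 - dice.mean()) + self.ce_weight * ce


def lr_at_epoch(epoch, base_lr=1e-3, warmup_epochs=5):
    """Linear warmup (assumed shape): ramp up to base_lr over the first warmup_epochs."""
    return base_lr * min(1.0, (epoch + 1) / warmup_epochs)


criterion = DiceCELoss(dice_weight=1.0, ce_weight=1.0)
logits = torch.randn(2, 2, 8, 8)
target = torch.randint(0, 2, (2, 8, 8))
print(criterion(logits, target).item())
print([lr_at_epoch(e) for e in range(6)])
```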

Training Metrics

Training metrics were tracked using TensorBoard and include:

Training/Validation Loss
Mean IoU and Per-Class IoU
Pixel Accuracy
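
For reference, these metrics can all be derived from a confusion matrix over predicted and true label maps. The function below is an illustrative sketch (the name `segmentation_metrics` is hypothetical), not the evaluation code used for this model; it also reports Dice, which the model card metadata lists as a metric.

```python
import torch


def segmentation_metrics(pred, target, num_classes=2):
    """Pixel accuracy, per-class IoU, mean IoU and per-class Dice from integer label maps."""
    pred = pred.flatten()
    target = target.flatten()
    # Confusion matrix: rows are true classes, columns are predicted classes
    cm = torch.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes).double()
    tp = cm.diag()
    fp = cm.sum(dim=0) - tp
    fn = cm.sum(dim=1) - tp
    # clamp avoids division by zero; a class absent from both maps scores 0
    iou = tp / (tp + fp + fn).clamp(min=1)
    dice = 2 * tp / (2 * tp + fp + fn).clamp(min=1)
    return {
        "pixel_accuracy": (tp.sum() / cm.sum()).item(),
        "per_class_iou": iou.tolist(),
        "mean_iou": iou.mean().item(),
        "per_class_dice": dice.tolist(),
    }


pred = torch.tensor([0, 0, 1, 1])
target = torch.tensor([0, 1, 1, 1])
print(segmentation_metrics(pred, target))
```

In this toy example, 3 of 4 pixels are correct, so pixel accuracy is 0.75, while the per-class IoU is 1/2 for background and 2/3 for building.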

The full training logs and curves are available on TensorBoard.

How to Get Started with the Model

You can load the weights using PyTorch:

import torch

# Assumes the U-Net class is defined in your local code; the module name
# below is a placeholder, so adjust the import to match your project.
from model import UNet

model = UNet(in_channels=3, num_classes=2)

checkpoint = torch.load("best_model.pt", map_location="cpu")

# The checkpoint may wrap the weights in a dictionary or be a raw state_dict
if isinstance(checkpoint, dict) and "model_state_dict" in checkpoint:
    state_dict = checkpoint["model_state_dict"]
else:
    state_dict = checkpoint
model.load_state_dict(state_dict)

model.eval()
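
Once the weights are loaded, turning the two-class output into a binary building mask might look like the sketch below. The helper `predict_mask` is hypothetical, and the input scaling or normalization used during training is not documented here, so the image is assumed to be preprocessed already. A 1x1 convolution stands in for the U-Net so the snippet runs on its own.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def predict_mask(model, image):
    """Return a (H, W) uint8 mask (0: background, 1: building) for one image.

    image: float tensor of shape (C, H, W), assumed to be preprocessed
    the same way as the training data.
    """
    model.eval()
    logits = model(image.unsqueeze(0))  # (1, num_classes, H, W)
    return logits.argmax(dim=1).squeeze(0).to(torch.uint8)


# Stand-in model so the example is self-contained; replace with the loaded U-Net
dummy = nn.Conv2d(3, 2, kernel_size=1)
mask = predict_mask(dummy, torch.rand(3, 64, 64))
print(mask.shape, mask.dtype)  # torch.Size([64, 64]) torch.uint8
```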