harshinde/spacenet-models


	---
	language:
	- en
	license: apache-2.0
	tags:
	- earth-observation
	- segmentation
	- unet
	- pytorch
	- remote-sensing
	- spacenet
	datasets:
	- harshinde/spacenet-rio
	metrics:
	- iou
	- accuracy
	- dice
	pipeline_tag: image-segmentation
	---

	# SpaceNet Rio Building Detection Model

	This model detects building footprints from high-resolution satellite imagery. It is a PyTorch-based U-Net model trained on the SpaceNet (Rio de Janeiro) dataset for semantic segmentation (binary: background vs. building).

	## Model Details

	- Architecture: U-Net with residual connections, 4 encoder/decoder levels, 10% spatial dropout, and a 1024-channel bottleneck.
	- Task: Semantic Segmentation (Building Footprint Extraction)
	- Input: 3-band (RGB) pan-sharpened GeoTIFFs. The number of input channels is configurable, so the same architecture also supports 8-band multispectral imagery.
	- Output: Binary mask (0: background, 1: building).
	- Parameters: ~31M (Kaiming/He initialization)
	- Framework: PyTorch
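
The ~31M figure is consistent with a conventional U-Net layout. The sketch below counts weights and biases for a plain U-Net with four levels and a 1024-channel bottleneck. The channel widths (64, 128, 256, 512), the 3x3 convolutions with bias, the 2x2 transposed-convolution upsampling, and the absence of normalization layers are all assumptions. The residual shortcuts in the actual model may add projection weights that are not counted here.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus biases of a 2D (or transposed 2D) convolution with a square kernel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def double_conv_params(in_ch, out_ch):
    """Two stacked 3x3 convolutions, as in a classic U-Net stage."""
    return conv_params(in_ch, out_ch, 3) + conv_params(out_ch, out_ch, 3)


def unet_param_count(in_channels=3, num_classes=2,
                     widths=(64, 128, 256, 512), bottleneck=1024):
    """Parameter count of a plain U-Net (assumed layout, no residual shortcuts)."""
    total = 0
    # Encoder: one double-conv stage per level (pooling has no parameters).
    prev = in_channels
    for width in widths:
        total += double_conv_params(prev, width)
        prev = width
    # Bottleneck stage.
    total += double_conv_params(prev, bottleneck)
    prev = bottleneck
    # Decoder: 2x2 transposed conv, then a double conv on the concatenated skip.
    for width in reversed(widths):
        total += conv_params(prev, width, 2)           # upsampling halves the channels
        total += double_conv_params(2 * width, width)  # skip concat doubles the input
        prev = width
    # Final 1x1 convolution maps features to class logits.
    total += conv_params(prev, num_classes, 1)
    return total


print(f"{unet_param_count() / 1e6:.2f}M parameters")
```

Under these assumptions the count comes to roughly 31.0M for 3-band input, and switching to `in_channels=8` only changes the first convolution.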

	## Uses

	### Direct Use
	This model extracts building footprint masks from satellite imagery. It is designed primarily for high-resolution RGB tiles (e.g., ~50 cm/pixel).

	### Out-of-Scope Use
	- General object detection (e.g., cars, roads).
	- Imagery at substantially different spatial resolutions (e.g., 30 m Landsat data) without fine-tuning.

	## Training Details

	### Dataset
	Trained on the [SpaceNet Rio de Janeiro](https://huggingface.co/datasets/harshinde/spacenet-rio) dataset.
	- Total Tiles: 6,940
	- Split: 7:1:2 (Train: 4,857 \| Val: 693 \| Test: 1,387; these counts sum to 6,937)
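
As a quick arithmetic check, the per-split counts listed above can be converted to fractions of their sum to confirm the 7:1:2 ratio. The helper name `split_fractions` is illustrative only.

```python
def split_fractions(counts):
    """Return each split's share of the total number of tiles."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


counts = {"train": 4857, "val": 693, "test": 1387}
for name, frac in split_fractions(counts).items():
    print(f"{name}: {frac:.3f}")
```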

	### Hyperparameters
	- Epochs: 100 (with early stopping patience of 15)
	- Batch Size: 16 (Train) / 4 (Val)
	- Learning Rate: 0.001 with 5 warmup epochs
	- Weight Decay: 0.0001
	- Loss Function: Combined Dice Loss (weight 1.0) + Cross-Entropy Loss (weight 1.0)
	- Image Crops: 400x400 (Train) / 480x480 (Val)
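
To make the loss and learning-rate settings concrete, here is a framework-free sketch for the binary case. It assumes the Dice term is a soft Dice computed on predicted building probabilities with a small smoothing constant, that cross-entropy is averaged over pixels, and that warmup is linear and followed by a constant rate. None of these details are stated above, and the function names are illustrative.

```python
import math


def dice_loss(probs, targets, eps=1e-6):
    """1 - soft Dice coefficient for the building class."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def cross_entropy_loss(probs, targets, eps=1e-12):
    """Mean per-pixel cross-entropy for two classes (background, building)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def combined_loss(probs, targets, dice_weight=1.0, ce_weight=1.0):
    """Weighted sum of Dice and cross-entropy (both weights 1.0 by default)."""
    return (dice_weight * dice_loss(probs, targets)
            + ce_weight * cross_entropy_loss(probs, targets))


def warmup_lr(epoch, base_lr=0.001, warmup_epochs=5):
    """Assumed linear warmup: ramp up to base_lr over the first warmup_epochs."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    return base_lr


probs = [0.9, 0.8, 0.2, 0.1]   # predicted building probability per pixel
targets = [1, 1, 0, 0]         # ground-truth mask
print(combined_loss(probs, targets))
print([warmup_lr(e) for e in range(7)])
```

Combining the two terms is a common choice for footprint extraction: cross-entropy gives a stable per-pixel gradient, while Dice counteracts the imbalance between background and building pixels.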

	### Training Metrics
	Training metrics were tracked using TensorBoard and include:
	- Training/Validation Loss
	- Mean IoU and Per-Class IoU
	- Pixel Accuracy

	The full training logs and curves are available on [TensorBoard](https://huggingface.co/harshinde/spacenet/tensorboard).
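
For reference, the metrics listed above can all be derived from a confusion matrix. The following is a minimal sketch on flattened binary masks. How the training code actually aggregates the metrics (per batch or over the whole validation set) is not stated here, and the function name is illustrative.

```python
def segmentation_metrics(pred, target):
    """Compute pixel accuracy, per-class IoU, mean IoU and building Dice
    from two flat sequences of binary labels (0: background, 1: building)."""
    num_classes = 2
    # confusion[t][p] counts pixels of true class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(num_classes))

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        union = tp + fp + fn
        # A class absent from both prediction and target is scored 0.0 here.
        ious.append(tp / union if union else 0.0)

    # Dice for the building class: 2*TP / (2*TP + FP + FN).
    tp, fp, fn = confusion[1][1], confusion[0][1], confusion[1][0]
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0

    return {
        "pixel_accuracy": correct / total,
        "per_class_iou": ious,
        "mean_iou": sum(ious) / num_classes,
        "dice_building": dice,
    }


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(segmentation_metrics(pred, target))
```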

	## How to Get Started with the Model

	You can load the weights using PyTorch:

	```python
	import torch

	# The U-Net class is not bundled with the checkpoint and must come from your own code.
	# `unet` is a placeholder module name; adjust the import to match your project.
	from unet import UNet

	model = UNet(in_channels=3, num_classes=2)

	checkpoint = torch.load("best_model.pt", map_location="cpu")

	# The file may hold a training checkpoint dictionary or a raw state_dict.
	if isinstance(checkpoint, dict) and "model_state_dict" in checkpoint:
	    state_dict = checkpoint["model_state_dict"]
	else:
	    state_dict = checkpoint
	model.load_state_dict(state_dict)

	model.eval()
	```
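
After a forward pass, a two-class segmentation model typically yields one score per class for every pixel, and the binary mask is the per-pixel argmax. The sketch below shows that reduction on plain nested lists so the logic is visible without a tensor library. The layout (`logits[class][row][col]`) and the function name are assumptions; with PyTorch tensors the same step is usually a single `argmax` over the channel dimension.

```python
def logits_to_mask(logits):
    """Convert [2][H][W] class scores into an [H][W] mask of 0s and 1s.

    Ties resolve to background (class 0).
    """
    background, building = logits
    return [
        [1 if bld > bg else 0 for bg, bld in zip(bg_row, bld_row)]
        for bg_row, bld_row in zip(background, building)
    ]


logits = [
    [[2.0, 0.1], [0.3, 1.5]],   # background scores
    [[0.5, 1.2], [0.9, 0.2]],   # building scores
]
print(logits_to_mask(logits))  # [[0, 1], [1, 0]]
```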