	---
	license: mit
	tags:
	- semantic-segmentation
	- cartography
	- historical-maps
	- unet
	- cbam
	- efficientnet
	library_name: pytorch
	pipeline_tag: image-segmentation
	---

	# Historical Map Semantic Segmentation — Ensemble Checkpoints

	![Input RGB strip and our ensemble's predictions on a region of map2](prediction_strip.png)

	Three U-Net + CBAM (EfficientNet-B5 encoder) checkpoints used as a 3-way
	probability-averaging ensemble for 7-class semantic segmentation of historical
	cartographic scans. Best Kaggle score: 0.77044 (`score = 0.6 · mIoU + 0.4 · macro-F1`).
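
As a rough illustration of the score formula above, the sketch below computes per-class IoU and F1 from binary masks and combines them with the 0.6 / 0.4 weights. The function name `competition_score`, the `eps` smoothing term, and the choice to pool all pixels per class before averaging are assumptions made for this example; the competition's exact evaluation code may differ (for instance in how it treats classes that are absent from a tile).

```python
import numpy as np

def competition_score(pred, target, eps=1e-7):
    """Hypothetical helper: 0.6 * mIoU + 0.4 * macro-F1.

    pred, target: binary arrays of shape (C, H, W), one channel per class.
    Assumption: a class absent from both pred and target scores 0.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    axes = tuple(range(1, pred.ndim))  # reduce over every axis except the class axis

    tp = (pred & target).sum(axis=axes).astype(np.float64)
    fp = (pred & ~target).sum(axis=axes)
    fn = (~pred & target).sum(axis=axes)

    iou = tp / (tp + fp + fn + eps)          # per-class IoU
    f1 = 2 * tp / (2 * tp + fp + fn + eps)   # per-class F1 (Dice)
    return 0.6 * iou.mean() + 0.4 * f1.mean()

# One class, 2x2 tile: TP=1, FP=1, FN=0 gives IoU=0.5 and F1=2/3.
pred = np.array([[[1, 1], [0, 0]]], dtype=bool)
target = np.array([[[1, 0], [0, 0]]], dtype=bool)
print(round(competition_score(pred, target), 3))  # about 0.567
```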

	Code: https://github.com/VictorPachecoAznar/Comp1_RTCart

	## Files

	| Path | Role | Trained on | Validated on | Val score |
	|------|------|------------|--------------|-----------|
	| `map2_specialist/map2_specialist.pth` | map2-specialist | map2 only | map1 | 0.7233 |
	| `map1_specialist/map1_specialist.pth` | map1-specialist | map1 only | map2 | 0.7029 |
	| `tile_cv_generalist/tile_cv_generalist.pth` | tile-CV generalist (fold 1) | tiles from both maps | held-out fold | 0.8754 |

	Each directory also includes the `config.yaml` used at training time.

	## Classes

	`["River", "Forest", "Lake", "Wetland", "Stream", "Building", "Road"]`, with
	one binary channel per class.
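
Because each class has its own binary channel, a pixel can in principle carry several labels or none. The sketch below looks up the class names set at one pixel of a `(7, H, W)` mask. It assumes that channel order follows the list above, which this card does not state explicitly, and `classes_at` is a hypothetical helper, not part of the repo.

```python
import numpy as np

# Assumption: channel i of the mask corresponds to CLASSES[i].
CLASSES = ["River", "Forest", "Lake", "Wetland", "Stream", "Building", "Road"]

def classes_at(mask, row, col):
    """Return the names of all classes set at one pixel of a (7, H, W) binary mask."""
    return [name for name, on in zip(CLASSES, mask[:, row, col]) if on]

mask = np.zeros((7, 2, 2), dtype=bool)
mask[1, 0, 0] = True  # Forest
mask[6, 0, 0] = True  # Road
print(classes_at(mask, 0, 0))  # ['Forest', 'Road']
print(classes_at(mask, 1, 1))  # []
```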

	## Quick use

	```python
	import torch
	from huggingface_hub import hf_hub_download

	# Pull one checkpoint
	ckpt_path = hf_hub_download(
	    repo_id="Noe-B/historic-map-semantic-segmentation",
	    filename="map2_specialist/map2_specialist.pth",
	)

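	# Load the saved weights (the model definition lives in the GitHub repo)
	ckpt = torch.load(ckpt_path, map_location="cpu")
	state = ckpt["model_state"]

	# With the GitHub repo on your Python path, rebuild the model and load the weights:
	# from src.training.models import get_model
	# model = get_model("unet_cbam", encoder_name="efficientnet-b5")
	# model.load_state_dict(state)
	# model.eval()  # switch to inference mode before predicting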
	```

	For full inference (all 3 checkpoints, ensemble averaging, threshold 0.33),
	see the [`4_submit.py` script in the GitHub repo](https://github.com/VictorPachecoAznar/Comp1_RTCart/blob/main/scripts/pipeline/4_submit.py).
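
The averaging step can be sketched roughly as follows. This is not the code from `4_submit.py`: the helper name `ensemble_mask`, the equal weighting of the three models, and the use of `>=` at the threshold are assumptions made for illustration. Inputs are per-model probability maps, that is, logits after `sigmoid`.

```python
import numpy as np

def ensemble_mask(prob_maps, threshold=0.33):
    """Average per-model probability maps, then threshold.

    prob_maps: sequence of arrays shaped (7, H, W) with values in [0, 1],
               one array per checkpoint.
    Returns a boolean array of shape (7, H, W).
    """
    mean_prob = np.mean(np.stack(prob_maps, axis=0), axis=0)  # equal weights (assumed)
    return mean_prob >= threshold

# Toy example with constant maps: the mean is about 0.433, so every pixel is kept.
maps = [np.full((7, 4, 4), p) for p in (0.1, 0.3, 0.9)]
print(ensemble_mask(maps).all())  # True
```

Averaging probabilities rather than binary masks lets one confident model outvote two uncertain ones, as in the toy example.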

	## Input/output shape

	- Input: RGB tile, `(3, 768, 768)`, ImageNet-normalised (`mean=[0.485, 0.456, 0.406]`, `std=[0.229, 0.224, 0.225]`)
	- Output: logits, `(7, 768, 768)`; apply `sigmoid` then threshold (recommended `0.33`)
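
The input and output conventions above can be sketched in NumPy as shown below. The helper names are hypothetical, and scaling pixel values to `[0, 1]` before normalising is an assumption: it is the usual convention for ImageNet statistics but is not stated in this card. The repo's own data pipeline is the reference.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(tile_hwc_uint8):
    """(768, 768, 3) uint8 RGB tile -> (3, 768, 768) float32, ImageNet-normalised."""
    x = tile_hwc_uint8.astype(np.float32) / 255.0  # assumed scaling to [0, 1]
    x = (x - IMAGENET_MEAN) / IMAGENET_STD         # broadcasts over the channel axis
    return np.transpose(x, (2, 0, 1))              # HWC -> CHW

def logits_to_mask(logits, threshold=0.33):
    """(7, H, W) logits -> boolean mask via sigmoid and threshold."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs >= threshold

tile = np.zeros((768, 768, 3), dtype=np.uint8)
print(preprocess(tile).shape)                    # (3, 768, 768)
print(logits_to_mask(np.zeros((7, 2, 2))).all()) # True, since sigmoid(0) = 0.5
```

A threshold of `0.33` on the probability corresponds to a logit of about `-0.71`, so mildly negative logits still count as positive.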