---
license: mit
tags:
  - semantic-segmentation
  - cartography
  - historical-maps
  - unet
  - cbam
  - efficientnet
library_name: pytorch
pipeline_tag: image-segmentation
---

# Historical Map Semantic Segmentation — Ensemble Checkpoints

![Input RGB strip and our ensemble's predictions on a region of map2](prediction_strip.png)

This repository holds three U-Net + CBAM checkpoints (EfficientNet-B5 encoder).
Together they form a 3-way probability-averaging ensemble for 7-class semantic
segmentation of historical cartographic scans. Best Kaggle score: **0.77044**,
where `score = 0.6 · mIoU + 0.4 · macro-F1`.
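
The score formula can be sketched in NumPy as below. This is only an illustration: the function name `combined_score` is hypothetical, and the way the competition aggregates IoU and F1 (per image or over the whole test set, and how it treats classes absent from both prediction and ground truth) is not stated in this card, so the per-class averaging here is an assumption.

```python
import numpy as np

def combined_score(pred, target, eps=1e-7):
    """Return 0.6 * mIoU + 0.4 * macro-F1 for binary masks of shape (C, ...).

    ASSUMPTION: IoU and F1 are computed per class over all pixels passed in,
    then averaged with equal class weights. A class that is empty in both
    pred and target contributes 0 here; the official metric may differ.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    axes = tuple(range(1, pred.ndim))  # reduce over everything except the class axis

    tp = np.logical_and(pred, target).sum(axis=axes)
    fp = np.logical_and(pred, ~target).sum(axis=axes)
    fn = np.logical_and(~pred, target).sum(axis=axes)

    iou = tp / (tp + fp + fn + eps)          # per-class intersection over union
    f1 = 2 * tp / (2 * tp + fp + fn + eps)   # per-class F1 (Dice)
    return float(0.6 * iou.mean() + 0.4 * f1.mean())
```

For a single class with one true positive, one false positive and one false negative, IoU is 1/3 and F1 is 1/2, which gives a combined score of 0.4.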

Code: https://github.com/VictorPachecoAznar/Comp1_RTCart

## Files

| Path | Role | Trained on | Validated on | Val score |
|------|------|------------|--------------|-----------|
| `map2_specialist/map2_specialist.pth` | map2-specialist | map2 only | map1 | 0.7233 |
| `map1_specialist/map1_specialist.pth` | map1-specialist | map1 only | map2 | 0.7029 |
| `tile_cv_generalist/tile_cv_generalist.pth` | tile-CV generalist (fold 1) | tiles from both maps | held-out fold | 0.8754 |

Each directory also includes the `config.yaml` used at training time.

## Classes

`["River", "Forest", "Lake", "Wetland", "Stream", "Building", "Road"]` —
one binary channel per class.
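
As a small illustration of the channel layout, the sketch below reports how much of a tile each class covers. It assumes that channel `i` of the mask corresponds to the `i`-th name in the list above; the helper name `class_coverage` is hypothetical and not part of the GitHub repo.

```python
import numpy as np

# Class names as listed above; channel i is ASSUMED to correspond to CLASSES[i].
CLASSES = ["River", "Forest", "Lake", "Wetland", "Stream", "Building", "Road"]

def class_coverage(mask):
    """Return {class name: fraction of pixels set} for a (7, H, W) binary mask."""
    mask = np.asarray(mask, dtype=bool)
    if mask.shape[0] != len(CLASSES):
        raise ValueError(f"expected {len(CLASSES)} channels, got {mask.shape[0]}")
    return {name: float(channel.mean()) for name, channel in zip(CLASSES, mask)}
```

Because each class has its own binary channel, a pixel can be set in more than one channel, and the fractions need not sum to 1.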

## Quick use

```python
import torch
from huggingface_hub import hf_hub_download

# Pull one checkpoint
ckpt_path = hf_hub_download(
    repo_id="Noe-B/historic-map-semantic-segmentation",
    filename="map2_specialist/map2_specialist.pth",
)

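# Load the weights (the model definition lives in the GitHub repo).
# Note: PyTorch >= 2.6 defaults to weights_only=True; if loading fails because the
# checkpoint holds non-tensor objects, pass weights_only=False (trusted files only).
ckpt  = torch.load(ckpt_path, map_location="cpu")
state = ckpt["model_state"]
# from src.training.models import get_model
# model = get_model("unet_cbam", encoder_name="efficientnet-b5")
# model.load_state_dict(state)
# model.eval()  # switch BatchNorm/Dropout to inference behaviour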
```

For full inference (all 3 checkpoints, ensemble averaging, threshold 0.33),
see the [`4_submit.py` script in the GitHub repo](https://github.com/VictorPachecoAznar/Comp1_RTCart/blob/main/scripts/pipeline/4_submit.py).
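
The averaging step itself is simple enough to sketch in NumPy. The version below is a minimal illustration and not the code from `4_submit.py`: the function names are hypothetical, each model is assumed to have already produced logits of the same shape for the same tile, and whether the real script uses `>=` or `>` at the threshold is an assumption.

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def ensemble_predict(logits_per_model, threshold=0.33):
    """Average per-model probabilities, then threshold.

    logits_per_model: sequence of arrays with identical shape, e.g. (7, H, W),
    one per checkpoint. Returns (mean_probabilities, boolean_mask).
    """
    probs = [sigmoid(np.asarray(l, dtype=np.float64)) for l in logits_per_model]
    mean_probs = np.mean(probs, axis=0)   # probability averaging, equal weights
    return mean_probs, mean_probs >= threshold  # ASSUMPTION: inclusive threshold
```

Averaging probabilities rather than logits means a single very confident model cannot dominate as strongly, since each sigmoid output is bounded in [0, 1].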

## Input/output shape

- **Input:** RGB tile, `(3, 768, 768)`, ImageNet-normalised (`mean=[0.485, 0.456, 0.406]`, `std=[0.229, 0.224, 0.225]`)
- **Output:** logits, `(7, 768, 768)`; apply `sigmoid` then threshold (recommended `0.33`)
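
The input convention above can be sketched as a small preprocessing helper. This is an illustration under stated assumptions: the tile arrives as a `(768, 768, 3)` `uint8` RGB array, and pixel values are scaled to `[0, 1]` before the ImageNet statistics are applied (the usual convention, but not stated in this card). The name `preprocess_tile` is hypothetical.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess_tile(tile_rgb_uint8):
    """Convert an (H, W, 3) uint8 RGB tile to a normalised (3, H, W) float32 array."""
    tile = np.asarray(tile_rgb_uint8)
    if tile.ndim != 3 or tile.shape[2] != 3:
        raise ValueError(f"expected an (H, W, 3) RGB tile, got shape {tile.shape}")
    x = tile.astype(np.float32) / 255.0        # ASSUMPTION: scale to [0, 1] first
    x = (x - IMAGENET_MEAN) / IMAGENET_STD     # broadcast over the channel axis
    return np.ascontiguousarray(x.transpose(2, 0, 1))  # HWC -> CHW
```

To feed the result to the model, add a batch dimension and convert to a tensor, for example `torch.from_numpy(x)[None]`.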