Historical Map Semantic Segmentation: Ensemble Checkpoints

Input RGB strip and our ensemble's predictions on a region of map2

Three U-Net + CBAM (EfficientNet-B5 encoder) checkpoints used as a 3-way probability-averaging ensemble for 7-class semantic segmentation of historical cartographic scans. Best Kaggle score: 0.77044 (score = 0.6 · mIoU + 0.4 · macro-F1).
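As a rough illustration of the score formula above, the sketch below computes per-class IoU and F1 from binary masks and combines them with the 0.6 / 0.4 weights. The function name `competition_score`, the pixel-level aggregation, and the `eps` smoothing are assumptions made for this example; the exact Kaggle implementation may differ, for instance in how empty classes are handled.

```python
import numpy as np

def competition_score(pred, target, eps=1e-7):
    """Hypothetical helper: 0.6 * mIoU + 0.4 * macro-F1.

    pred, target: binary arrays of shape (C, H, W), one channel per class.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)

    # Per-class pixel counts, summed over the spatial axes.
    tp = np.logical_and(pred, target).sum(axis=(1, 2)).astype(float)
    fp = np.logical_and(pred, ~target).sum(axis=(1, 2)).astype(float)
    fn = np.logical_and(~pred, target).sum(axis=(1, 2)).astype(float)

    # Assumption: eps only guards against division by zero, so a class
    # that is empty in both pred and target scores 0 here.
    iou = tp / (tp + fp + fn + eps)
    f1 = 2 * tp / (2 * tp + fp + fn + eps)

    # Unweighted mean over classes gives mIoU and macro-F1.
    return 0.6 * iou.mean() + 0.4 * f1.mean()
```

Because both terms are averaged over classes without weighting, rare classes count as much as common ones.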

Code: https://github.com/VictorPachecoAznar/Comp1_RTCart

Files

| Path | Role | Trained on | Validated on | Val score |
|---|---|---|---|---|
| map2_specialist/map2_specialist.pth | map2-specialist | map2 only | map1 | 0.7233 |
| map1_specialist/map1_specialist.pth | map1-specialist | map1 only | map2 | 0.7029 |
| tile_cv_generalist/tile_cv_generalist.pth | tile-CV generalist (fold 1) | tiles from both maps | held-out fold | 0.8754 |

Each directory also includes the config.yaml used at training time.

Classes

["River", "Forest", "Lake", "Wetland", "Stream", "Building", "Road"], with one binary channel per class.

Quick use

import torch
from huggingface_hub import hf_hub_download

# Pull one checkpoint
ckpt_path = hf_hub_download(
    repo_id="Noe-B/historic-map-semantic-segmentation",
    filename="map2_specialist/map2_specialist.pth",
)

# Load (requires the model definition from the GitHub repo)
ckpt  = torch.load(ckpt_path, map_location="cpu")
state = ckpt["model_state"]
# from src.training.models import get_model
# model = get_model("unet_cbam", encoder_name="efficientnet-b5")
# model.load_state_dict(state)

For full inference (all 3 checkpoints, ensemble averaging, threshold 0.33), see the 4_submit.py script in the GitHub repo.
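The ensemble step can be sketched as follows, assuming each checkpoint has already produced a logits array for the same tile. This is not the code from 4_submit.py: the names `ensemble_masks` and `logits_per_model` are hypothetical, NumPy stands in for PyTorch to keep the example self-contained, and using `>=` at the threshold is an assumption.

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def ensemble_masks(logits_per_model, threshold=0.33):
    """Average per-model probabilities, then threshold.

    logits_per_model: list of arrays shaped (7, H, W), one per checkpoint.
    Returns a boolean array shaped (7, H, W).
    """
    # Convert each model's logits to probabilities first, so the
    # average is taken over probabilities rather than over logits.
    probs = np.stack([sigmoid(l) for l in logits_per_model], axis=0)
    mean_probs = probs.mean(axis=0)

    # Assumption: a pixel is positive when the mean probability reaches
    # the threshold.
    return mean_probs >= threshold
```

With a threshold below 0.5, a pixel can be marked positive even when only one or two of the three models are confident, which favours recall on thin or rare structures.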

Input/output shape

  • Input: RGB tile, (3, 768, 768), ImageNet-normalised (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
  • Output: logits, (7, 768, 768); apply sigmoid then threshold (recommended 0.33)
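
The input and output conventions above can be sketched for a single model as shown below. The helper names `preprocess_tile` and `logits_to_class_masks` are hypothetical, and scaling pixel values to [0, 1] before normalisation is an assumption (the usual ImageNet convention); check the data pipeline in the GitHub repo before relying on it.

```python
import numpy as np

CLASSES = ["River", "Forest", "Lake", "Wetland", "Stream", "Building", "Road"]
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess_tile(tile_uint8):
    """Turn an (H, W, 3) uint8 RGB tile into a (3, H, W) float32 array."""
    # Assumption: pixels are scaled to [0, 1] before normalisation.
    x = tile_uint8.astype(np.float32) / 255.0
    x = (x - IMAGENET_MEAN) / IMAGENET_STD  # broadcasts over the channel axis
    return np.transpose(x, (2, 0, 1))       # channels-last to channels-first

def logits_to_class_masks(logits, threshold=0.33):
    """Map (7, H, W) logits to a dict of per-class boolean masks."""
    probs = 1.0 / (1.0 + np.exp(-logits))   # sigmoid, independent per channel
    masks = probs >= threshold
    return {name: masks[i] for i, name in enumerate(CLASSES)}
```

Since each channel is thresholded independently, a pixel may belong to several classes or to none.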