---
license: mit
tags:
- semantic-segmentation
- cartography
- historical-maps
- unet
- cbam
- efficientnet
library_name: pytorch
pipeline_tag: image-segmentation
---

# Historical Map Semantic Segmentation: Ensemble Checkpoints

This repository provides three U-Net + CBAM checkpoints (EfficientNet-B5 encoder),
used together as a 3-way probability-averaging ensemble for 7-class semantic
segmentation of historical cartographic scans. Best Kaggle score: **0.77044**
(`score = 0.6 · mIoU + 0.4 · macro-F1`).

Code: https://github.com/VictorPachecoAznar/Comp1_RTCart

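The score formula quoted above can be reproduced in a few lines of Python. This is a minimal sketch: `competition_score` is a hypothetical helper name, and the example inputs are arbitrary illustrative values, not results from these checkpoints.

```python
def competition_score(miou: float, macro_f1: float) -> float:
    """Weighted score as stated in this card: 0.6 * mIoU + 0.4 * macro-F1."""
    for name, value in (("miou", miou), ("macro_f1", macro_f1)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return 0.6 * miou + 0.4 * macro_f1


# Arbitrary illustrative inputs (not measured values).
example = competition_score(miou=0.70, macro_f1=0.80)
print(f"{example:.4f}")
```

Because mIoU carries the larger weight, a gain in mIoU moves the score 1.5 times as much as the same gain in macro-F1.
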
## Files

| Path | Role | Trained on | Validated on | Val score |
|------|------|------------|--------------|-----------|
| `map2_specialist/map2_specialist.pth` | map2-specialist | map2 only | map1 | 0.7233 |
| `map1_specialist/map1_specialist.pth` | map1-specialist | map1 only | map2 | 0.7029 |
| `tile_cv_generalist/tile_cv_generalist.pth` | tile-CV generalist (fold 1) | tiles from both maps | held-out fold | 0.8754 |

Each directory also includes the `config.yaml` used at training time.

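To fetch every file listed above, the repository paths can be built from the three directory names. The sketch below is a convenience, not part of the GitHub repo: `checkpoint_files` and `download_all` are hypothetical helpers, and the `config.yaml` paths assume each config sits directly inside its checkpoint directory.

```python
REPO_ID = "Noe-B/historic-map-semantic-segmentation"
CHECKPOINT_DIRS = ["map2_specialist", "map1_specialist", "tile_cv_generalist"]


def checkpoint_files(dirs=CHECKPOINT_DIRS):
    """Return (weights_path, config_path) pairs, one per checkpoint directory."""
    # Weights follow the `<dir>/<dir>.pth` pattern from the table above;
    # the `<dir>/config.yaml` location is an ASSUMPTION.
    return [(f"{d}/{d}.pth", f"{d}/config.yaml") for d in dirs]


def download_all(repo_id=REPO_ID):
    """Download all weights and configs; returns a list of local path pairs."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        (
            hf_hub_download(repo_id=repo_id, filename=weights),
            hf_hub_download(repo_id=repo_id, filename=config),
        )
        for weights, config in checkpoint_files()
    ]


if __name__ == "__main__":
    # Only prints the repository paths; call download_all() to fetch them.
    for weights, config in checkpoint_files():
        print(weights, config)
```
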
## Classes

The seven classes are `["River", "Forest", "Lake", "Wetland", "Stream", "Building", "Road"]`,
with one binary channel per class.

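Because each class has its own binary channel, a pixel can in principle be assigned to several classes or to none. A minimal sketch of splitting a thresholded prediction into named masks follows. It assumes the channel order matches the list above (worth checking against the `config.yaml`), and `split_masks` is a hypothetical helper.

```python
import numpy as np

CLASSES = ["River", "Forest", "Lake", "Wetland", "Stream", "Building", "Road"]


def split_masks(binary_pred: np.ndarray) -> dict:
    """Map a (7, H, W) binary prediction to {class_name: (H, W) boolean mask}."""
    if binary_pred.shape[0] != len(CLASSES):
        raise ValueError(
            f"expected {len(CLASSES)} channels, got {binary_pred.shape[0]}"
        )
    # ASSUMPTION: channel i corresponds to CLASSES[i].
    return {name: binary_pred[i].astype(bool) for i, name in enumerate(CLASSES)}


# Toy 7 x 4 x 4 prediction: mark one pixel as both Forest and Road.
pred = np.zeros((7, 4, 4), dtype=np.uint8)
pred[CLASSES.index("Forest"), 0, 0] = 1
pred[CLASSES.index("Road"), 0, 0] = 1
masks = split_masks(pred)
print(sorted(name for name, mask in masks.items() if mask[0, 0]))
```
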
## Quick use

```python
import torch
from huggingface_hub import hf_hub_download

# Pull one checkpoint
ckpt_path = hf_hub_download(
    repo_id="Noe-B/historic-map-semantic-segmentation",
    filename="map2_specialist/map2_specialist.pth",
)

# Load (requires the model definition from the GitHub repo)
ckpt = torch.load(ckpt_path, map_location="cpu")
state = ckpt["model_state"]
# from src.training.models import get_model
# model = get_model("unet_cbam", encoder_name="efficientnet-b5")
# model.load_state_dict(state)
```

For full inference (all 3 checkpoints, ensemble averaging, threshold 0.33),
see the [`4_submit.py` script in the GitHub repo](https://github.com/VictorPachecoAznar/Comp1_RTCart/blob/main/scripts/pipeline/4_submit.py).

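The ensemble step described above (sigmoid per model, average the probabilities, threshold at 0.33) can be sketched in NumPy. This is not the code from `4_submit.py`: `ensemble_predict` is a hypothetical helper, equal weighting of the models and the `>=` comparison are assumptions, and the logits are assumed to have been converted to NumPy arrays already (for example with `tensor.cpu().numpy()`).

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def ensemble_predict(logits_per_model, threshold: float = 0.33) -> np.ndarray:
    """Average per-model probabilities, then binarise.

    logits_per_model: sequence of arrays, each shaped (7, H, W).
    Returns a uint8 array of shape (7, H, W) with values in {0, 1}.
    """
    probs = np.stack(
        [sigmoid(np.asarray(logits, dtype=np.float64)) for logits in logits_per_model]
    )
    mean_prob = probs.mean(axis=0)  # unweighted mean over models (ASSUMPTION)
    return (mean_prob >= threshold).astype(np.uint8)


# Toy example: three fake "models" on a 7 x 2 x 2 tile.
rng = np.random.default_rng(0)
fake_logits = [rng.normal(size=(7, 2, 2)) for _ in range(3)]
mask = ensemble_predict(fake_logits)
print(mask.shape, mask.dtype)
```

Averaging probabilities rather than logits keeps each model's vote bounded in `[0, 1]`, so a single very confident model cannot dominate the mean.
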
## Input/output shape

- **Input:** RGB tile, `(3, 768, 768)`, ImageNet-normalised (`mean=[0.485, 0.456, 0.406]`, `std=[0.229, 0.224, 0.225]`)
- **Output:** logits, `(7, 768, 768)`; apply `sigmoid` then threshold (recommended `0.33`)

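The input specification above can be turned into a small preprocessing step. The sketch assumes the tile arrives as an `(H, W, 3)` `uint8` RGB array (for example from PIL) and that pixel values are scaled to `[0, 1]` before normalisation. That scaling is the usual ImageNet convention but is not stated in this card. `preprocess_tile` is a hypothetical helper.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_tile(rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) uint8 RGB tile to a normalised (3, H, W) float32 array."""
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError(f"expected (H, W, 3), got {rgb.shape}")
    scaled = rgb.astype(np.float32) / 255.0  # ASSUMPTION: scale to [0, 1] first
    normalised = (scaled - IMAGENET_MEAN) / IMAGENET_STD  # broadcasts over channels
    return np.transpose(normalised, (2, 0, 1))  # HWC -> CHW


tile = np.zeros((768, 768, 3), dtype=np.uint8)  # placeholder black tile
x = preprocess_tile(tile)
print(x.shape, x.dtype)
```

To feed the result to a PyTorch model, `torch.from_numpy(x)[None]` adds the batch dimension.
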