Historical Map City-Block Vectorisation — 5-fold CV ensemble

EfficientNet-B4 UNet + SCSE attention checkpoints from the 5-fold cross-validation ensemble used for the Research Topics in Cartography (RTCart) 2026 Task 2 competition. Together they constitute the model that scored 0.84690 on the Kaggle leaderboard (score = 0.4 × c-IoU + 0.6 × c-PoLiS).
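The leaderboard formula quoted above can be written as a one-line function. This is a minimal sketch: the name `leaderboard_score` is hypothetical, and both inputs are assumed to be c-IoU and c-PoLiS values already computed elsewhere and scaled to [0, 1].

```python
def leaderboard_score(c_iou: float, c_polis: float) -> float:
    """Weighted score as stated in this card: 0.4 * c-IoU + 0.6 * c-PoLiS."""
    # Assumption: both metrics are already normalised to the [0, 1] range.
    return 0.4 * c_iou + 0.6 * c_polis
```

Because c-PoLiS carries the larger weight, a gain in that metric moves the score 1.5 times as much as the same gain in c-IoU.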

Figure: Historical map of Paris (top) and city-block predictions from this ensemble (bottom).

The corresponding code lives in the NB11/block_vectorization repository; clone it to run inference.

What's here

| Folder | Fold | Val IoU (held-out tiles) |
|---|---|---|
| breezy-yogurt-59/ | 0 | 0.9845 |
| autumn-lake-60/ | 1 | 0.9852 |
| legendary-deluge-61/ | 2 | 0.9870 |
| serene-durian-62/ | 3 | 0.9824 |
| worthy-blaze-63/ | 4 | 0.9826 |

Each folder contains:

  • checkpoint.pth — model state dict (~80 MB)
  • config.yaml — exact training config

The repository root also contains cv5_manifest.json, which maps each fold index to its folder for the inference orchestrator.
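For readers who want to inspect the manifest from Python rather than with jq, the following sketch reads the fold folders. The manifest schema is not documented in this card; the only thing known is that the usage commands query a top-level `folds` key, so the function name `fold_folders` and the handling of both a list and an index-keyed mapping are assumptions.

```python
import json
from pathlib import Path


def fold_folders(manifest_path):
    """Return the entries stored under the manifest's "folds" key, in fold order."""
    manifest = json.loads(Path(manifest_path).read_text())
    folds = manifest["folds"]
    # Assumption: "folds" is either a list of entries or a mapping whose keys
    # are fold indices stored as strings ("0", "1", ...). Both are handled.
    if isinstance(folds, dict):
        return [folds[key] for key in sorted(folds, key=int)]
    return list(folds)
```

If the real manifest uses a different layout, only the body of `fold_folders` needs to change.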

How to use

# 1. Clone the code repo
git clone https://github.com/NB11/block_vectorization.git
cd block_vectorization
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 2. Download these weights into the runs/ tree
mkdir -p runs
for fold in breezy-yogurt-59 autumn-lake-60 legendary-deluge-61 serene-durian-62 worthy-blaze-63; do
  mkdir -p "runs/$fold"
  curl -L "https://huggingface.co/Noe-B/historical-map-city-block-vectorization/resolve/main/$fold/checkpoint.pth" \
       -o "runs/$fold/checkpoint.pth"
  curl -L "https://huggingface.co/Noe-B/historical-map-city-block-vectorization/resolve/main/$fold/config.yaml" \
       -o "runs/$fold/config.yaml"
done
curl -L "https://huggingface.co/Noe-B/historical-map-city-block-vectorization/resolve/main/cv5_manifest.json" \
     -o "runs/cv5_manifest.json"

# 3. Generate the data manifest and run the ensemble (requires the raw maps in data/raw/ and the jq CLI)
python scripts/pipeline/1_preprocess.py
python scripts/pipeline/1b_make_cv_manifest.py --block-size 2 --n-folds 5
python scripts/pipeline/3b_infer_cv.py $(jq -r '.folds[]' runs/cv5_manifest.json)
python scripts/pipeline/5_postprocess.py runs/cv5-breezy-yogurt-59 config/postprocess_cv_optimal.yaml
python scripts/pipeline/6_submit.py      runs/cv5-breezy-yogurt-59 config/postprocess_cv_optimal.yaml
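As an alternative to the curl loop in step 2, the same files can be fetched from Python. This is a sketch, not part of the code repo: it assumes the `huggingface_hub` package is installed (it may not be listed in requirements.txt), and the helper names `weight_files` and `download_weights` are hypothetical. The repo id and file names are taken from the URLs above.

```python
REPO_ID = "Noe-B/historical-map-city-block-vectorization"
FOLDS = [
    "breezy-yogurt-59",
    "autumn-lake-60",
    "legendary-deluge-61",
    "serene-durian-62",
    "worthy-blaze-63",
]


def weight_files():
    """List every file the curl loop fetches, relative to the repo root."""
    files = []
    for fold in FOLDS:
        files.append(f"{fold}/checkpoint.pth")
        files.append(f"{fold}/config.yaml")
    files.append("cv5_manifest.json")
    return files


def download_weights(runs_dir="runs"):
    """Download all files into runs_dir, keeping the repo's folder layout."""
    # Imported lazily so the file list can be inspected without the package.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=runs_dir)
        for name in weight_files()
    ]
```

Calling `download_weights()` from the root of the cloned code repo should leave the files under runs/ in the same layout the curl loop produces.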

Model

  • Architecture: segmentation_models_pytorch UNet decoder with SCSE attention, EfficientNet-B4 encoder (ImageNet-pretrained), 768 × 768 input, 2-channel output (interior + boundary ring).
  • Loss: BceLovász on interior + BCEDice on boundary ring.
  • Training: AdamW with a cosine LR schedule, batch_size 2, early-stopping patience 30, at most 120 epochs. All 5 folds were warm-started from the Stage 2 anchor, and each was fine-tuned at LR 4e-5 on its own training split, with its fold held out for validation.
  • Data: ~190 training tiles per fold, with the remaining ~48 held out for validation, spanning both labelled maps. The 5-fold spatial-block stratified CV (block_size_tiles = 2) keeps spatially adjacent tiles in the same fold, which prevents train/val leakage.
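The idea behind spatial-block CV can be illustrated with a short sketch. This is not the logic of scripts/pipeline/1b_make_cv_manifest.py, which this card does not describe. It assumes tiles are identified by a hypothetical `(map_id, row, col)` grid position, that a block is `block_size_tiles` × `block_size_tiles` tiles, and that "stratified" means each map's blocks are spread across folds, here by simple round-robin.

```python
from collections import defaultdict


def assign_folds(tiles, block_size_tiles=2, n_folds=5):
    """Map each (map_id, row, col) tile to a fold index, one fold per block."""
    # Group tiles into square blocks so that neighbours share a block key.
    blocks = defaultdict(list)
    for map_id, row, col in tiles:
        key = (map_id, row // block_size_tiles, col // block_size_tiles)
        blocks[key].append((map_id, row, col))

    fold_of = {}
    next_fold = defaultdict(int)  # per-map counter: spreads each map's blocks over folds
    for key in sorted(blocks):
        map_id = key[0]
        fold = next_fold[map_id] % n_folds
        next_fold[map_id] += 1
        # Every tile in the block goes to the same fold.
        for tile in blocks[key]:
            fold_of[tile] = fold
    return fold_of
```

Since a whole block lands in one fold, a validation tile never has a training tile from its own block next to it. Tiles on the edge of a block can still border a different fold, so a larger block size trades fold balance for stronger separation.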

Results

| Stage | Approach | Kaggle test |
|---|---|---|
| Single model (Stage 3) | fresh-sunset-57 (boundary head, single-map PP sweep) | 0.840 |
| 5-fold CV ensemble | Same PP as Stage 3 | 0.83686 |
| 5-fold CV + CV-aware PP | this release | 0.84690 |

The CV-aware postprocessing config that produced the final score is committed in the code repo at config/postprocess_cv_optimal.yaml.
