ResilientMap: Stage 1 (Pixel-Gate Fusion), nuScenes old-split

Checkpoints and training artifacts for Stage 1 of the ResilientMap project: camera + satellite BEV fusion via a confidence-weighted pixel gate, trained on the nuScenes old split (full data).

This is the BEV-feature pre-training stage. The vectorized HD map decoder (MMQuery, Stage 2/3) is trained on top of these BEVs and is released separately.

Files

File                                      Size     Notes
iter_55932.pth                            897 MB   Final checkpoint (latest, recommended)
iter_37288.pth                            897 MB   Intermediate checkpoint (4th of 6 saves)
satmaptracker_pixelgate_stage1_full.py    23 KB    Full training/eval config
20260417_174220.log                       862 KB   Training log (text)
20260417_174220.log.json                  857 KB   Training log (JSON, per-iter metrics)
tf_logs/                                  -        TensorBoard event files

The other intermediate checkpoints (iter_9322, iter_18644, iter_27966, iter_46610) were not uploaded, to keep the repo lean. The latest.pth symlink in the original work-dir pointed to iter_55932.pth.
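As a rough illustration of how the per-iter JSON log could be summarized, the sketch below averages a loss value per logging mode. It assumes the file follows the MMCV-style JSON-lines layout (one JSON object per line, with `mode` and `loss` keys, plus a header record without them); that layout is an assumption and is not confirmed by this card. The helper name `mean_loss_by_mode` and the sample records are hypothetical, not taken from the actual log.

```python
import json
from collections import defaultdict


def mean_loss_by_mode(lines):
    """Average the `loss` field per `mode` over JSON-lines log records.

    Assumes one JSON object per line (MMCV-style layout, unverified here).
    Records without both `mode` and `loss` (e.g. a header line) are skipped.
    """
    totals = defaultdict(lambda: [0.0, 0])  # mode -> [sum of losses, count]
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "mode" not in record or "loss" not in record:
            continue
        acc = totals[record["mode"]]
        acc[0] += float(record["loss"])
        acc[1] += 1
    return {mode: total / count for mode, (total, count) in totals.items()}


# Made-up records for illustration only; these are not values from the real log.
sample = [
    '{"env_info": "placeholder", "seed": 0}',
    '{"mode": "train", "iter": 50, "loss": 2.0}',
    '{"mode": "train", "iter": 100, "loss": 1.0}',
]
summary = mean_loss_by_mode(sample)
print(summary)
```

To run it on the released log, pass an open file handle for 20260417_174220.log.json in place of `sample`; if the key names differ from the assumed ones, adjust the two lookups accordingly.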

Architecture (one paragraph)

A BEVFormer camera branch (ResNet-50 + FPN + TemporalSelfAttention) produces a BEV feature map of shape (256, 50, 100). A separate ResNet-50 + FPN satellite encoder produces an ego-aligned satellite BEV of the same shape. Two frozen segmentation decoders output per-pixel confidences, one per modality, and the two BEVs are combined as

g(x,y)    = cam_conf(x,y) / (cam_conf(x,y) + sat_conf(x,y) + eps)
fused_bev = g * cam_bev + (1 - g) * sat_bev

This residual-free, confidence-weighted gate gives the fused stream a clean fall-back to whichever modality is more reliable per pixel.
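The gate above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's implementation: the BEVs are assumed to be laid out as (channels, height, width), the confidences as (height, width) maps in [0, 1], and `eps=1e-6` is a placeholder value. The function name `pixel_gate_fuse` and the random inputs are hypothetical.

```python
import numpy as np


def pixel_gate_fuse(cam_bev, sat_bev, cam_conf, sat_conf, eps=1e-6):
    """Fuse two (C, H, W) BEVs with a per-pixel confidence gate.

    cam_conf and sat_conf are (H, W) maps; eps is an assumed small constant
    that keeps the division defined where both confidences are zero.
    """
    # g is (H, W) and broadcasts over the leading channel axis.
    g = cam_conf / (cam_conf + sat_conf + eps)
    fused = g * cam_bev + (1.0 - g) * sat_bev
    return fused, g


rng = np.random.default_rng(0)
cam_bev = rng.standard_normal((256, 50, 100)).astype(np.float32)
sat_bev = rng.standard_normal((256, 50, 100)).astype(np.float32)
cam_conf = rng.uniform(0.0, 1.0, (50, 100)).astype(np.float32)
sat_conf = rng.uniform(0.0, 1.0, (50, 100)).astype(np.float32)

fused, gate = pixel_gate_fuse(cam_bev, sat_bev, cam_conf, sat_conf)
print(fused.shape, gate.shape)

# Fall-back behaviour: with zero camera confidence the gate is 0 everywhere,
# so the fused BEV reduces to the satellite BEV.
no_cam_conf = np.zeros((50, 100), dtype=np.float32)
sat_only, gate_zero = pixel_gate_fuse(cam_bev, sat_bev, no_cam_conf, sat_conf)
print(np.allclose(sat_only, sat_bev))
```

Because the gate is a convex weight in [0, 1], every fused pixel lies between the two modality features for that pixel. Note that in this sketch a pixel where both confidences are zero gets g = 0 and therefore takes the satellite feature.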

Headline numbers (BEV mIoU, nuScenes new-split, 1/3 data)

These figures come from the new-split, 1/3-data run. The checkpoint released here is its old-split, full-data counterpart, retrained with the same recipe.

Condition           Fused    Cam-only    Sat-only
Clean               0.512    0.452       0.403
6 cameras zeroed    0.381    0.013       0.403
Satellite zeroed    0.436    0.452       0.000

Per-camera importance (impact when a single camera is zeroed): FRONT -0.058 >> BACK -0.027 > FRONT_LEFT/RIGHT -0.018/-0.015 > BACK_LEFT/RIGHT -0.006.
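To make the redundancy argument concrete, the short sketch below recomputes the mIoU change from the clean condition for each stream, using only the values in the table above. The dictionary layout and the helper name `drop_from_clean` are illustrative choices, not part of the project.

```python
# BEV mIoU values copied from the table above (new-split, 1/3 data).
miou = {
    "clean": {"fused": 0.512, "cam": 0.452, "sat": 0.403},
    "cameras_zeroed": {"fused": 0.381, "cam": 0.013, "sat": 0.403},
    "satellite_zeroed": {"fused": 0.436, "cam": 0.452, "sat": 0.000},
}


def drop_from_clean(condition, stream):
    """mIoU change of one stream relative to its own clean score."""
    return round(miou[condition][stream] - miou["clean"][stream], 3)


def retained_fraction(condition, stream):
    """Share of the clean mIoU that a stream keeps under a condition."""
    return miou[condition][stream] / miou["clean"][stream]


for condition in ("cameras_zeroed", "satellite_zeroed"):
    for stream in ("fused", "cam", "sat"):
        delta = drop_from_clean(condition, stream)
        kept = retained_fraction(condition, stream)
        print(f"{condition:17s} {stream:5s} delta={delta:+.3f} kept={kept:.1%}")

# Gain of the fused stream over each single modality on clean input.
gain_over_cam = round(miou["clean"]["fused"] - miou["clean"]["cam"], 3)
gain_over_sat = round(miou["clean"]["fused"] - miou["clean"]["sat"], 3)
print(gain_over_cam, gain_over_sat)
```

Read this way, the fused stream loses 0.131 mIoU when all six cameras are zeroed and 0.076 when the satellite input is zeroed, while the camera-only and satellite-only streams collapse to near zero when their own input is removed.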

Intended use

Research checkpoint for online HD map construction with multi-modal sensor redundancy. Not production-ready.

Citation

Citation entry will be added when the accompanying paper is released.
