ResilientMap: Stage 1 (Pixel-Gate Fusion), nuScenes old-split
Checkpoints and training artifacts for Stage 1 of the ResilientMap project: camera + satellite BEV fusion via a confidence-weighted pixel gate, trained on the nuScenes old split (full data).
This is the BEV-feature pre-training stage. The vectorized HD map decoder (MMQuery, Stage 2/3) is trained on top of these BEVs and is released separately.
Files
| File | Size | Notes |
|---|---|---|
| iter_55932.pth | 897 MB | Final checkpoint (latest, recommended) |
| iter_37288.pth | 897 MB | Intermediate checkpoint (4th of 6 saves) |
| satmaptracker_pixelgate_stage1_full.py | 23 KB | Full training/eval config |
| 20260417_174220.log | 862 KB | Training log (text) |
| 20260417_174220.log.json | 857 KB | Training log (JSON, per-iter metrics) |
| tf_logs/ | - | TensorBoard event files |
The other intermediate checkpoints (iter_9322, iter_18644, iter_27966, iter_46610) were not uploaded, to keep the repo lean. The latest.pth symlink in the original work-dir pointed to iter_55932.pth.
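A minimal sketch for pulling one metric out of the JSON training log. It assumes the file holds one JSON object per line, and that training records carry an "iter" key and a "loss" key, as MMCV-style text loggers usually write them. The README only states "JSON, per-iter metrics", so check the key names against the real file. The function name `read_metric` and the sample lines are illustrative, not taken from the log.

```python
import json


def read_metric(lines, key="loss"):
    """Collect (iter, value) pairs for `key` from JSON-lines log records."""
    points = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        # Skip records that lack the metric, e.g. a leading environment header.
        if key in record and "iter" in record:
            points.append((record["iter"], record[key]))
    return points


# Synthetic stand-in for a few lines of 20260417_174220.log.json.
sample = [
    '{"env_info": "placeholder"}',
    '{"mode": "train", "iter": 50, "loss": 2.5}',
    '{"mode": "train", "iter": 100, "loss": 2.0}',
]
print(read_metric(sample))  # [(50, 2.5), (100, 2.0)]
```

For the real log, pass an open file handle instead of the sample list, since iterating a text file yields its lines.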
Architecture (one paragraph)
A BEVFormer (ResNet-50 + FPN + TemporalSelfAttention) camera branch produces a
(256, 50, 100) BEV. A separate ResNet-50 + FPN satellite encoder produces an
ego-aligned satellite BEV at the same shape. Two frozen segmentation decoders
output per-pixel confidences for each modality, and the BEVs are combined as:
g(x,y) = cam_conf(x,y) / (cam_conf(x,y) + sat_conf(x,y) + eps)
fused_bev = g * cam_bev + (1 - g) * sat_bev
This residual-free, confidence-weighted gate lets the fused stream fall back cleanly, pixel by pixel, to whichever modality is more reliable.
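The gate formula can be sketched in a few lines of NumPy. This is an illustration of the two equations, not the project's training code. The function name `pixel_gate_fuse`, the `eps` value, and the (H, W) shape of the confidence maps are assumptions. The (256, 50, 100) BEV shape comes from the paragraph above.

```python
import numpy as np


def pixel_gate_fuse(cam_bev, sat_bev, cam_conf, sat_conf, eps=1e-6):
    """Fuse two (C, H, W) BEVs with a per-pixel confidence gate.

    cam_conf and sat_conf are (H, W) maps of non-negative confidences.
    eps is an assumed value; the source does not state it.
    """
    g = cam_conf / (cam_conf + sat_conf + eps)  # (H, W), camera weight
    # g broadcasts over the channel axis of the (C, H, W) BEVs.
    return g * cam_bev + (1.0 - g) * sat_bev


rng = np.random.default_rng(0)
cam_bev = rng.normal(size=(256, 50, 100))
sat_bev = rng.normal(size=(256, 50, 100))
cam_conf = np.zeros((50, 100))  # camera branch reports no confidence
sat_conf = np.ones((50, 100))

fused = pixel_gate_fuse(cam_bev, sat_bev, cam_conf, sat_conf)
print(fused.shape, np.allclose(fused, sat_bev))
```

With the camera confidence at zero everywhere, g is zero and the fused BEV equals the satellite BEV, which is the fallback behaviour described above. Note that with the formula as written, a pixel where both confidences are zero also gets g = 0 and so takes the satellite feature.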
Headline numbers (nuScenes new-split, 1/3 data, BEV mIoU)
These numbers come from the new-split, 1/3-data run. This checkpoint is its old-split, full-data counterpart, retrained with the same recipe.
| Condition | Fused | Cam-only | Sat-only |
|---|---|---|---|
| Clean | 0.512 | 0.452 | 0.403 |
| 6 cameras zeroed | 0.381 | 0.013 | 0.403 |
| Satellite zeroed | 0.436 | 0.452 | 0.000 |
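To make the robustness comparison easier to read, the short script below computes how much of its clean-condition mIoU each model keeps when a modality is zeroed. The values are copied from the table. The `retention` helper and the dictionary keys are illustrative names only.

```python
# BEV mIoU values copied from the table above.
miou = {
    "clean": {"fused": 0.512, "cam": 0.452, "sat": 0.403},
    "cams_zeroed": {"fused": 0.381, "cam": 0.013, "sat": 0.403},
    "sat_zeroed": {"fused": 0.436, "cam": 0.452, "sat": 0.000},
}


def retention(condition, model):
    """Fraction of a model's clean mIoU that survives a degraded condition."""
    return miou[condition][model] / miou["clean"][model]


for condition in ("cams_zeroed", "sat_zeroed"):
    for model in ("fused", "cam", "sat"):
        print(f"{condition:12s} {model:5s} {retention(condition, model):.1%}")
```

By this measure the fused model keeps roughly 74% of its clean mIoU with all six cameras zeroed and roughly 85% with the satellite input zeroed, while each single-modality model collapses when its own input is removed. In both degraded conditions the fused score still sits slightly below the surviving single-modality baseline (0.381 vs 0.403, and 0.436 vs 0.452).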
Per-camera importance (change in BEV mIoU when a single camera is zeroed):
FRONT -0.058 >> BACK -0.027 > FRONT_LEFT/RIGHT -0.018/-0.015 > BACK_LEFT/RIGHT -0.006.
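As a rough consistency check, the single-camera changes can be summed and compared with the drop seen when all six cameras are zeroed at once. This sketch assumes the per-camera figures are changes in fused BEV mIoU relative to the 0.512 clean score, and that BACK_LEFT and BACK_RIGHT are each -0.006. Neither is stated explicitly above.

```python
# Single-camera mIoU changes quoted above. BACK_LEFT and BACK_RIGHT are both
# taken as -0.006, an assumed reading of "BACK_LEFT/RIGHT -0.006".
single_drop = {
    "FRONT": -0.058,
    "BACK": -0.027,
    "FRONT_LEFT": -0.018,
    "FRONT_RIGHT": -0.015,
    "BACK_LEFT": -0.006,
    "BACK_RIGHT": -0.006,
}

sum_of_singles = sum(single_drop.values())
all_six = 0.381 - 0.512  # fused mIoU, six cameras zeroed minus clean

print(f"sum of single-camera changes: {sum_of_singles:.3f}")
print(f"all six cameras zeroed:       {all_six:.3f}")
```

Under those assumptions the two figures are nearly equal (about -0.130 and -0.131), which would suggest the per-camera losses are close to additive in this evaluation. Treat this as an observation on the reported numbers, not a measured property of the model.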
Intended use
Research checkpoint for online HD map construction with multi-modal sensor redundancy. Not production-ready.
Citation
Citation entry will be added when the accompanying paper is released.