G2G: Exploiting Intra-Group Geometry for Inter-Group Pose Estimation
🌐 Project Page | 📄 arXiv | 💻 Code (GitHub)
Recovering the relative 6-DoF pose between two image groups underlies cross-sequence relocalization, multi-camera rig odometry, and other multi-view tasks. Each group carries known intra-group geometry from a pre-built map, odometry, or rig calibration, and pretrained multi-view backbones already fuse such geometry into visual features. Yet current models treat all views as an unstructured set, leaving cross-group reasoning as the missing piece.
G2G keeps a multi-view foundation model entirely frozen and adds three lightweight trainable modules (32M parameters, under 6% of the full model) to bridge the two groups: a perceiver resampler, a cross-group bridge with merged self-attention, and a multi-frame pose head. Supervised only by relative poses, G2G attains state-of-the-art accuracy on both tasks across four datasets.
This repository hosts the released artifacts for the paper. Code, installation, and full usage live on GitHub: https://github.com/WeiYuFei0217/G2G
Contents
| Path | Description |
|---|---|
| release_weights/*.pth | 10 pretrained G2G-only weights (frozen backbone excluded; each ~123 MB) |
| map-anything-model/ | Frozen MapAnything backbone (DINOv2-large/1024, ~2.1 GB) |
| examples.zip | Sanity-check input bundles (reloc/ + rig/) |
| eval_results.zip | Paper-subset per-pair evaluation CSVs |
MapAnything backbone (mirrored here). G2G runs on a frozen DINOv2-large / 1024-dim MapAnything backbone (MapAnything v1.0.1, commit fde8425). The exact compatible checkpoint (~2.1 GB) is included under map-anything-model/, because the current facebook/map-anything Hugging Face weights are the newer giant / 1536-dim variant and are incompatible with these G2G modules. MapAnything is a work by Meta AI (facebookresearch/map-anything); please also respect its original license when using this backbone.
Pretrained weights
| Weight | Task | Dataset |
|---|---|---|
| HM3D-Reloc.pth | Reloc | HM3D |
| TartanGround-Reloc.pth | Reloc | TartanGround |
| NCLT-Reloc.pth | Reloc | NCLT |
| ZJH-Reloc.pth | Reloc | ZJH |
| HM3D-Rig-8.pth | Rig | HM3D (8-cam) |
| HM3D-Rig-4.pth | Rig | HM3D (4-cam) |
| TartanGround-Rig-4.pth | Rig | TartanGround (4-cam) |
| NCLT-Rig-Intra.pth | Rig | NCLT intra-season (5-cam) |
| NCLT-Rig-Cross.pth | Rig | NCLT cross-season (5-cam) |
| ZJH-Rig-4.pth | Rig | ZJH (4-cam) |
These are G2G-only weights (frozen backbone excluded). The evaluation scripts in the GitHub repo automatically handle partial loading.
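Partial loading here means that the checkpoint supplies only the trainable G2G modules, while the frozen backbone parameters come from elsewhere. The sketch below illustrates the idea on plain key lists. It is not the repository's loader: the function name `plan_partial_load`, the `backbone.` prefix, and all parameter names are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Tuple


def plan_partial_load(
    model_keys: Iterable[str],
    ckpt_keys: Iterable[str],
    frozen_prefix: str = "backbone.",  # hypothetical prefix for backbone params
) -> Tuple[List[str], List[str], List[str]]:
    """Classify parameter names before a non-strict checkpoint load.

    Returns (loadable, missing_trainable, unexpected):
      loadable          - keys present in both the model and the checkpoint
      missing_trainable - non-backbone model keys absent from the checkpoint
      unexpected        - checkpoint keys the model does not define
    """
    model_set, ckpt_set = set(model_keys), set(ckpt_keys)
    loadable = sorted(model_set & ckpt_set)
    missing_trainable = sorted(
        k for k in model_set - ckpt_set if not k.startswith(frozen_prefix)
    )
    unexpected = sorted(ckpt_set - model_set)
    return loadable, missing_trainable, unexpected


# Toy parameter names (assumptions, not the real G2G state-dict keys).
model_keys = [
    "backbone.encoder.weight",
    "resampler.latents",
    "bridge.attn.weight",
    "pose_head.fc.weight",
]
ckpt_keys = ["resampler.latents", "bridge.attn.weight", "pose_head.fc.weight"]

loadable, missing_trainable, unexpected = plan_partial_load(model_keys, ckpt_keys)
print(loadable)           # trainable modules found in the checkpoint
print(missing_trainable)  # should be empty for a complete G2G-only checkpoint
print(unexpected)         # should be empty if the checkpoint matches the model
```

In PyTorch, the same effect is typically obtained with `model.load_state_dict(state_dict, strict=False)`, which reports `missing_keys` and `unexpected_keys`; checking that every missing key belongs to the backbone is a cheap guard against silently loading the wrong checkpoint.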
Download
Grab a single weight:
from huggingface_hub import hf_hub_download
ckpt = hf_hub_download(
    repo_id="feixue22/G2G",
    filename="release_weights/HM3D-Reloc.pth",
)
print(ckpt)
Or pull everything (weights + example/eval bundles):
from huggingface_hub import snapshot_download
local_dir = snapshot_download(repo_id="feixue22/G2G")
print(local_dir)
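A middle ground between the two options above is to fetch only a subset, for example the Reloc weights plus the backbone. A minimal sketch follows, assuming the repository layout matches the Contents table. `snapshot_download` accepts an `allow_patterns` argument with glob-style patterns; the helper names and the example file listing (in particular the file name inside `map-anything-model/`) are hypothetical.

```python
from fnmatch import fnmatch
from typing import Iterable, List

# Glob patterns for the subset to fetch (layout taken from the Contents table).
SUBSET_PATTERNS = ["release_weights/*-Reloc.pth", "map-anything-model/*"]


def select_files(repo_files: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Preview which repo files a set of glob patterns would keep."""
    patterns = list(patterns)
    return [f for f in repo_files if any(fnmatch(f, p) for p in patterns)]


def download_subset(repo_id: str = "feixue22/G2G",
                    patterns: Iterable[str] = SUBSET_PATTERNS) -> str:
    """Download only the files matching `patterns`; returns the local directory."""
    # Imported lazily so the preview helper works without huggingface_hub.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, allow_patterns=list(patterns))


# Hypothetical listing used only to preview the pattern matching offline.
listing = [
    "release_weights/HM3D-Reloc.pth",
    "release_weights/HM3D-Rig-8.pth",
    "examples.zip",
    "map-anything-model/model.safetensors",  # assumed file name
]
selected = select_files(listing, SUBSET_PATTERNS)
print(selected)
```

Calling `download_subset()` performs the actual download; the preview step needs no network access and helps confirm the patterns before pulling roughly 2.1 GB of backbone weights.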
Usage
These weights plug into the G2G code on GitHub. After cloning and installing (https://github.com/WeiYuFei0217/G2G), run evaluation with the downloaded checkpoint:
python scripts/eval_reloc.py \
--config configs/reloc/hm3d.yaml \
--checkpoint release_weights/HM3D-Reloc.pth \
--output-dir outputs/eval_HM3D-Reloc \
--batch-size 16 --min-overlap 0.1
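Because `hf_hub_download` returns a path inside the local Hugging Face cache rather than `release_weights/...`, it can help to assemble the evaluation command from the downloaded path. The sketch below does only that, reusing the flags shown above. The helper `build_eval_command` is hypothetical, and the default config applies to HM3D only; config paths for the other datasets are not listed here and would need to be looked up in the GitHub repo.

```python
from pathlib import Path
from typing import List


def build_eval_command(
    checkpoint: str,
    config: str = "configs/reloc/hm3d.yaml",
    batch_size: int = 16,
    min_overlap: float = 0.1,
) -> List[str]:
    """Assemble the eval_reloc.py invocation as an argv list."""
    # Name the output directory after the checkpoint file, as in the example above.
    output_dir = f"outputs/eval_{Path(checkpoint).stem}"
    return [
        "python", "scripts/eval_reloc.py",
        "--config", config,
        "--checkpoint", str(checkpoint),
        "--output-dir", output_dir,
        "--batch-size", str(batch_size),
        "--min-overlap", str(min_overlap),
    ]


cmd = build_eval_command("release_weights/HM3D-Reloc.pth")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the root of the cloned G2G repository, with `checkpoint` set to the path returned by `hf_hub_download`.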
License
Citation
@misc{wei2026g2gexploitingintragroupgeometry,
title={G2G: Exploiting Intra-Group Geometry for Inter-Group Pose Estimation},
author={Yufei Wei and Shuhao Ye and Chenxiao Hu and Yiyuan Pan and Dongyu Feng and Rong Xiong and Yue Wang and Yanmei Jiao},
year={2026},
eprint={2606.08284},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2606.08284},
}