HEIMDALL – PROFusion Pose Regression

Paper: PROFusion: Robust and Accurate Dense Reconstruction via Camera Pose Regression and Optimization
arXiv: 2509.24236
Authors: Dong, Wang, Cai, Ma, Yang (University of Hong Kong)

Model Description

Dual-stream pose regression network (RGB + Depth) for robust camera pose estimation. Part of the ANIMA robotics perception stack (Wave 6).

Architecture

  • Backbone: ResNet-18 (dual-stream: RGB encoder + Depth encoder)
  • Attention: Depth-conditioned channel attention
  • Decoder: FC layers → quaternion (4D) + translation (3D)
  • Input: RGB-D frame pairs (224x224)
  • Output: Relative SE(3) pose between the two frames (unit quaternion + translation vector)
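The architecture described above can be sketched as follows. This is a minimal, illustrative stand-in, not the repository's actual implementation: the small convolutional `Encoder` replaces the real ResNet-18 trunks, and the class, layer, and default names are assumptions.

```python
import torch
import torch.nn as nn


def conv_block(c_in, c_out):
    """Stride-2 conv + BN + ReLU, halving spatial resolution."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


class Encoder(nn.Module):
    """Toy stand-in for a ResNet-18 trunk, ending in a pooled feature vector."""
    def __init__(self, in_ch, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            conv_block(in_ch, 32),
            conv_block(32, 64),
            conv_block(64, dim),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )

    def forward(self, x):
        return self.net(x)


class DualStreamPoseNet(nn.Module):
    """Dual-stream (RGB + depth) relative-pose regressor with
    depth-conditioned channel attention, following the card's outline."""
    def __init__(self, dim=128):
        super().__init__()
        self.rgb_enc = Encoder(3, dim)    # RGB stream
        self.depth_enc = Encoder(1, dim)  # depth stream
        # Depth features produce per-channel gates for the RGB features.
        self.attn = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        # FC head maps both frames' fused features to quaternion + translation.
        self.head = nn.Sequential(nn.Linear(2 * dim, 256), nn.ReLU(), nn.Linear(256, 7))

    def encode(self, rgb, depth):
        r = self.rgb_enc(rgb)
        d = self.depth_enc(depth)
        return r * self.attn(d)  # depth-conditioned channel attention

    def forward(self, rgb_a, depth_a, rgb_b, depth_b):
        f = torch.cat([self.encode(rgb_a, depth_a),
                       self.encode(rgb_b, depth_b)], dim=1)
        out = self.head(f)
        quat = nn.functional.normalize(out[:, :4], dim=1)  # unit quaternion
        trans = out[:, 4:]
        return quat, trans
```

The key design point is that attention weights are computed from the depth stream but applied to the RGB stream, letting geometry modulate which appearance channels contribute to the pose estimate.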

Usage

import torch
from anima_heimdall.models.pose_net import PoseRegressionNet

model = PoseRegressionNet(backbone="resnet18")
state = torch.load("pytorch/heimdall_pose_v1.pth", map_location="cpu")
model.load_state_dict(state["model_state_dict"])
model.eval()

# Two RGB-D frames (batch of 1, 224x224); replace the random tensors with real data.
rgb_a = torch.randn(1, 3, 224, 224)
depth_a = torch.randn(1, 1, 224, 224)
rgb_b = torch.randn(1, 3, 224, 224)
depth_b = torch.randn(1, 1, 224, 224)

# Relative pose of frame B with respect to frame A.
with torch.no_grad():
    quat, trans = model(rgb_a, depth_a, rgb_b, depth_b)
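Since the card describes the output as a relative SE(3) pose, the quaternion and translation returned above can be assembled into a 4x4 homogeneous matrix. A minimal sketch, assuming (w, x, y, z) quaternion ordering (the card does not specify the convention) and an unbatched input; `pose_to_matrix` is a hypothetical helper, not part of the package:

```python
import torch


def pose_to_matrix(quat, trans):
    """Build a 4x4 SE(3) matrix from a unit quaternion (assumed (w, x, y, z))
    and a translation vector; quat has shape (4,), trans has shape (3,)."""
    w, x, y, z = (float(v) for v in quat)
    # Standard quaternion-to-rotation-matrix formula.
    R = torch.tensor([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])
    T = torch.eye(4)
    T[:3, :3] = R
    T[:3, 3] = torch.as_tensor([float(v) for v in trans])
    return T
```

With the model output above, `pose_to_matrix(quat[0], trans[0])` yields the homogeneous transform from frame A to frame B (up to the unverified quaternion convention).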

Training

Trained on TUM-VI sequences with ANIMA Training Standard v1.0. Checkpoint: project_heimdall_cuda_v1_epoch34_val0.0093.pth
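The card does not state the training objective. A common choice for quaternion + translation regression is a PoseNet-style weighted sum of translation and rotation errors; the sketch below assumes that loss, and `beta` is a hypothetical default, not a value from the paper.

```python
import torch


def pose_loss(quat_pred, trans_pred, quat_gt, trans_gt, beta=1.0):
    """PoseNet-style loss (assumed, not from the paper): L1 translation error
    plus beta-weighted L1 quaternion error, averaged over the batch."""
    q_pred = torch.nn.functional.normalize(quat_pred, dim=-1)
    q_gt = torch.nn.functional.normalize(quat_gt, dim=-1)
    # q and -q encode the same rotation; take the smaller of the two distances.
    q_err = torch.minimum((q_pred - q_gt).abs().sum(-1),
                          (q_pred + q_gt).abs().sum(-1))
    t_err = (trans_pred - trans_gt).abs().sum(-1)
    return (t_err + beta * q_err).mean()
```

The sign-flip handling matters in practice: without it, a network predicting -q for a ground-truth q would be penalized for a rotation error of zero.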

Citation

@article{dong2025profusion,
  title={PROFusion: Robust and Accurate Dense Reconstruction via Camera Pose Regression and Optimization},
  author={Dong, Siyan and Wang, Zijun and Cai, Lulu and Ma, Yi and Yang, Yanchao},
  journal={arXiv preprint arXiv:2509.24236},
  year={2025}
}