HEIMDALL – PROFusion Pose Regression

Paper: PROFusion: Robust and Accurate Dense Reconstruction via Camera Pose Regression and Optimization
arXiv: 2509.24236
Authors: Dong, Wang, Cai, Ma, Yang (University of Hong Kong)

Model Description

Dual-stream pose regression network (RGB + Depth) for robust camera pose estimation. Part of the ANIMA robotics perception stack (Wave 6).

Architecture

  • Backbone: ResNet-18 (dual-stream: RGB encoder + Depth encoder)
  • Attention: Depth-conditioned channel attention
  • Decoder: FC layers → quaternion (4D) + translation (3D)
  • Input: RGB-D frame pairs (224x224)
  • Output: Relative SE(3) pose between the two frames (unit quaternion + translation vector)
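The architecture described above can be sketched as follows. This is a minimal, illustrative stand-in, not the repository's actual implementation: the small convolutional `Encoder` replaces the real ResNet-18 trunks, and the class, layer, and default names are assumptions.

```python
import torch
import torch.nn as nn


def conv_block(c_in, c_out):
    """Stride-2 conv + BN + ReLU, halving spatial resolution."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


class Encoder(nn.Module):
    """Toy stand-in for a ResNet-18 trunk, ending in a pooled feature vector."""
    def __init__(self, in_ch, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            conv_block(in_ch, 32),
            conv_block(32, 64),
            conv_block(64, dim),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )

    def forward(self, x):
        return self.net(x)


class DualStreamPoseNet(nn.Module):
    """Dual-stream (RGB + depth) relative-pose regressor with
    depth-conditioned channel attention, following the card's outline."""
    def __init__(self, dim=128):
        super().__init__()
        self.rgb_enc = Encoder(3, dim)    # RGB stream
        self.depth_enc = Encoder(1, dim)  # depth stream
        # Depth features produce per-channel gates for the RGB features.
        self.attn = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        # FC head maps both frames' fused features to quaternion + translation.
        self.head = nn.Sequential(nn.Linear(2 * dim, 256), nn.ReLU(), nn.Linear(256, 7))

    def encode(self, rgb, depth):
        r = self.rgb_enc(rgb)
        d = self.depth_enc(depth)
        return r * self.attn(d)  # depth-conditioned channel attention

    def forward(self, rgb_a, depth_a, rgb_b, depth_b):
        f = torch.cat([self.encode(rgb_a, depth_a),
                       self.encode(rgb_b, depth_b)], dim=1)
        out = self.head(f)
        quat = nn.functional.normalize(out[:, :4], dim=1)  # unit quaternion
        trans = out[:, 4:]
        return quat, trans
```

The key design point is that attention weights are computed from the depth stream but applied to the RGB stream, letting geometry modulate which appearance channels contribute to the pose estimate.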

Usage

import torch
from anima_heimdall.models.pose_net import PoseRegressionNet

model = PoseRegressionNet(backbone="resnet18")
state = torch.load("pytorch/heimdall_pose_v1.pth", map_location="cpu")
model.load_state_dict(state["model_state_dict"])
model.eval()

# Two RGB-D frames (batch of 1, 224x224); replace the random tensors with real data.
rgb_a = torch.randn(1, 3, 224, 224)
depth_a = torch.randn(1, 1, 224, 224)
rgb_b = torch.randn(1, 3, 224, 224)
depth_b = torch.randn(1, 1, 224, 224)

# Relative pose of frame B with respect to frame A.
with torch.no_grad():
    quat, trans = model(rgb_a, depth_a, rgb_b, depth_b)
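Since the card describes the output as a relative SE(3) pose, the quaternion and translation returned above can be assembled into a 4x4 homogeneous matrix. A minimal sketch, assuming (w, x, y, z) quaternion ordering (the card does not specify the convention) and an unbatched input; `pose_to_matrix` is a hypothetical helper, not part of the package:

```python
import torch


def pose_to_matrix(quat, trans):
    """Build a 4x4 SE(3) matrix from a unit quaternion (assumed (w, x, y, z))
    and a translation vector; quat has shape (4,), trans has shape (3,)."""
    w, x, y, z = (float(v) for v in quat)
    # Standard quaternion-to-rotation-matrix formula.
    R = torch.tensor([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])
    T = torch.eye(4)
    T[:3, :3] = R
    T[:3, 3] = torch.as_tensor([float(v) for v in trans])
    return T
```

With the model output above, `pose_to_matrix(quat[0], trans[0])` yields the homogeneous transform from frame A to frame B (up to the unverified quaternion convention).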

Training

Trained on TUM-VI sequences with ANIMA Training Standard v1.0. Checkpoint: project_heimdall_cuda_v1_epoch34_val0.0093.pth
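The card does not state the training objective. A common choice for quaternion + translation regression is a PoseNet-style weighted sum of translation and rotation errors; the sketch below assumes that loss, and `beta` is a hypothetical default, not a value from the paper.

```python
import torch


def pose_loss(quat_pred, trans_pred, quat_gt, trans_gt, beta=1.0):
    """PoseNet-style loss (assumed, not from the paper): L1 translation error
    plus beta-weighted L1 quaternion error, averaged over the batch."""
    q_pred = torch.nn.functional.normalize(quat_pred, dim=-1)
    q_gt = torch.nn.functional.normalize(quat_gt, dim=-1)
    # q and -q encode the same rotation; take the smaller of the two distances.
    q_err = torch.minimum((q_pred - q_gt).abs().sum(-1),
                          (q_pred + q_gt).abs().sum(-1))
    t_err = (trans_pred - trans_gt).abs().sum(-1)
    return (t_err + beta * q_err).mean()
```

The sign-flip handling matters in practice: without it, a network predicting -q for a ground-truth q would be penalized for a rotation error of zero.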

Citation

@article{dong2025profusion,
  title={PROFusion: Robust and Accurate Dense Reconstruction via Camera Pose Regression and Optimization},
  author={Dong, Siyan and Wang, Zijun and Cai, Lulu and Ma, Yi and Yang, Yanchao},
  journal={arXiv preprint arXiv:2509.24236},
  year={2025}
}