arxiv:2602.08961

MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE

Published on Feb 9 · Submitted by Ruijie Zhu on Feb 10

Abstract

MotionCrafter is a video diffusion framework that jointly reconstructs 4D geometry and estimates dense motion using a novel joint representation and 4D VAE architecture.

AI-generated summary

We introduce MotionCrafter, a video diffusion-based framework that jointly reconstructs 4D geometry and estimates dense motion from a monocular video. The core of our method is a novel joint representation of dense 3D point maps and 3D scene flows in a shared coordinate system, and a novel 4D VAE to effectively learn this representation. Unlike prior work that forces the 3D values and latents to align strictly with RGB VAE latents, despite their fundamentally different distributions, we show that such alignment is unnecessary and leads to suboptimal performance. Instead, we introduce a new data normalization and VAE training strategy that better transfers diffusion priors and greatly improves reconstruction quality. Extensive experiments across multiple datasets demonstrate that MotionCrafter achieves state-of-the-art performance in both geometry reconstruction and dense scene flow estimation, delivering 38.64% and 25.0% improvements in geometry and motion reconstruction, respectively, all without any post-optimization. Project page: https://ruijiezhu94.github.io/MotionCrafter_Page
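To make the idea of a joint point-map and scene-flow representation in a shared coordinate system more concrete, here is a minimal sketch in plain Python. It is not MotionCrafter's implementation: the function names `normalize_joint` and `advect`, the list-of-tuples layout, and the mean-distance scale are all illustrative assumptions, and the abstract does not specify the paper's actual normalization. The sketch only shows why a single shared scale keeps points and flows consistent: moving a normalized point by its normalized flow and then undoing the scale gives the same result as moving the original point by the original flow.

```python
from math import sqrt


def normalize_joint(point_maps, scene_flows):
    """Scale points and flows by ONE shared factor (illustrative assumption,
    not the normalization used in the paper).

    point_maps:  list of frames, each a list of (x, y, z) points in a shared frame
    scene_flows: same layout, each entry the (dx, dy, dz) displacement of that point
    """
    norms = [sqrt(x * x + y * y + z * z) for frame in point_maps for x, y, z in frame]
    # Mean distance from the origin of the shared frame; fall back to 1.0
    # for empty or degenerate input so the division below is always safe.
    scale = sum(norms) / len(norms) if norms else 1.0
    if scale == 0.0:
        scale = 1.0

    def rescale(frames):
        return [[(a / scale, b / scale, c / scale) for a, b, c in frame] for frame in frames]

    return rescale(point_maps), rescale(scene_flows), scale


def advect(points, flow):
    """Move every 3D point by its scene-flow vector."""
    return [(x + dx, y + dy, z + dz) for (x, y, z), (dx, dy, dz) in zip(points, flow)]


# Toy example: one frame with two points and their displacements.
points = [[(3.0, 0.0, 4.0), (0.0, 0.0, 5.0)]]
flows = [[(1.0, 0.0, 0.0), (0.0, 5.0, 0.0)]]

norm_points, norm_flows, scale = normalize_joint(points, flows)
moved = advect(norm_points[0], norm_flows[0])
# Undo the shared scale to return to the original units.
restored = [(x * scale, y * scale, z * scale) for x, y, z in moved]
```

Normalizing points and flows with separate scales would break this property, which is one plausible reason to keep both quantities in a single coordinate system.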

Community

Paper author Paper submitter

🚀 Excited to share our latest work MotionCrafter!

🌟 The first Video Diffusion-based framework for joint geometry and motion estimation.

📄 Paper: http://arxiv.org/abs/2602.08961
🌐 Project page: https://ruijiezhu94.github.io/MotionCrafter_Page
💻 Code: https://github.com/TencentARC/MotionCrafter
🤗 HF Models: https://huggingface.co/TencentARC/MotionCrafter

😋 Both training and inference code are provided!

😄 Feedback and discussions are very welcome!
