arxiv:2607.02075

HandsOnWorld: Unconstrained Egocentric Video Generation with Camera-Disentangled Hand Control

Published on Jul 2

Abstract

A framework for hand-controlled egocentric video generation that uses monocular video reconstruction and a novel 3D-aware control signal to overcome limitations of existing methods relying on multi-view or marker-based capture.

We present HandsOnWorld, a framework for hand-controlled egocentric video generation that forgoes multi-view and marker-based motion capture, learning instead from unconstrained monocular video. Such generality is bottlenecked by the scarcity of scalable 3D hand annotations: large egocentric corpora lack finger-level labels, whereas precise hand datasets are confined to narrow, instrumented settings, limiting prior hand-controlled generators to restricted scene distributions. We instead annotate 3D hands directly on in-the-wild egocentric video through monocular reconstruction, introducing a protagonist-centered annotation pipeline that filters the reconstructions at the action-semantic, image-quality, and 3D-geometric levels to build EgoVid-Pro, a dataset of clean, protagonist-only hand trajectories spanning 103K clips and roughly 12M frames across diverse everyday scenes. To resolve the camera-hand entanglement induced by large ego-motion, we further propose the Plücker Hand Map, a 3D-aware control signal that extends Plücker-ray representations from camera rays to the hand surface, disentangling camera and hand motion at the representation level. Experiments show that HandsOnWorld surpasses prior hand-controlled generators in reconstruction fidelity and control accuracy, and generalizes to out-of-distribution everyday scenes beyond the laboratory datasets on which prior methods rely.
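
For readers unfamiliar with the Plücker-ray representation that the Plücker Hand Map builds on, the following is a minimal sketch of the standard per-pixel camera-ray encoding only. The abstract does not specify how the representation is extended to the hand surface, so nothing here should be read as the paper's method. The function name `pixel_ray_plucker`, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the camera-to-world pose arguments, and the (direction, moment) ordering are all assumptions made for illustration; some works store the moment first.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def matvec(R, v):
    """Multiply a 3x3 matrix (nested sequences) by a 3-vector."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))

def pixel_ray_plucker(u, v, fx, fy, cx, cy, R_c2w, t_c2w):
    """Return the 6D Plücker coordinates (d, m) of the ray through pixel (u, v).

    Hypothetical helper: assumes a pinhole camera with a camera-to-world
    rotation R_c2w and camera centre t_c2w expressed in world coordinates.
    d is the unit ray direction in world space; m = o x d is the moment,
    where o is the camera centre.
    """
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)   # back-project the pixel
    d = normalize(matvec(R_c2w, d_cam))           # rotate into world space
    m = cross(t_c2w, d)                           # moment about the world origin
    return d + m                                  # concatenate into a 6-tuple

# Usage: identity rotation, camera centre shifted one unit along x,
# ray through the principal point.
I3 = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
ray = pixel_ray_plucker(320.0, 240.0, 500.0, 500.0, 320.0, 240.0,
                        I3, (1.0, 0.0, 0.0))
print(ray)  # (0.0, 0.0, 1.0, 0.0, -1.0, 0.0)
```

The moment depends on where the ray sits in the world frame and not only on its direction, so two cameras looking the same way from different positions produce different encodings. The moment is also unchanged if the origin point slides along the ray, which makes the pair (d, m) a property of the line itself.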
