arxiv:2606.00664

SKIP: Sparse Keyframe Interpolation Paradigm for Efficient Embodied World Models

Published on May 30

Abstract

Sparse Keyframe Interpolation Paradigm (SKIP) accelerates embodied world model inference by generating keyframes with a sparse video diffusion model and interpolating missing frames based on robot actions, achieving faster rollout generation with preserved visual quality and effective policy training.

Embodied world models have emerged as a promising paradigm in robotics by predicting how robot actions affect the surrounding scene. However, the rollout inference remains computationally expensive in pixel space, as long-horizon manipulation videos typically have to be generated frame by frame. This cost cannot be easily reduced by indiscriminately dropping frames, since downstream policies rely on complete preservation of sparse task-relevant events such as approach, contact, grasp, and release. To address this challenge, we propose Sparse Keyframe Interpolation Paradigm (SKIP), an event-preserving sparse-to-dense framework that avoids dense frame-by-frame generation. SKIP first identifies task-relevant keyframes by leveraging robot-aware multimodal features. It then synthesizes only these keyframes with a sparse video diffusion model. A learned gap predictor and an action-conditioned interpolator subsequently reconstruct the missing intervals according to the robot actions. On LIBERO, SKIP generates dense rollouts 4.16× faster than a dense baseline while improving visual fidelity and reducing aggregate FVD by 89.0%. Importantly, SKIP-generated videos are effective policy-training data. Even when they fully replace real demonstrations, π_{0.5} success drops only 1.3 pp in LIBERO simulation and 6.7 pp on the real robot, whereas fully dense frame-by-frame generation collapses by 48 to 58 pp.
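
The sparse-to-dense idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify the keyframe selector, the diffusion model, the gap predictor, or the interpolator, so everything below is a hypothetical stand-in. `select_keyframes` uses a simple threshold on the gripper command (assumed here to be the last action dimension) in place of robot-aware multimodal features, keyframe images are passed in directly instead of being synthesized by a diffusion model, and `interpolate` blends neighboring keyframes linearly, weighted by accumulated action magnitude, in place of a learned action-conditioned interpolator.

```python
from typing import List, Sequence

Frame = List[float]  # a flattened toy "image"; real frames would be pixel arrays


def select_keyframes(actions: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Hypothetical selector: keep the first and last steps, plus any step
    where the gripper command (assumed to be the last action dimension)
    changes by more than `threshold` relative to the previous step."""
    last = len(actions) - 1
    keys = [0]
    for t in range(1, last):
        if abs(actions[t][-1] - actions[t - 1][-1]) > threshold:
            keys.append(t)
    if last > 0:
        keys.append(last)
    return keys


def interpolate(
    key_idx: Sequence[int],
    key_frames: Sequence[Frame],
    actions: Sequence[Sequence[float]],
) -> List[Frame]:
    """Hypothetical interpolator: fill each gap by blending the two
    surrounding keyframes. The blend weight at step t is the fraction of
    the gap's total action magnitude already applied before t, so frames
    move faster where the robot moves more."""
    dense: List[Frame] = []
    for i in range(len(key_idx) - 1):
        k0, k1 = key_idx[i], key_idx[i + 1]
        f0, f1 = key_frames[i], key_frames[i + 1]
        mags = [sum(abs(a) for a in actions[t]) for t in range(k0, k1)]
        total = sum(mags)
        done = 0.0
        for j in range(k1 - k0):
            # Fall back to uniform time spacing if the robot did not move.
            w = done / total if total > 0 else j / (k1 - k0)
            dense.append([(1 - w) * a + w * b for a, b in zip(f0, f1)])
            done += mags[j]
    dense.append(list(key_frames[-1]))
    return dense


# Toy rollout: 5 steps, 2-D actions (arm motion, gripper command).
actions = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 1.0], [0.0, 1.0]]
keys = select_keyframes(actions, threshold=0.5)
# Stand-in for keyframes that a sparse video model would synthesize.
key_frames = [[0.0], [10.0], [20.0]]
dense = interpolate(keys, key_frames, actions)
```

In this toy rollout only three of the five frames would need to come from the expensive generator; the other two are filled in cheaply, which is the source of the speedup the abstract reports at much larger scale.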
