---
license: apache-2.0
library_name: diffusers
---

RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer

Liu Liu, Xiaofeng Wang, Guosheng Zhao, Keyu Li, Wenkang Qin, Jiaxiong Qiu, Zheng Zhu, Guan Huang, Zhizhong Su
Code | Project Page | arXiv | Video | Chinese Introduction
---

## 🔍 Abstract

![RoboTransfer Pipeline](assets/robotransfer.jpg)

**RoboTransfer** is a novel diffusion-based video generation framework tailored for robotic visual policy transfer. Unlike conventional approaches, RoboTransfer introduces **geometry-aware synthesis** by injecting **depth and normal priors**, ensuring multi-view consistency across dynamic robotic scenes. The method further supports **explicit control over scene components**, such as **background editing**, **object identity swapping**, and **motion specification**, offering a fine-grained video generation pipeline that benefits embodied learning.

---

## 🧠 Key Features

- 📐 **Geometry-Consistent Diffusion**: Injects global 3D cues (depth, normal) and cross-view interactions for multi-view realism.
- 🧩 **Scene Component Control**: Enables manipulation of object attributes (pose, identity) and background features.
- 🔁 **Cross-View Conditioning**: Learns representations from multiple camera views with spatial correspondence.
- 🤖 **Robotic Policy Transfer**: Facilitates domain adaptation by generating synthetic training data in target domains.

---

## 📖 BibTeX

```bibtex
@article{liu2025robotransfer,
  title={RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer},
  author={Liu, Liu and Wang, Xiaofeng and Zhao, Guosheng and Li, Keyu and Qin, Wenkang and Qiu, Jiaxiong and Zhu, Zheng and Huang, Guan and Su, Zhizhong},
  journal={arXiv preprint arXiv:2505.23171},
  year={2025}
}
```