---
license: apache-2.0
library_name: diffusers
---
# RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer

Liu Liu, Xiaofeng Wang, Guosheng Zhao, Keyu Li, Wenkang Qin, Jiaxiong Qiu, Zheng Zhu, Guan Huang, Zhizhong Su

---
## Abstract

**RoboTransfer** is a novel diffusion-based video generation framework tailored for robotic visual policy transfer. Unlike conventional approaches, RoboTransfer introduces **geometry-aware synthesis** by injecting **depth and normal priors**, ensuring multi-view consistency across dynamic robotic scenes. The method further supports **explicit control over scene components**, such as **background editing**, **object identity swapping**, and **motion specification**, offering a fine-grained video generation pipeline that benefits embodied learning.

---
## Key Features
- **Geometry-Consistent Diffusion**: Injects global 3D cues (depth, normal) and cross-view interactions for multi-view realism.
- **Scene Component Control**: Enables manipulation of object attributes (pose, identity) and background features.
- **Cross-View Conditioning**: Learns representations from multiple camera views with spatial correspondence.
- **Robotic Policy Transfer**: Facilitates domain adaptation by generating synthetic training data in target domains.
---
## BibTeX
```bibtex
@article{liu2025robotransfer,
title={RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer},
author={Liu, Liu and Wang, Xiaofeng and Zhao, Guosheng and Li, Keyu and Qin, Wenkang and Qiu, Jiaxiong and Zhu, Zheng and Huang, Guan and Su, Zhizhong},
journal={arXiv preprint arXiv:2505.23171},
year={2025}
}
```