Instructions to use HorizonRobotics/RoboTransfer with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Diffusers
How to use HorizonRobotics/RoboTransfer with Diffusers:
    pip install -U diffusers transformers accelerate

    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import load_image, export_to_video

    # switch to "mps" for apple devices
    pipe = DiffusionPipeline.from_pretrained("HorizonRobotics/RoboTransfer", dtype=torch.bfloat16, device_map="cuda")
    pipe.to("cuda")

    prompt = "A man with short gray hair plays a red electric guitar."
    image = load_image(
        "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
    )

    output = pipe(image=image, prompt=prompt).frames[0]
    export_to_video(output, "output.mp4")
- Notebooks
- Google Colab
- Kaggle
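
The Diffusers snippet above hard-codes `"cuda"` and notes that Apple devices should use `"mps"`. As a minimal sketch, assuming only the standard PyTorch availability checks, the device string could be picked at runtime. The helper name `pick_device` is hypothetical and is not part of Diffusers or of this repository.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" based on what PyTorch reports.

    Hypothetical helper for illustration; falls back to "cpu" when
    PyTorch is not installed or no accelerator is available.
    """
    if importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    if torch.cuda.is_available():
        return "cuda"

    # torch.backends.mps is only present in newer PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

The returned string could then be passed wherever the snippet uses `"cuda"`, for example `pipe.to(device)`. Whether this model runs acceptably on `"mps"` or `"cpu"` is not stated in the source.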
Update README.md
README.md
CHANGED
@@ -36,12 +36,14 @@ library_name: diffusers
 </div>
 
 <div align="center">
-<img src="assets/pin.
+<img src="assets/pin.jpeg" width="50%" alt="RoboTransfer"/></div>
 
 ---
 
 ## Abstract
 
+
+
 **RoboTransfer** is a novel diffusion-based video generation framework tailored for robotic visual policy transfer. Unlike conventional approaches, RoboTransfer introduces **geometry-aware synthesis** by injecting **depth and normal priors**, ensuring multi-view consistency across dynamic robotic scenes. The method further supports **explicit control over scene components**, such as **background editing**, **object identity swapping**, and **motion specification**, offering a fine-grained video generation pipeline that benefits embodied learning.
 
 ---
@@ -55,15 +57,6 @@ library_name: diffusers
 
 ---
 
-
-## Framework Overview
-
-
-
-> The overall architecture includes view-specific encoding, geometry injection, diffusion denoising with spatial constraints, and component-level editing modules. Our system enables compositional control over scene dynamics while preserving physical and geometric consistency.
-
----
-
 ## BibTeX
 
 ```bibtex