---
title: OXE-AugE
emoji: 🦾
colorFrom: blue
colorTo: indigo
sdk: static
pinned: true
---
<p align="center">
<img src="splash_v4.jpeg" alt="OXE-AugE Banner" style="max-width: 100%; height: auto;">
</p>
<h1 align="center">OXE-AugE: Cross-Embodiment Robot Augmentation at Scale</h1>
<p align="center">
<b>Augmentation for OXE dataset</b><br/>
</p>
<p align="center">
<a href="https://oxe-auge.github.io/" target="_blank"><b>🔗 Project Page</b></a> ·
<a href="https://arxiv.org/abs/2512.13100" target="_blank"><b>📄 Paper</b></a> ·
<a href="https://github.com/GuanhuaJi/oxe-auge" target="_blank"><b>💻 GitHub</b></a> ·
<a href="https://x.com/Lawrence_Y_Chen/status/2001372083704799706 " target="_blank"><b>🐦 Twitter</b></a>
</p>
---
## TL;DR
### What we do
We present **OXE-AugE**, a high-quality open-source dataset that augments **16** popular OXE datasets with **9** different robot embodiments, using a scalable robot augmentation pipeline we call **AugE-Toolkit**.
### Why it matters
We find that **robot augmentation scales**. While the Open X-Embodiment (OXE) dataset aggregates demonstrations from over **60** real-world robot datasets, it is highly imbalanced: over **85%** of real trajectories come from just four robots (**Franka**, **xArm**, **Kuka iiwa**, and **Google Robot**), while many others appear in only **1–2** datasets. By diversifying the robot embodiment while preserving task and scene, **OXE-AugE** provides a new resource for training robust and transferable visuomotor policies. Through both simulated and real experiments, we find that scaling robot augmentation enhances robustness, transfer, and generalization.
### What’s included
**OXE-AugE** currently augments **16** commonly used OXE datasets, resulting in over **4.4 million** trajectories (more than **triple** the size of the original OXE) and covering **60%** of the widely used Octo pre-training mixture.
---
## 🤖 Robots & Coverage
**Legend:** **●** = source robot | **✓** = augmented demos available | blank = not available.
*For the full, current table, see the Dashboard or dataset READMEs.*
| Dataset | Panda | UR5e | Xarm7 | Google | WidowX | Sawyer | Kinova3 | IIWA | Jaco | # Episodes |
| :--------------------- | :---: | :---: | :---: | :----: | :----: | :----: | :-----: | :--: | :---: | :-----: |
| Berkeley AUTOLab UR5 | ✓ | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1000 |
| TACO Play | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 3603 |
| Austin BUDS | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 50 |
| Austin Mutex | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1500 |
| Austin Sailor | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 240 |
| CMU Franka Pick-Insert | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 631 |
| KAIST Nonprehensile | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 201 |
| NYU Franka Play | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 456 |
| TOTO | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1003 |
| UTokyo xArm PickPlace | ✓ | ✓ | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 102 |
| UCSD Kitchen | ✓ | ✓ | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 150 |
| Austin VIOLA | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 150 |
| Bridge | ✓ | ✓ | ✓ | ✓ | **●** | ✓ | ✓ | ✓ | ✓ | 28935 |
| RT-1 Robot Action | ✓ | ✓ | ✓ | **●** | | ✓ | ✓ | ✓ | ✓ | 87212 |
| Jaco Play | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | **●** | 1084 |
| Language Table | ✓ | ✓ | **●** | ✓ | | ✓ | ✓ | ✓ | ✓ | 442226 |
---
## 📦 How to Use
Below is an example showing how to load a dataset from Hugging Face with `LeRobotDataset`, iterate through the first episode, and print the robot-specific fields of each frame.
```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset

REPO_ID = "oxe-auge/nyu_franka_play_dataset_val_augmented"
ROBOT = "kuka_iiwa"
EPISODE = 0

# Robot-specific keys added by the augmentation pipeline.
KEYS = [
    f"observation.images.{ROBOT}",
    f"observation.{ROBOT}.joints",
    f"observation.{ROBOT}.ee_pose",
    f"observation.{ROBOT}.base_position",
    f"observation.{ROBOT}.base_orientation",
    f"observation.{ROBOT}.ee_error",
]

dataset = LeRobotDataset(REPO_ID)

# Episodes are stored contiguously, so walk frames until the episode index changes.
i = 0
while i < len(dataset) and int(dataset[i]["episode_index"]) == EPISODE:
    sample = dataset[i]
    print(f"\n--- episode_index={EPISODE}, frame_index={int(sample['frame_index'])}, dataset_index={i} ---")
    for k in KEYS:
        print(f"{k}: {sample[k]}")
    i += 1

print(f"\nNumber of frames in episode {EPISODE}: {i}")
```
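Scanning frames one by one works because episodes are stored contiguously. If your lerobot version exposes `episode_data_index` (present in recent releases), you can jump straight to an episode's frame range instead. A minimal sketch, assuming that attribute is available:

```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("oxe-auge/nyu_franka_play_dataset_val_augmented")

EPISODE = 0
# episode_data_index maps each episode to its [from, to) frame range.
start = int(dataset.episode_data_index["from"][EPISODE])
end = int(dataset.episode_data_index["to"][EPISODE])

for i in range(start, end):
    sample = dataset[i]
    print(int(sample["frame_index"]), float(sample["observation.kuka_iiwa.ee_error"].item()))
```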
## Check EE Error Threshold
During simulator replay, a target robot (e.g., Kuka iiwa) may not be able to exactly reach the source robot’s end-effector pose due to embodiment differences (size, joint limits, etc.). This mismatch is recorded as `observation.<ROBOT>.ee_error`.
The example below scans the first 10 episodes and splits them into two groups: episodes where **all frames** have EE error ≤ 0.01, and episodes where **at least one frame** has EE error > 0.01.
```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset

REPO_ID = "oxe-auge/nyu_franka_play_dataset_val_augmented"
ROBOT = "kuka_iiwa"
EE_ERROR_KEY = f"observation.{ROBOT}.ee_error"
THRESHOLD = 0.01
NUM_EPISODES = 10

dataset = LeRobotDataset(REPO_ID)

within_threshold_episodes = []
over_threshold_episodes = []

ep = 0
i = 0
while ep < NUM_EPISODES and i < len(dataset):
    episode_id = int(dataset[i]["episode_index"])
    within = True
    # Scan every frame of the current episode.
    while i < len(dataset) and int(dataset[i]["episode_index"]) == episode_id:
        ee = float(dataset[i][EE_ERROR_KEY].item())
        if ee > THRESHOLD:
            within = False
        i += 1
    (within_threshold_episodes if within else over_threshold_episodes).append(episode_id)
    ep += 1

print("within_threshold_episodes:", within_threshold_episodes)
print("over_threshold_episodes:", over_threshold_episodes)
```
## Error Statistics
We also provide pre-computed EE error statistics across **all published datasets** and **all robot embodiments**. You can find the full summary **[here](https://drive.google.com/drive/folders/1iBFo1XfoJUwdALVxhDhIXvNGprGxzXZI?usp=drive_link)**. These statistics can serve as a practical reference when choosing an EE-error threshold and deciding whether to filter out episodes with large replay error for your use case.
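If you prefer to compute such statistics yourself for a single dataset, the sketch below aggregates the per-frame `ee_error` field shown earlier into per-episode summaries. The percentile-based cutoff at the end is purely illustrative, not an official recommendation:

```python
from collections import defaultdict

import numpy as np
from lerobot.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("oxe-auge/nyu_franka_play_dataset_val_augmented")
EE_ERROR_KEY = "observation.kuka_iiwa.ee_error"

# Collect the per-frame EE error of every episode (one slow but simple pass).
per_episode = defaultdict(list)
for i in range(len(dataset)):
    sample = dataset[i]
    per_episode[int(sample["episode_index"])].append(float(sample[EE_ERROR_KEY].item()))

# Worst-case replay error per episode, then its distribution across episodes.
max_errors = np.array([max(v) for v in per_episode.values()])
print("mean of per-episode max EE error:", max_errors.mean())
print("90th percentile:", np.percentile(max_errors, 90))
# One possible heuristic: drop episodes whose max error exceeds that percentile.
```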
## IOU Statistics
Another useful signal for filtering is the **mask IOU** between the simulator mask and the SAM mask. This is only relevant for **Bridge** and **RT-1 Robot Actions**, where camera poses are **not fixed**. To build these datasets, we **sample camera positions/orientations** and then, for each episode, select one or a few viewpoints that best match the sampled cameras (for Bridge we use the IOU between the full-robot simulation mask and the SAM mask; for RT-1 Robot Actions we use the IOU between the gripper simulation mask and the SAM mask). As a result, the viewpoint match can be less precise than for datasets with fixed cameras.
We provide **per-episode IOU summaries** (min / max / mean) that you can use to filter low-quality episodes. As a rule of thumb, we recommend:
* **Bridge:** filter episodes with mean IOU < **0.6**
* **RT-1 Robot Actions:** filter episodes with mean IOU < **0.4**
Our IOU parquet files can be downloaded **[here](https://drive.google.com/drive/folders/1Gv6mK-UO7b45KVZgfMSWHT6kTI4vqsQH?usp=drive_link)**.
Many users have asked which sampled viewpoint was selected for each Bridge episode. We provide the Bridge episode-to-viewpoint mapping **[here](https://huggingface.co/datasets/oxe-auge/bridge_episode_viewpoints)**.
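To inspect the mapping, here is a minimal sketch assuming the repo follows a standard Hugging Face datasets layout with a `train` split (we print the schema and first record rather than guessing column names):

```python
from datasets import load_dataset

# Episode-to-viewpoint mapping published alongside the augmented Bridge data.
viewpoints = load_dataset("oxe-auge/bridge_episode_viewpoints", split="train")

# Print the schema and the first record instead of assuming column names.
print(viewpoints.column_names)
print(viewpoints[0])
```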
Here is an example snippet showing how to filter episodes using the IOU threshold.
```python
from pathlib import Path

import pyarrow.parquet as pq

PARQUET_PATH = Path("/path/to/iou_episode_stats.parquet")  # update this path
THRESH = 0.6  # recommended: 0.6 for Bridge, 0.4 for RT-1 Robot Actions

# Load only the columns needed for filtering.
df = pq.read_table(PARQUET_PATH, columns=["episode_id", "iou_mean"]).to_pandas()

# Episodes whose mean IOU falls below the threshold are candidates for removal.
bad = df[df["iou_mean"] < THRESH]
print(f"episodes with iou_mean < {THRESH}: {len(bad)}")
print(bad.to_string(index=False))
```
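Once you have the list of low-IOU episodes, one way to act on it is to reload the dataset with only the surviving episodes. The sketch below assumes your lerobot release supports the `episodes` constructor argument and the `num_episodes` property, and that the parquet `episode_id` values match LeRobot's `episode_index`:

```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset

REPO_ID = "oxe-auge/nyu_franka_play_dataset_val_augmented"  # point at the repo you filtered

# Hypothetical output of the parquet filtering step above.
bad_episode_ids = {3, 17, 42}

# Learn the episode count, then keep everything not flagged as low-IOU.
full = LeRobotDataset(REPO_ID)
keep = [e for e in range(full.num_episodes) if e not in bad_episode_ids]

# Recent lerobot releases accept an `episodes` argument selecting a subset.
filtered = LeRobotDataset(REPO_ID, episodes=keep)
print(len(filtered), "frames after filtering")
```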
---
## 🗓 Updates
* **2025-11**: Released all **16** augmented datasets.
---
## 📚 Citation
If you use OXE-AugE datasets or tools, please cite:
```bibtex
@article{ji2025oxeauge,
title={OXE-AugE: A Large-Scale Robot Augmentation of OXE for Scaling Cross-Embodiment Policy Learning},
author={Ji, Guanhua and Polavaram, Harsha and Chen, Lawrence Yunliang and Bajamahal, Sandeep and Ma, Zehan and Adebola, Simeon and Xu, Chenfeng and Goldberg, Ken},
journal={arXiv preprint arXiv:2512.13100},
year={2025}
}
```
Also cite upstream datasets you rely on (see per-shard READMEs for references).
---
## 🪪 License & Responsible Use
* **Datasets:** CC BY 4.0.
* **Code:** Apache-2.0 / MIT.
* **Responsible Use:** No personal data; research/robotics use; do not deploy in unlawful or harmful contexts.
---
## 🤝 Contribute & Contact
* Contribute new shards or fixes via Issues/PRs on the corresponding dataset repos.
* For questions or issues, please reach out to Guanhua Ji: **[jgh1013@seas.upenn.edu](mailto:jgh1013@seas.upenn.edu)**
* Organization home: [https://huggingface.co/oxe-auge](https://huggingface.co/oxe-auge)