---
title: OXE-AugE
emoji: 🦾
colorFrom: blue
colorTo: indigo
sdk: static
pinned: true
---
Augmentation for the OXE dataset
🌐 Project Page · 📄 Paper · 💻 GitHub · 🐦 Twitter
---

## TL;DR

## What we do

We present **OXE-AugE**, a high-quality open-source dataset that augments **16** popular OXE datasets with **9** different robot embodiments, using a scalable robot augmentation pipeline which we call **AugE-Toolkit**.

## Why it matters

We find that **robot augmentation scales**. While the Open X-Embodiment (OXE) dataset aggregates demonstrations from over **60** real-world robot datasets, it is highly imbalanced: over **85%** of real trajectories come from just four robots (**Franka**, **xArm**, **Kuka iiwa**, and **Google Robot**), while many other robots appear in only **1–2** datasets. By diversifying the robot embodiment while preserving task and scene, **OXE-AugE** provides a new resource for training robust and transferable visuomotor policies. Through both simulated and real-world experiments, we find that scaling robot augmentation improves robustness, transfer, and generalization.

## What's included

**OXE-AugE** currently augments **16** commonly used OXE datasets, resulting in over **4.4 million** trajectories (more than **triple** the size of the original OXE) and covering **60%** of the widely used Octo pre-training mixture.

---

## 🤖 Robots & Coverage

**Legend:** **●** = source robot | ✓ = augmented demos available.

*For the full, current table, see the Dashboard or dataset READMEs.*

| Dataset | Panda | UR5e | Xarm7 | Google | WidowX | Sawyer | Kinova3 | IIWA | Jaco | # Episodes |
| :--------------------- | :---: | :---: | :---: | :----: | :----: | :----: | :-----: | :---: | :---: | :--------: |
| Berkeley AUTOLab UR5   | ✓ | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1000 |
| TACO Play              | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 3603 |
| Austin BUDS            | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 50 |
| Austin Mutex           | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1500 |
| Austin Sailor          | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 240 |
| CMU Franka Pick-Insert | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 631 |
| KAIST Nonprehensile    | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 201 |
| NYU Franka Play        | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 456 |
| TOTO                   | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1003 |
| UTokyo xArm PickPlace  | ✓ | ✓ | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 102 |
| UCSD Kitchen           | ✓ | ✓ | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 150 |
| Austin VIOLA           | **●** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 150 |
| Bridge                 | ✓ | ✓ | ✓ | ✓ | **●** | ✓ | ✓ | ✓ | ✓ | 28935 |
| RT-1 Robot Action      | ✓ | ✓ | ✓ | **●** |  | ✓ | ✓ | ✓ | ✓ | 87212 |
| Jaco Play              | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | **●** | 1084 |
| Language Table         | ✓ | ✓ | **●** | ✓ |  | ✓ | ✓ | ✓ | ✓ | 442226 |

---

## 📦 How to Use

Below is an example showing how to load a dataset from Hugging Face with `LeRobotDataset`, then iterate through the first episode and print robot-specific fields for each frame.
```python
import torch
from lerobot.datasets.lerobot_dataset import LeRobotDataset

REPO_ID = "oxe-auge/nyu_franka_play_dataset_val_augmented"
ROBOT = "kuka_iiwa"
EPISODE = 0

# Robot-specific observation keys for the chosen embodiment.
KEYS = [
    f"observation.images.{ROBOT}",
    f"observation.{ROBOT}.joints",
    f"observation.{ROBOT}.ee_pose",
    f"observation.{ROBOT}.base_position",
    f"observation.{ROBOT}.base_orientation",
    f"observation.{ROBOT}.ee_error",
]

dataset = LeRobotDataset(REPO_ID)

# Frames are stored contiguously per episode, so walk forward until the
# episode index changes.
i = 0
while i < len(dataset) and int(dataset[i]["episode_index"]) == EPISODE:
    sample = dataset[i]
    print(f"\n--- episode_index={EPISODE}, frame_index={int(sample['frame_index'])}, dataset_index={i} ---")
    for k in KEYS:
        print(f"{k}: {sample[k]}")
    i += 1

print(f"\nNumber of frames in episode {EPISODE}: {i}")
```

## Check EE Error Threshold

During simulator replay, a target robot (e.g., a Kuka iiwa) may not be able to exactly reach the source robot's end-effector pose due to embodiment differences (size, joint limits, etc.). This mismatch is recorded per frame as `observation.{robot}.ee_error`.
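Because the replay error is stored as a regular observation, one common use is to filter out frames whose error exceeds a tolerance before training on augmented demos. Below is a minimal sketch, assuming the `ee_error` field can be reduced to a scalar magnitude with a norm; the `EE_ERROR_THRESHOLD` value (and its units) is a hypothetical placeholder to tune for your application, not a recommended setting.

```python
import torch
from lerobot.datasets.lerobot_dataset import LeRobotDataset

REPO_ID = "oxe-auge/nyu_franka_play_dataset_val_augmented"
ROBOT = "kuka_iiwa"
ERROR_KEY = f"observation.{ROBOT}.ee_error"

# Hypothetical tolerance; check the dataset README for the units in which
# ee_error is reported before choosing a real value.
EE_ERROR_THRESHOLD = 0.01

dataset = LeRobotDataset(REPO_ID)

kept, dropped = 0, 0
for i in range(len(dataset)):
    # Reduce ee_error to a scalar magnitude, whether it is stored as a
    # scalar or as a small vector.
    err = torch.as_tensor(dataset[i][ERROR_KEY], dtype=torch.float32).flatten().norm()
    if err.item() <= EE_ERROR_THRESHOLD:
        kept += 1
    else:
        dropped += 1

print(f"kept={kept}, dropped={dropped}, threshold={EE_ERROR_THRESHOLD}")
```

Note that indexing the dataset decodes full frames (including images), so for large datasets you may prefer to read the error values from the underlying tabular data instead; the loop above simply keeps the example self-contained.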