---
license: apache-2.0
pipeline_tag: robotics
library_name: lerobot
---

# π₀ (Pi0)

These weights come directly from the PyTorch conversion script of openpi, applied to their `pi0_base` model.

π₀ is a **Vision-Language-Action model for general robot control**, from Physical Intelligence. The LeRobot implementation is adapted from their open-source [OpenPI](https://github.com/Physical-Intelligence/openpi) repository.

---

**Paper:** [Robot Learning: A Tutorial](https://huggingface.co/papers/2510.12403)

**Abstract:** Robot learning is at an inflection point, driven by rapid advancements in machine learning and the growing availability of large-scale robotics data. This shift from classical, model-based methods to data-driven, learning-based paradigms is unlocking unprecedented capabilities in autonomous systems. This tutorial navigates the landscape of modern robot learning, charting a course from the foundational principles of Reinforcement Learning and Behavioral Cloning to generalist, language-conditioned models capable of operating across diverse tasks and even robot embodiments. This work is intended as a guide for researchers and practitioners, and our goal is to equip the reader with the conceptual understanding and practical tools necessary to contribute to developments in robot learning, with ready-to-use examples implemented in `lerobot`.

**Project Page:** [https://huggingface.co/spaces/lerobot/robot-learning-tutorial](https://huggingface.co/spaces/lerobot/robot-learning-tutorial)

**Code for Tutorial:** [https://github.com/fracapuano/robot-learning-tutorial](https://github.com/fracapuano/robot-learning-tutorial)

**Original Repository (OpenPI):** [https://github.com/Physical-Intelligence/openpi](https://github.com/Physical-Intelligence/openpi)

---

## Model Overview

π₀ is the first general-purpose robot foundation model developed by [Physical Intelligence](https://www.physicalintelligence.company/blog/pi0). Unlike traditional robots, which are narrow specialists programmed for repetitive motions, π₀ is designed as a generalist policy that can understand visual inputs, interpret natural language instructions, and control a variety of robots across diverse tasks. The model is featured as an example in the "Robot Learning: A Tutorial" paper.

### Architecture and Approach

π₀ combines several key innovations:

- **Flow Matching**: Augments a pre-trained VLM with continuous action outputs via flow matching (a variant of diffusion models)
- **Cross-Embodiment Training**: Trained on data from 7 distinct robot platforms: UR5e, Bimanual UR5e, Franka, Bimanual Trossen, Bimanual ARX, Mobile Trossen, and Mobile Fibocom
- **Internet-Scale Pre-training**: Inherits semantic knowledge from a pre-trained 3B parameter Vision-Language Model
- **High-Frequency Control**: Outputs motor commands at up to 50 Hz for real-time dexterous manipulation

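
To make the flow matching idea above more concrete, here is a minimal, dependency-free sketch of how an action can be sampled by integrating a velocity field with Euler steps. It is not the π₀ or LeRobot implementation: `toy_velocity`, `sample_actions`, the 10-step schedule, the time convention (noise at `t=0`, action at `t=1`), and the 3-dimensional action are all assumptions made for illustration. In the real model the velocity comes from a learned network conditioned on images, language, and robot state, and that network never sees the target action.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for the learned velocity network.

    For the straight-line path x_t = (1 - t) * noise + t * target, the
    velocity that carries x_t to the target is (target - x_t) / (1 - t).
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_actions(target, num_steps=10, seed=0):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 with Euler steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from Gaussian noise
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # t stays below 1, so the division in toy_velocity is safe
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


target_action = [0.5, -0.2, 0.1]  # made-up action vector, illustration only
actions = sample_actions(target_action)
print(actions)
```

Because the toy field is constant along each straight-line path, the Euler steps land on the target up to floating-point error. A learned field is only approximately straight, so the number of integration steps becomes a trade-off between accuracy and latency.
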
## Training

For training π₀, you can use the standard LeRobot training script with the appropriate configuration:

```bash
python src/lerobot/scripts/train.py \
    --dataset.repo_id=your_dataset \
    --policy.type=pi0 \
    --output_dir=./outputs/pi0_training \
    --job_name=pi0_training \
    --policy.pretrained_path=pepijn223/pi0_base \
    --policy.repo_id=your_repo_id \
    --policy.compile_model=true \
    --policy.gradient_checkpointing=true \
    --policy.dtype=bfloat16 \
    --steps=3000 \
    --policy.scheduler_decay_steps=3000 \
    --policy.device=cuda \
    --batch_size=32
```

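
If you launch runs from Python, for example to sweep over step counts, the same flags can be assembled from a dictionary. The sketch below is only a convenience wrapper around the command shown in this section: `build_train_command` is a hypothetical helper, not part of LeRobot, and it assumes the script path and flag names given above. It reuses one variable for `steps` and `policy.scheduler_decay_steps`, which share the same value in that command.

```python
def build_train_command(overrides, script="src/lerobot/scripts/train.py"):
    """Turn a dict of overrides into an argument list for the training script."""
    args = ["python", script]
    for key, value in overrides.items():
        if isinstance(value, bool):
            value = str(value).lower()  # the CLI example uses lowercase true/false
        args.append(f"--{key}={value}")
    return args


num_steps = 3000  # reused below so both step flags stay in sync

overrides = {
    "dataset.repo_id": "your_dataset",
    "policy.type": "pi0",
    "output_dir": "./outputs/pi0_training",
    "job_name": "pi0_training",
    "policy.pretrained_path": "pepijn223/pi0_base",
    "policy.repo_id": "your_repo_id",
    "policy.compile_model": True,
    "policy.gradient_checkpointing": True,
    "policy.dtype": "bfloat16",
    "steps": num_steps,
    "policy.scheduler_decay_steps": num_steps,
    "policy.device": "cuda",
    "batch_size": 32,
}

command = build_train_command(overrides)
print(" \\\n    ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the root of a LeRobot checkout.
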
## Citation

If you use this model, please cite the original OpenPI work and the tutorial paper:

```bibtex
@article{openpi2024,
  title={Open-World Robotic Manipulation with Vision-Language-Action Models},
  author={Physical Intelligence},
  year={2024},
  url={https://github.com/Physical-Intelligence/openpi}
}

@misc{tutorial2025robotlearning,
  title={Robot Learning: A Tutorial},
  author={Francesco Capuano and Caroline Pascal and Adil Zouitine and Thomas Wolf and Michel Aractingi},
  year={2025},
  eprint={2510.12403},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2510.12403},
}
```

## License

This model is released under the Apache 2.0 license, the same license as the original OpenPI repository.