jadechoghari committed on
Commit
26b99b9
verified
1 Parent(s): e3a5218

Update README.md

Files changed (1)
  1. README.md +129 -40
README.md CHANGED
@@ -1,60 +1,149 @@
- # π₀ (Pi0)
-
- These weights come directly from the PyTorch conversion script of openpi and their `pi0_base` model.
-
- π₀ is a **Vision-Language-Action model for general robot control**, from Physical Intelligence. The LeRobot implementation is adapted from their open-source [OpenPI](https://github.com/Physical-Intelligence/openpi) repository.
-
- ## Model Overview
-
- π₀ represents a breakthrough in robotics as the first general-purpose robot foundation model developed by [Physical Intelligence](https://www.physicalintelligence.company/blog/pi0). Unlike traditional robots that are narrow specialists programmed for repetitive motions, π₀ is designed as a generalist policy that can understand visual inputs, interpret natural-language instructions, and control a variety of different robots across diverse tasks.
-
- ### Architecture and Approach
-
- π₀ combines several key innovations:
-
- - **Flow Matching**: Augments a pre-trained VLM with continuous action outputs via flow matching (a variant of diffusion models)
- - **Cross-Embodiment Training**: Trained on data from 8 distinct robot platforms, including UR5e, Bimanual UR5e, Franka, Bimanual Trossen, Bimanual ARX, Mobile Trossen, and Mobile Fibocom
- - **Internet-Scale Pre-training**: Inherits semantic knowledge from a pre-trained 3B-parameter Vision-Language Model
- - **High-Frequency Control**: Outputs motor commands at up to 50 Hz for real-time dexterous manipulation
-
- ## Training
-
- For training π₀, you can use the standard LeRobot training script with the appropriate configuration:
-
  ```bash
- python src/lerobot/scripts/train.py \
-   --dataset.repo_id=your_dataset \
-   --policy.type=pi0 \
-   --output_dir=./outputs/pi0_training \
-   --job_name=pi0_training \
-   --policy.pretrained_path=pepijn223/pi0_base \
-   --policy.repo_id=your_repo_id \
-   --policy.compile_model=true \
-   --policy.gradient_checkpointing=true \
-   --policy.dtype=bfloat16 \
-   --steps=3000 \
-   --policy.scheduler_decay_steps=3000 \
-   --policy.device=cuda \
-   --batch_size=32
  ```
 
-
- ## Citation
-
- If you use this model, please cite the original OpenPI work:
-
- ```bibtex
- @article{openpi2024,
-   title={Open-World Robotic Manipulation with Vision-Language-Action Models},
-   author={Physical Intelligence},
-   year={2024},
-   url={https://github.com/Physical-Intelligence/openpi}
- }
  ```
-
- ## Original Repository
-
- [OpenPI GitHub Repository](https://github.com/Physical-Intelligence/openpi)
-
- ## License
-
- This model follows the same license as the original OpenPI repository.
+ ---
+ language:
+ - en
+ library_name: lerobot
+ pipeline_tag: robotics
+ tags:
+ - vision-language-action
+ - imitation-learning
+ - lerobot
+ inference: false
+ license: gemma
+ ---
+
+ # π₀ (Pi0) (LeRobot)
+
+ π₀ is a Vision-Language-Action (VLA) foundation model from Physical Intelligence that jointly reasons over vision, language, and actions to control robots. It serves as the base architecture that later enabled π₀.₅’s open-world generalization.
+
+ **Original paper:** π0: A Vision-Language-Action Flow Model for General Robot Control
+ **Reference implementation:** https://github.com/Physical-Intelligence/openpi
+ **LeRobot implementation:** Follows the original reference code for compatibility.
+
+ ## Model description
+
+ - **Inputs:** images (multi-view), proprio/state, optional language instruction
+ - **Outputs:** continuous actions
+ - **Training objective:** flow matching
+ - **Action representation:** continuous
+ - **Intended use:** base model to fine-tune on your specific use case
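The flow-matching training objective mentioned above can be illustrated with a minimal, self-contained sketch. This is plain PyTorch for intuition only, not the LeRobot implementation: `velocity_net` and all shapes are hypothetical stand-ins for the model's action expert.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the action expert: predicts a velocity field
# from a noisy action chunk and an interpolation time t.
velocity_net = nn.Sequential(nn.Linear(8 + 1, 64), nn.ReLU(), nn.Linear(64, 8))

actions = torch.randn(32, 8)       # ground-truth actions (batch, action_dim)
noise = torch.randn_like(actions)  # Gaussian source samples
t = torch.rand(32, 1)              # interpolation times in [0, 1]

# Linear path between noise and data; its constant velocity is the regression target.
x_t = (1 - t) * noise + t * actions
target_v = actions - noise

pred_v = velocity_net(torch.cat([x_t, t], dim=-1))
loss = ((pred_v - target_v) ** 2).mean()
loss.backward()
```

At inference time, actions are generated by integrating the learned velocity field from noise toward the data distribution over a few steps.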
 
 
+ ## Quick start (inference on a real batch)
+
+ ### Installation
+
  ```bash
+ pip install "lerobot[pi]@git+https://github.com/huggingface/lerobot.git"
  ```
+ For full installation details (including optional video dependencies such as ffmpeg for torchcodec), see the official documentation: https://huggingface.co/docs/lerobot/installation
+
+ ### Load model + dataset, run `select_action`
+
+ ```python
+ import torch
+ from lerobot.datasets.lerobot_dataset import LeRobotDataset
+ from lerobot.policies.factory import make_pre_post_processors
+
+ # Swap this import per policy
+ from lerobot.policies.pi0 import PI0Policy
+
+ # Load a policy
+ model_id = "lerobot/pi0_base"  # <- swap checkpoint
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ policy = PI0Policy.from_pretrained(model_id).to(device).eval()
+
+ preprocess, postprocess = make_pre_post_processors(
+     policy.config,
+     model_id,
+     preprocessor_overrides={"device_processor": {"device": str(device)}},
+ )
+
+ # Load a LeRobotDataset
+ dataset = LeRobotDataset("lerobot/libero")
+
+ # Pick an episode
+ episode_index = 0
+
+ # Each episode corresponds to a contiguous range of frame indices
+ from_idx = dataset.meta.episodes["dataset_from_index"][episode_index]
+ to_idx = dataset.meta.episodes["dataset_to_index"][episode_index]
+
+ # Get a single frame from that episode (e.g. the first frame)
+ frame_index = from_idx
+ frame = dict(dataset[frame_index])
+
+ batch = preprocess(frame)
+ with torch.inference_mode():
+     pred_action = policy.select_action(batch)
+
+ # Postprocess the action, e.g. unnormalize or detokenize it
+ pred_action = postprocess(pred_action)
  ```
+
+ ## Training step (loss + backward)
+
+ If you’re training / fine-tuning, you typically call `forward(...)` to get a loss and then backpropagate:
+
+ ```python
+ policy.train()
+ batch = dict(dataset[0])
+ batch = preprocess(batch)
+
+ loss, outputs = policy.forward(batch)
+ loss.backward()
+ ```
+
+ > Notes:
+ >
+ > - Some policies expose `policy(**batch)` or return a dict; keep this snippet aligned with the policy API.
+ > - Use your trainer script (`lerobot-train`) for full training loops.
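For intuition, the loss/backward step above plugs into an optimizer loop roughly as sketched below. This is generic PyTorch, with a hypothetical `DummyPolicy` standing in for a real LeRobot policy so the snippet is self-contained; `lerobot-train` handles this (plus logging, checkpointing, schedulers) for you.

```python
import torch
import torch.nn as nn

# Dummy stand-in so the loop runs anywhere; a real LeRobot policy's
# forward(batch) similarly returns (loss, outputs).
class DummyPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(4, 4)

    def forward(self, batch):
        pred = self.net(batch["observation"])
        return ((pred - batch["action"]) ** 2).mean(), {}

policy = DummyPolicy()
optimizer = torch.optim.AdamW(policy.parameters(), lr=1e-4)

policy.train()
for step in range(3):
    # In practice, batches come from a DataLoader over a LeRobotDataset.
    batch = {"observation": torch.randn(8, 4), "action": torch.randn(8, 4)}
    loss, _ = policy(batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```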
 
+
+ ## How to train / fine-tune
+
+ ```bash
+ lerobot-train \
+   --dataset.repo_id=${HF_USER}/<dataset> \
+   --output_dir=./outputs/[RUN_NAME] \
+   --job_name=[RUN_NAME] \
+   --policy.repo_id=${HF_USER}/<desired_policy_repo_id> \
+   --policy.path=lerobot/[BASE_CHECKPOINT] \
+   --policy.dtype=bfloat16 \
+   --policy.device=cuda \
+   --steps=100000 \
+   --batch_size=4
+ ```
+
+ Add policy-specific flags as needed:
+
+ - `--policy.chunk_size=...`
+ - `--policy.n_action_steps=...`
+ - `--policy.max_action_tokens=...`
+ - `--policy.gradient_checkpointing=true`
+
+ ## Real-World Inference & Evaluation
+
+ You can run inference and evaluate your policy with the [**`lerobot-record`**](https://github.com/huggingface/lerobot/blob/main/src/lerobot/scripts/lerobot_record.py) script by passing a policy checkpoint as input.
+
+ For instance, run this command to run inference and record evaluation episodes:
+
+ ```bash
+ lerobot-record \
+   --robot.type=so100_follower \
+   --robot.port=/dev/ttyACM1 \
+   --robot.cameras="{ up: {type: opencv, index_or_path: /dev/video10, width: 640, height: 480, fps: 30}, side: {type: intelrealsense, serial_number_or_name: 233522074606, width: 640, height: 480, fps: 30}}" \
+   --robot.id=my_awesome_follower_arm \
+   --display_data=false \
+   --dataset.repo_id=${HF_USER}/eval_so100 \
+   --dataset.single_task="Put lego brick into the transparent box" \
+   --policy.path=${HF_USER}/my_policy
+ # Teleoperation between episodes is optional; to enable it, also pass:
+ #   --teleop.type=so100_leader \
+ #   --teleop.port=/dev/ttyACM0 \
+ #   --teleop.id=my_awesome_leader_arm
+ ```