Robotics
LeRobot
Safetensors
pi05
so101
imitation-learning
HyeonseokE committed on
Commit
13f5974
·
verified ·
1 Parent(s): e801777

Update model card: concise English version

Files changed (1)
  1. README.md +34 -139
README.md CHANGED
@@ -5,180 +5,75 @@ pipeline_tag: robotics
5
  tags:
6
  - lerobot
7
  - robotics
8
- - vla
9
- - pi0
10
  - pi05
11
  - so101
12
- - manipulation
13
  - imitation-learning
14
- - behavior-cloning
15
  datasets:
16
  - CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi
17
  base_model: lerobot/pi05_base
18
- language:
19
- - en
20
- model-index:
21
- - name: pi05_close_pot
22
- results: []
23
  ---
24
 
25
  # π0.5 — SO-101 `close_pot_lid`
26
 
27
- A π0.5 (PaliGemma-2B + Action Expert 300M) policy fine-tuned from `lerobot/pi05_base`
28
- on 100 episodes (57,173 frames) of teleoperation demos for the single task of
29
- **closing a pot lid (`close_pot_lid`)**, using the SO-101 dual-arm (top + left wrist) camera setup.
30
 
31
- Training code: [`scripts/train_pi05_close_pot_lid.sh`](https://github.com/HyeonseokE/train_with_lerobot/blob/main/scripts/train_pi05_close_pot_lid.sh)
32
- Framework: [LeRobot](https://github.com/huggingface/lerobot)
33
 
34
- ---
35
-
36
- ## Model overview
37
-
38
- | Item | Value |
39
- |---|---|
40
- | Architecture | π0.5 (PaliGemma-2B VLM + Gemma-300M action expert, flow-matching head) |
41
- | Base checkpoint | [`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base) |
42
- | Action chunk size | 50 |
43
- | Inference steps (flow-matching) | 10 |
44
- | Image resolution | 224 × 224 |
45
- | Cameras | `base_0_rgb`, `left_wrist_0_rgb`, `right_wrist_0_rgb` |
46
- | State dim (padded) | 32 |
47
- | Action dim (effective / padded) | **6** / 32 |
48
- | dtype | bfloat16 |
49
-
50
- ### Action / camera mapping
51
-
52
- Dataset → policy input key rename:
53
-
54
- ```
55
- observation.images.top → observation.images.base_0_rgb
56
- observation.images.wrist → observation.images.left_wrist_0_rgb
57
- ```
58
-
59
- > `right_wrist_0_rgb` is a model input slot, but it is treated as an empty camera on the single-arm SO-101.
60
-
61
- Action features (6 DoF, SO-101):
62
 
 
63
  ```
64
- shoulder_pan.pos
65
- shoulder_lift.pos
66
- elbow_flex.pos
67
- wrist_flex.pos
68
- wrist_roll.pos
69
- gripper.pos
70
  ```
 
71
 
72
- Normalization: `ACTION = MEAN_STD`, `STATE = MEAN_STD`, `VISUAL = IDENTITY`.
73
-
74
- ---
75
-
76
- ## Training data
77
 
78
- - **Dataset**: [`CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi`](https://huggingface.co/datasets/CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi)
79
- - **Episodes**: 100
80
- - **Total frames**: 57,173
81
- - **Robot / task**: SO-101, grasping the pot lid and closing it onto the pot
82
- - **Collection method**: human teleoperation
83
- - **Cameras**: top + wrist (both resized to 224 × 224)
84
 
85
- ### Image augmentation (during training)
86
 
87
- `max_num_transforms=3`, `random_order=True`, candidate transforms:
88
 
89
- | Transform | Parameters |
90
  |---|---|
91
- | ColorJitter brightness | `[0.8, 1.2]` |
92
- | ColorJitter contrast | `[0.8, 1.2]` |
93
- | ColorJitter saturation | `[0.5, 1.5]` |
94
- | ColorJitter hue | `[-0.05, 0.05]` |
95
- | SharpnessJitter | `[0.5, 1.5]` |
96
- | RandomAffine | degrees `[-5, 5]`, translate `[0.05, 0.05]` |
97
-
98
- ---
99
-
100
- ## Training configuration
101
-
102
- | Item | Value |
103
- |---|---|
104
- | Hardware | 4 × GPU (DDP via 🤗 Accelerate) |
105
- | Per-device batch size | 32 |
106
  | Gradient accumulation | 2 |
107
- | **Effective global batch** | **256** (32 × 4 × 2) |
108
- | Steps | 11,200 |
109
- | Epochs | ≈ 50 (`57,173 × 50 / 256 ≈ 11,167`) |
110
- | Optimizer | AdamW (β=(0.9, 0.95), eps=1e-8, wd=0.01) |
111
- | Peak LR | 2.5e-5 |
112
- | Decay LR | 2.5e-6 |
113
- | Scheduler | cosine decay, warmup 1000, decay 30000 |
114
- | Grad clip | 1.0 |
115
- | Mixed precision | none (bf16 native) |
116
  | Gradient checkpointing | on |
117
- | `compile_model` | off |
118
- | `freeze_vision_encoder` | off |
119
- | `train_expert_only` | off |
120
  | Seed | 1000 |
121
 
122
- Checkpoint: step 11,200 (end of training).
123
-
124
- ---
125
 
126
- ## Usage
127
-
128
- ### 1. Load the model
129
 
130
  ```python
131
  from lerobot.policies.pi05.modeling_pi05 import PI05Policy
132
 
133
- policy = PI05Policy.from_pretrained("CoRL2026-CSI/pi05_close_pot")
134
- policy.eval().to("cuda")
135
  ```
136
 
137
- ### 2. Inference (including pre-/post-processing pipelines)
138
-
139
- Use LeRobot's standard inference script:
140
-
141
  ```bash
142
- lerobot-eval \
143
- --policy.path=CoRL2026-CSI/pi05_close_pot \
144
- --env.type=<your_env> \
145
- --eval.n_episodes=20
146
  ```
147
 
148
- Alternatively, for real-time robot control, follow the same pattern as
149
- [`scripts/infer_smolvla.py`](https://github.com/HyeonseokE/train_with_lerobot/blob/main/scripts/infer_smolvla.py) in the repository,
150
- swapping in `pi05`.
151
-
152
- ### 3. Camera key caveat
153
 
154
- The training dataset's `observation.images.top` / `.wrist` keys were renamed to the policy inputs
155
- `base_0_rgb` / `left_wrist_0_rgb`. When using the policy in another environment,
156
- convert to the same keys or pass the `--rename_map` argument.
157
-
158
- ---
159
 
160
- ## Limitations and recommendations
161
 
162
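- - **Single task / single seed**: no generalization guarantee outside the distribution of the 100 `close_pot_lid` episodes.
163
- - **Assumes a single-arm SO-101**: `right_wrist_0_rgb` was trained as an empty camera, so other dual-arm setups require retraining.
164
- - **Sensitivity to camera placement / lighting**: trained on only 100 episodes plus image augmentation; performance may degrade under large domain shift.
165
- - **No quantitative evaluation included**: this card contains no real-robot or simulation success rates. Run your own evaluation before use.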
166
-
167
- ---
168
-
169
- ## License
170
-
171
- Apache 2.0 (follows the license of the base model [`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base)).
172
-
173
- ## Citation
174
-
175
- LeRobot project:
176
-
177
- ```bibtex
178
- @misc{cadene2024lerobot,
179
- author = {Cadene, Remi and Alibert, Simon and Soare, Alexander and Gallouedec, Quentin and Zouitine, Adil and Wolf, Thomas},
180
- title = {LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch},
181
- howpublished = "\url{https://github.com/huggingface/lerobot}",
182
- year = {2024}
183
- }
184
- ```
 
5
  tags:
6
  - lerobot
7
  - robotics
 
 
8
  - pi05
9
  - so101
 
10
  - imitation-learning
 
11
  datasets:
12
  - CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi
13
  base_model: lerobot/pi05_base
 
 
 
 
 
14
  ---
15
 
16
  # π0.5 — SO-101 `close_pot_lid`
17
 
18
+ Fine-tuned [`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base) on 100 teleop episodes of the SO-101 `close_pot_lid` task.
 
 
19
 
20
+ ## Model
 
21
 
22
+ - **Architecture**: π0.5 (PaliGemma-2B VLM + Gemma-300M action expert, flow matching, 10 inference steps)
23
+ - **Cameras**: `base_0_rgb`, `left_wrist_0_rgb`, `right_wrist_0_rgb` (224×224)
24
+ - **State / Action dim**: 32 (padded) / 6 (SO-101)
25
+ - **Action chunk**: 50
26
+ - **dtype**: bfloat16
 
27
 
28
+ Camera key rename (dataset → policy):
29
  ```
30
+ observation.images.top → observation.images.base_0_rgb
31
+ observation.images.wrist → observation.images.left_wrist_0_rgb
 
 
 
 
32
  ```
33
+ `right_wrist_0_rgb` is an empty camera slot for this single-arm setup.
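
The rename above can be pictured as a plain dictionary remap. The following is a minimal sketch, not LeRobot's own implementation: the helper `rename_observation_keys` and the sample observation are hypothetical, and only the two key pairs come from this card.

```python
# Key pairs copied from the card; everything else here is illustrative.
RENAME_MAP = {
    "observation.images.top": "observation.images.base_0_rgb",
    "observation.images.wrist": "observation.images.left_wrist_0_rgb",
}


def rename_observation_keys(observation, rename_map=RENAME_MAP):
    """Return a new dict with camera keys renamed; other keys pass through unchanged."""
    return {rename_map.get(key, key): value for key, value in observation.items()}


# Hypothetical observation: strings stand in for image tensors.
obs = {
    "observation.images.top": "top_frame",
    "observation.images.wrist": "wrist_frame",
    "observation.state": [0.0] * 6,
}
renamed = rename_observation_keys(obs)
print(sorted(renamed))
```

Keys missing from the map (here `observation.state`) are left untouched, so the same helper can be applied to a full observation dict.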
34
 
35
+ Action features (SO-101): `shoulder_pan, shoulder_lift, elbow_flex, wrist_flex, wrist_roll, gripper` (`.pos`).
36
+ Normalization: `ACTION/STATE = MEAN_STD`, `VISUAL = IDENTITY`.
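
As a rough illustration of what `MEAN_STD` normalization and the 6 → 32 padding mean for an SO-101 action vector, here is a pure-Python sketch. The statistics, the `eps` term, the zero-padding, and the normalize-then-pad order are all assumptions made for the example; the actual values and processing order live in the checkpoint and in LeRobot's processors.

```python
ACTION_NAMES = [
    "shoulder_pan.pos", "shoulder_lift.pos", "elbow_flex.pos",
    "wrist_flex.pos", "wrist_roll.pos", "gripper.pos",
]
PADDED_DIM = 32  # padded dimension stated in the card


def normalize_mean_std(values, mean, std, eps=1e-8):
    """Standardize each component: (x - mean) / (std + eps). eps is an assumption."""
    return [(v - m) / (s + eps) for v, m, s in zip(values, mean, std)]


def pad_to(values, dim=PADDED_DIM):
    """Right-pad with zeros up to `dim` (zero-padding is assumed here)."""
    if len(values) > dim:
        raise ValueError(f"vector of length {len(values)} exceeds padded dim {dim}")
    return list(values) + [0.0] * (dim - len(values))


# Hypothetical statistics and action; real ones come from the dataset stats.
mean = [0.0] * len(ACTION_NAMES)
std = [1.0] * len(ACTION_NAMES)
action = [10.0, -20.0, 30.0, 5.0, 0.0, 50.0]

padded = pad_to(normalize_mean_std(action, mean, std))
print(len(padded))
```

Only the first six entries carry information; the remaining 26 slots are padding.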
 
 
 
37
 
38
+ ## Data
 
 
 
 
 
39
 
40
+ [`CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi`](https://huggingface.co/datasets/CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi): 100 episodes, 57,173 frames, human teleop.
41
 
42
+ ## Training
43
 
44
+ | | |
45
  |---|---|
46
+ | Hardware | 4 × GPU (DDP, 🤗 Accelerate) |
47
+ | Per-device batch | 32 |
 
48
  | Gradient accumulation | 2 |
49
+ | Effective global batch | 256 |
50
+ | Steps | 11,200 (~50 epochs) |
51
+ | Optimizer | AdamW, β=(0.9, 0.95), wd=0.01, grad clip 1.0 |
52
+ | LR | cosine decay, peak 2.5e-5 → 2.5e-6, warmup 1000, decay 30000 |
 
 
 
 
 
53
  | Gradient checkpointing | on |
54
+ | Image aug | ColorJitter (brightness/contrast/saturation/hue), SharpnessJitter, RandomAffine; `max_num_transforms=3`, random order |
 
 
55
  | Seed | 1000 |
56
 
57
+ Training script: [`scripts/train_pi05_close_pot_lid.sh`](https://github.com/HyeonseokE/train_with_lerobot/blob/main/scripts/train_pi05_close_pot_lid.sh).
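
The batch-size and epoch figures in the table can be checked with a few lines of arithmetic. This sketch assumes that one training step consumes one effective global batch; the variable names are illustrative and do not come from the training script.

```python
import math

per_device_batch = 32
num_gpus = 4
grad_accum = 2
total_frames = 57_173
steps = 11_200

# Effective global batch: per-device batch × devices × accumulation steps.
effective_batch = per_device_batch * num_gpus * grad_accum

# Samples seen over the whole run, and the equivalent number of dataset passes.
samples_seen = steps * effective_batch
epochs = samples_seen / total_frames

# Steps needed for exactly 50 passes over the dataset.
steps_for_50_epochs = math.ceil(total_frames * 50 / effective_batch)

print(effective_batch, round(epochs, 2), steps_for_50_epochs)
```

Under that assumption, 11,200 steps is slightly more than 50 epochs, which matches the "~50 epochs" entry in the table.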
 
 
58
 
59
+ ## Usage
 
 
60
 
61
  ```python
62
  from lerobot.policies.pi05.modeling_pi05 import PI05Policy
63
 
64
+ policy = PI05Policy.from_pretrained("CoRL2026-CSI/pi05_close_pot").to("cuda").eval()
 
65
  ```
66
 
 
 
 
 
67
  ```bash
68
+ lerobot-eval --policy.path=CoRL2026-CSI/pi05_close_pot --env.type=<env> --eval.n_episodes=20
 
 
 
69
  ```
70
 
71
+ ## Limitations
 
 
 
 
72
 
73
+ - Single task, single seed; no quantitative success rate reported here.
74
+ - Trained on a single-arm SO-101; the right-wrist camera slot is empty.
75
+ - 100 episodes only; sensitive to camera/lighting domain shift.
 
 
76
 
77
+ ## License
78
 
79
+ Apache 2.0 (inherits from [`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base)).