CoRL2026-CSI/pi05_teleop_close_pot

@@ -5,180 +5,75 @@ pipeline_tag: robotics
 tags:
   - lerobot
   - robotics
-  - vla
-  - pi0
   - pi05
   - so101
-  - manipulation
   - imitation-learning
-  - behavior-cloning
 datasets:
   - CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi
 base_model: lerobot/pi05_base
-language:
-  - en
-model-index:
-  - name: pi05_close_pot
-    results: []
 ---
 # π0.5 — SO-101 `close_pot_lid`
-A π0.5 policy (PaliGemma-2B + 300M action expert) fine-tuned from `lerobot/pi05_base` on 100 episodes (57,173 frames)
-of teleoperated demonstrations for the single task **closing a pot lid (`close_pot_lid`)**,
-collected on a single-arm SO-101 with a two-camera (top + left wrist) setup.
-Training code: [`scripts/train_pi05_close_pot_lid.sh`](https://github.com/HyeonseokE/train_with_lerobot/blob/main/scripts/train_pi05_close_pot_lid.sh)
-Framework: [LeRobot](https://github.com/huggingface/lerobot)
----
-## Model overview
-| Item | Value |
-|---|---|
-| Architecture | π0.5 (PaliGemma-2B VLM + Gemma-300M action expert, flow-matching head) |
-| Base checkpoint | [`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base) |
-| Action chunk size | 50 |
-| Inference steps (flow-matching) | 10 |
-| Image resolution | 224 × 224 |
-| Cameras | `base_0_rgb`, `left_wrist_0_rgb`, `right_wrist_0_rgb` |
-| State dim (padded) | 32 |
-| Action dim (effective / padded) | **6** / 32 |
-| dtype | bfloat16 |
-### Action / camera mapping
-Dataset → policy input key rename:
-```
-observation.images.top   →  observation.images.base_0_rgb
-observation.images.wrist →  observation.images.left_wrist_0_rgb
-```
-> `right_wrist_0_rgb` is a model input slot, but on the single-arm SO-101 it is treated as an empty camera.
-Action features (6 DoF, SO-101):
 ```
-shoulder_pan.pos
-shoulder_lift.pos
-elbow_flex.pos
-wrist_flex.pos
-wrist_roll.pos
-gripper.pos
 ```
-Normalization: `ACTION = MEAN_STD`, `STATE = MEAN_STD`, `VISUAL = IDENTITY`.
----
-## Training data
-- **Dataset**: [`CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi`](https://huggingface.co/datasets/CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi)
-- **Episodes**: 100
-- **Total frames**: 57,173
-- **Robot / task**: SO-101, grasp the pot lid and close it onto the pot body
-- **Collection method**: human teleoperation
-- **Cameras**: top + wrist (both resized to 224 × 224)
-### Image augmentation (training only)
-`max_num_transforms=3`, `random_order=True`, candidates:
-| Transform | Parameters |
 |---|---|
-| ColorJitter brightness | `[0.8, 1.2]` |
-| ColorJitter contrast | `[0.8, 1.2]` |
-| ColorJitter saturation | `[0.5, 1.5]` |
-| ColorJitter hue | `[-0.05, 0.05]` |
-| SharpnessJitter | `[0.5, 1.5]` |
-| RandomAffine | degrees `[-5, 5]`, translate `[0.05, 0.05]` |
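The augmentation settings above can be read as "for each frame, pick a small subset of the candidate transforms and apply them in a shuffled order". The following is a minimal sketch of that selection step only, assuming uniform sampling without replacement; `CANDIDATES` and `sample_transforms` are hypothetical names, and this is not LeRobot's implementation (which may weight candidates differently).

```python
import random

# Candidate names mirror the rows of the table above (illustrative labels only).
CANDIDATES = ["brightness", "contrast", "saturation", "hue", "sharpness", "affine"]


def sample_transforms(rng, max_num_transforms=3, random_order=True):
    """Pick min(max_num_transforms, len(CANDIDATES)) distinct transforms.

    Assumption: candidates are sampled uniformly without replacement.
    """
    n = min(max_num_transforms, len(CANDIDATES))
    chosen = rng.sample(range(len(CANDIDATES)), n)
    if not random_order:
        chosen.sort()  # fall back to the fixed table order
    return [CANDIDATES[i] for i in chosen]


rng = random.Random(1000)  # seed value reused from the card for reproducibility
picked = sample_transforms(rng)
print(picked)  # three distinct transform names, in sampled order
```

Only the subset selection is shown; applying the actual image operations would need an image library and is left out.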
----
-## Training configuration
-| Item | Value |
-|---|---|
-| Hardware | 4 × GPU (DDP via 🤗 Accelerate) |
-| Per-device batch size | 32 |
 | Gradient accumulation | 2 |
-| **Effective global batch** | **256** (32 × 4 × 2) |
-| Steps | 11,200 |
-| ≈ Epochs | 50 (`57,173 × 50 / 256 ≈ 11,167`) |
-| Optimizer | AdamW (β=(0.9, 0.95), eps=1e-8, wd=0.01) |
-| Peak LR | 2.5e-5 |
-| Decay LR | 2.5e-6 |
-| Scheduler | cosine decay, warmup 1000, decay 30000 |
-| Grad clip | 1.0 |
-| Mixed precision | none (bf16 native) |
 | Gradient checkpointing | on |
-| `compile_model` | off |
-| `freeze_vision_encoder` | off |
-| `train_expert_only` | off |
 | Seed | 1000 |
-Checkpoint: step 11,200 (end of training).
----
-## Usage
-### 1. Load the model
 ```python
 from lerobot.policies.pi05.modeling_pi05 import PI05Policy
-policy = PI05Policy.from_pretrained("CoRL2026-CSI/pi05_close_pot")
-policy.eval().to("cuda")
 ```
-### 2. Inference (including the pre-/post-processing pipeline)
-Use LeRobot's standard inference script:
 ```bash
-lerobot-eval \
-    --policy.path=CoRL2026-CSI/pi05_close_pot \
-    --env.type=<your_env> \
-    --eval.n_episodes=20
 ```
-Alternatively, for real-time robot control, follow the same pattern as the repository's
-[`scripts/infer_smolvla.py`](https://github.com/HyeonseokE/train_with_lerobot/blob/main/scripts/infer_smolvla.py)
-and swap in `pi05`.
-### 3. Camera key caveat
-During training, the dataset keys `observation.images.top` / `.wrist` were renamed to the policy inputs
-`base_0_rgb` / `left_wrist_0_rgb`. In other environments,
-convert to the same keys or use the `--rename_map` argument.
----
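-## Limitations and recommendations
-- **Single task / single seed**: no generalization guarantee outside the distribution of the 100 `close_pot_lid` episodes.
-- **Single-arm (SO-101) assumption**: `right_wrist_0_rgb` was trained as an empty camera, so dual-arm setups require retraining.
-- **Sensitivity to camera placement and lighting**: trained on only 100 episodes plus image augmentation, so performance may degrade under large domain shift.
-- **No quantitative evaluation**: this card does not include real-robot or simulation success rates. Run your own evaluation before use.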
----
-## License
-Apache 2.0 (follows the license of the base model [`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base)).
-## Citation
-LeRobot project:
-```bibtex
-@misc{cadene2024lerobot,
-    author = {Cadene, Remi and Alibert, Simon and Soare, Alexander and Gallouedec, Quentin and Zouitine, Adil and Wolf, Thomas},
-    title = {LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch},
-    howpublished = "\url{https://github.com/huggingface/lerobot}",
-    year = {2024}
-}
-```

 tags:
   - lerobot
   - robotics
   - pi05
   - so101
   - imitation-learning
 datasets:
   - CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi
 base_model: lerobot/pi05_base
 ---
 # π0.5 — SO-101 `close_pot_lid`
+Fine-tuned [`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base) on 100 teleop episodes of the SO-101 `close_pot_lid` task.
+## Model
+- **Architecture**: π0.5 (PaliGemma-2B VLM + Gemma-300M action expert, flow matching, 10 inference steps)
+- **Cameras**: `base_0_rgb`, `left_wrist_0_rgb`, `right_wrist_0_rgb` (224×224)
+- **State dim**: 32 (padded)
+- **Action dim**: 6 effective (SO-101), padded to 32
+- **Action chunk**: 50
+- **dtype**: bfloat16
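The dimensions listed above imply that the 6 SO-101 joint values are embedded in a 32-wide vector. A minimal sketch of that bookkeeping follows, assuming zero-padding at the tail and that the first 6 entries map back to the joints in the order the card lists them; `pad_vector` and `unpad_action` are hypothetical helpers, not LeRobot functions.

```python
# Joint order as listed for the SO-101 action features on this card.
JOINTS = ["shoulder_pan", "shoulder_lift", "elbow_flex",
          "wrist_flex", "wrist_roll", "gripper"]
PADDED_DIM = 32


def pad_vector(values, dim=PADDED_DIM):
    """Right-pad a short state/action vector with zeros (assumed convention)."""
    if len(values) > dim:
        raise ValueError(f"vector of length {len(values)} exceeds padded dim {dim}")
    return list(values) + [0.0] * (dim - len(values))


def unpad_action(padded, joints=JOINTS):
    """Keep only the leading entries and label them with `<joint>.pos` keys."""
    return {f"{name}.pos": padded[i] for i, name in enumerate(joints)}


state = [0.1, -0.2, 0.3, 0.0, 0.5, 1.0]   # made-up joint values
padded = pad_vector(state)
print(len(padded))                         # 32
print(unpad_action(padded)["gripper.pos"])  # 1.0
```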
+Camera key rename (dataset → policy):
 ```
+observation.images.top   → observation.images.base_0_rgb
+observation.images.wrist → observation.images.left_wrist_0_rgb
 ```
+`right_wrist_0_rgb` is an empty camera slot for this single-arm setup.
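When feeding observations from another environment by hand, the rename above amounts to a dictionary key remap. A minimal sketch, assuming observations arrive as a flat `dict`; `RENAME_MAP` and `rename_observation_keys` are hypothetical names used only for illustration.

```python
# Mapping copied from the rename listed above (dataset key -> policy key).
RENAME_MAP = {
    "observation.images.top": "observation.images.base_0_rgb",
    "observation.images.wrist": "observation.images.left_wrist_0_rgb",
}


def rename_observation_keys(obs, rename_map=RENAME_MAP):
    """Return a new dict with mapped keys renamed and all other keys untouched."""
    return {rename_map.get(key, key): value for key, value in obs.items()}


obs = {
    "observation.images.top": "top_frame",      # placeholder for an image tensor
    "observation.images.wrist": "wrist_frame",  # placeholder for an image tensor
    "observation.state": [0.0] * 6,
}
print(sorted(rename_observation_keys(obs)))
```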
+Action features (SO-101): `shoulder_pan, shoulder_lift, elbow_flex, wrist_flex, wrist_roll, gripper` (`.pos`).
+Normalization: `ACTION/STATE = MEAN_STD`, `VISUAL = IDENTITY`.
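`MEAN_STD` normalization is the usual per-dimension standardization, and `IDENTITY` leaves the input unchanged. A minimal sketch follows, assuming per-dimension statistics and a small epsilon for numerical safety; the statistics below are placeholders, not values from the dataset.

```python
def normalize(x, mean, std, eps=1e-8):
    """MEAN_STD: subtract the mean and divide by the standard deviation."""
    return [(xi - m) / (s + eps) for xi, m, s in zip(x, mean, std)]


def unnormalize(z, mean, std, eps=1e-8):
    """Inverse of `normalize`, used to map policy outputs back to joint units."""
    return [zi * (s + eps) + m for zi, m, s in zip(z, mean, std)]


# Placeholder statistics for a 6-dim action (NOT taken from the dataset).
mean = [0.0, 10.0, -5.0, 2.0, 0.5, 20.0]
std = [1.0, 4.0, 2.0, 0.5, 0.1, 10.0]

action = [0.5, 12.0, -6.0, 2.5, 0.4, 35.0]
z = normalize(action, mean, std)
restored = unnormalize(z, mean, std)
print(restored)  # matches `action` up to floating-point error
```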
+## Data
+[`CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi`](https://huggingface.co/datasets/CoRL2026-CSI/SO101-teleop_close_pot_lid_100epi) — 100 episodes, 57,173 frames, human teleop.
+## Training
+| Setting | Value |
 |---|---|
+| Hardware | 4 × GPU (DDP, 🤗 Accelerate) |
+| Per-device batch | 32 |
 | Gradient accumulation | 2 |
+| Effective global batch | 256 |
+| Steps | 11,200 (~50 epochs) |
+| Optimizer | AdamW, β=(0.9, 0.95), wd=0.01, grad clip 1.0 |
+| LR | cosine decay, peak 2.5e-5 → 2.5e-6, warmup 1000, decay 30000 |
 | Gradient checkpointing | on |
+| Image aug | ColorJitter (brightness/contrast/saturation/hue), SharpnessJitter, RandomAffine; `max_num_transforms=3`, random order |
 | Seed | 1000 |
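The batch and step figures in the table can be cross-checked with a few lines of arithmetic. This sketch assumes one optimizer step consumes one effective global batch and that an epoch is one pass over all 57,173 frames.

```python
import math

per_device_batch = 32
num_gpus = 4
grad_accum = 2
total_frames = 57_173
train_steps = 11_200

# Samples consumed per optimizer step across all devices.
effective_batch = per_device_batch * num_gpus * grad_accum

# Steps needed for 50 full passes, and passes actually covered by 11,200 steps.
steps_for_50_epochs = math.ceil(total_frames * 50 / effective_batch)
epochs_covered = train_steps * effective_batch / total_frames

print(effective_batch)           # 256
print(steps_for_50_epochs)       # 11167
print(round(epochs_covered, 2))  # 50.15
```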
+Training script: [`scripts/train_pi05_close_pot_lid.sh`](https://github.com/HyeonseokE/train_with_lerobot/blob/main/scripts/train_pi05_close_pot_lid.sh).
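To make the LR row concrete, here is a generic warmup-plus-cosine schedule using the listed numbers. It is a sketch under stated assumptions (linear warmup from 0, cosine decay measured from the end of warmup to step 30,000), not the scheduler code used by the training script; `lr_at` is a hypothetical helper.

```python
import math


def lr_at(step, peak=2.5e-5, final=2.5e-6, warmup=1000, decay=30000):
    """Generic linear-warmup + cosine-decay schedule (illustrative only)."""
    if step < warmup:
        return peak * step / warmup  # assumed: warmup starts from 0
    # Fraction of the decay window completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup) / (decay - warmup))
    return final + 0.5 * (peak - final) * (1.0 + math.cos(math.pi * progress))


for step in (0, 1000, 11200, 30000):
    print(step, lr_at(step))
```

Under these assumptions, training stops at step 11,200, well before the 30,000-step decay horizon, so the final learning rate is still far above 2.5e-6. A scheduler that rescales the decay horizon to the actual training length would behave differently.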
+## Usage
 ```python
 from lerobot.policies.pi05.modeling_pi05 import PI05Policy
+policy = PI05Policy.from_pretrained("CoRL2026-CSI/pi05_close_pot").to("cuda").eval()
 ```
 ```bash
+lerobot-eval --policy.path=CoRL2026-CSI/pi05_close_pot --env.type=<env> --eval.n_episodes=20
 ```
+## Limitations
+- Single task, single seed; no quantitative success rate reported here.
+- Trained on a single-arm SO-101; the right-wrist camera slot is empty.
+- 100 episodes only — sensitive to camera/lighting domain shift.
+## License
+Apache 2.0 (inherits from [`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base)).