π0.5: SO-101 sort_blocks
Fine-tuned lerobot/pi05_base on 100 teleop episodes of the SO-101 sort_blocks task.
Model
- Architecture: π0.5 (PaliGemma-2B VLM + Gemma-300M action expert, flow matching, 10 inference steps)
- Cameras: base_0_rgb, left_wrist_0_rgb, right_wrist_0_rgb (224×224)
- State / Action dim: 32 (padded) / 6 (SO-101)
- Action chunk: 50
- dtype: bfloat16
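The "flow matching, 10 inference steps" entry above refers to producing an action by integrating a learned velocity field from noise in a fixed number of Euler steps. The sketch below illustrates that sampling loop only. `euler_sample` and `toy_velocity` are hypothetical names, the toy velocity field stands in for the action expert, and the time convention (t runs from 1 at pure noise down to 0) is an assumption, not something this card states.

```python
import random


def euler_sample(velocity_fn, noise, num_steps=10):
    """Integrate dx/dt = v(x, t) from t=1 (noise) to t=0 with fixed Euler steps."""
    x = list(noise)
    dt = -1.0 / num_steps  # assumed convention: step backwards from t=1 to t=0
    t = 1.0
    for _ in range(num_steps):
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        t += dt
    return x


# Toy 6-dim target action (illustrative values, not taken from the dataset).
target = [0.1, -0.2, 0.3, 0.0, 0.5, 1.0]
rng = random.Random(1000)
noise = [rng.gauss(0.0, 1.0) for _ in target]


def toy_velocity(x, t):
    # Stand-in for the action expert: the straight-line velocity (noise - target).
    return [n - a for n, a in zip(noise, target)]


sample = euler_sample(toy_velocity, noise, num_steps=10)
```

With this idealized constant velocity field, the loop lands on `target` up to floating-point error. A trained model only approximates that field, so the number of steps trades latency against integration accuracy.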
Camera key rename (dataset → policy):
observation.images.top → observation.images.base_0_rgb
observation.images.wrist → observation.images.left_wrist_0_rgb
right_wrist_0_rgb is an empty camera slot for this single-arm setup.
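The rename above can be sketched as a plain dictionary remap applied to each observation before it reaches the policy. `RENAME_MAP` and `rename_camera_keys` are hypothetical names for illustration, not a LeRobot API, and the placeholder string values stand in for image tensors.

```python
# Dataset key -> policy key, exactly as listed in this card.
RENAME_MAP = {
    "observation.images.top": "observation.images.base_0_rgb",
    "observation.images.wrist": "observation.images.left_wrist_0_rgb",
}


def rename_camera_keys(observation: dict) -> dict:
    """Return a copy of the observation with dataset camera keys renamed."""
    return {RENAME_MAP.get(key, key): value for key, value in observation.items()}


# Placeholder values stand in for image tensors and the state vector.
obs = {
    "observation.images.top": "top_image",
    "observation.images.wrist": "wrist_image",
    "observation.state": [0.0] * 6,
}
renamed = rename_camera_keys(obs)
```

Keys outside the map, such as `observation.state`, pass through unchanged. No `right_wrist_0_rgb` entry is created here, because how the empty slot is filled is not specified in this card.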
Action features (SO-101): shoulder_pan, shoulder_lift, elbow_flex, wrist_flex, wrist_roll, gripper (.pos).
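To make the 6-dimensional SO-101 vector and the 32-dimensional padded width concrete, the sketch below names each action component and pads a vector to width 32. `ACTION_NAMES`, `pad_vector`, and the sample values are illustrative, and zero-padding on the right is an assumption about how the padding is done.

```python
# Joint order as listed in this card, with the ".pos" suffix.
ACTION_NAMES = [
    "shoulder_pan.pos",
    "shoulder_lift.pos",
    "elbow_flex.pos",
    "wrist_flex.pos",
    "wrist_roll.pos",
    "gripper.pos",
]

PADDED_DIM = 32  # padded width reported in this card


def pad_vector(values, target_dim=PADDED_DIM):
    """Right-pad a vector with zeros up to target_dim (assumed padding scheme)."""
    if len(values) > target_dim:
        raise ValueError("vector is longer than the padded dimension")
    return list(values) + [0.0] * (target_dim - len(values))


# Made-up joint targets, for illustration only.
action = [10.0, -35.0, 42.5, 5.0, 0.0, 80.0]
named_action = dict(zip(ACTION_NAMES, action))
padded = pad_vector(action)
```

Only the first six entries carry robot values. When reading model output, slice back to `len(ACTION_NAMES)` before sending commands to the arm.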
Normalization: ACTION/STATE = MEAN_STD, VISUAL = IDENTITY.
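MEAN_STD normalization standardizes each state and action dimension with dataset statistics, while IDENTITY leaves images untouched. The sketch below shows the round trip. The function names, the `eps` term, and the statistics are all illustrative assumptions. Real statistics come from the dataset.

```python
def normalize_mean_std(values, mean, std, eps=1e-8):
    """Standardize each dimension: (x - mean) / (std + eps)."""
    return [(v - m) / (s + eps) for v, m, s in zip(values, mean, std)]


def unnormalize_mean_std(values, mean, std, eps=1e-8):
    """Invert normalize_mean_std to recover values in robot units."""
    return [v * (s + eps) + m for v, m, s in zip(values, mean, std)]


# Made-up per-joint statistics and a made-up action, for illustration only.
mean = [0.0, -30.0, 40.0, 10.0, 0.0, 50.0]
std = [20.0, 15.0, 15.0, 10.0, 25.0, 30.0]
action = [10.0, -35.0, 42.5, 5.0, 0.0, 80.0]

normalized = normalize_mean_std(action, mean, std)
restored = unnormalize_mean_std(normalized, mean, std)
```

The policy predicts in normalized space, so its outputs need the inverse transform before they are meaningful as joint targets.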
Data
CoRL2026-CSI/SO101-teleop_sort_blocks_100epi: 100 episodes, 94,568 frames, 30 fps, human teleop.
Training
| Setting | Value |
|---|---|
| Hardware | 4 × GPU (DDP, 🤗 Accelerate) |
| Per-device batch | 32 |
| Gradient accumulation | 2 |
| Effective global batch | 256 |
| Steps | 18,500 (~50 epochs) |
| Optimizer | AdamW, β=(0.9, 0.95), wd=0.01, grad clip 1.0 |
| LR | cosine decay, peak 2.5e-5 → 2.5e-6, warmup 1000, decay 30000 |
| Gradient checkpointing | on |
| Image aug | ColorJitter (brightness/contrast/saturation/hue), SharpnessJitter, RandomAffine; max_num=3, random order |
| Seed | 1000 |
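The effective global batch and the approximate epoch count in the table follow from the other reported numbers. A short check, using only values stated in this card (the variable names are illustrative):

```python
num_gpus = 4
per_device_batch = 32
grad_accum_steps = 2
train_steps = 18_500
dataset_frames = 94_568

# Samples consumed per optimizer step across all devices.
effective_batch = num_gpus * per_device_batch * grad_accum_steps

# Total samples seen during training, expressed in passes over the dataset.
samples_seen = train_steps * effective_batch
epochs = samples_seen / dataset_frames

print(f"effective batch: {effective_batch}")
print(f"epochs: {epochs:.2f}")
```

This treats each frame as one training sample, which is how the "~50 epochs" figure is interpreted here.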
Training script: scripts/train_pi05_sort_blocks.sh.
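For readers unfamiliar with the LR entry, the sketch below shows one common form of linear warmup followed by cosine decay, using the reported peak, floor, warmup, and decay values. It is an assumption-laden illustration: the function name is hypothetical, warmup is assumed to start from zero, and the scheduler used in training may differ (for example, by rescaling the schedule when training stops before the decay horizon).

```python
import math

PEAK_LR = 2.5e-5
END_LR = 2.5e-6
WARMUP_STEPS = 1000
DECAY_STEPS = 30000


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to END_LR at DECAY_STEPS."""
    if step < WARMUP_STEPS:
        # Assumed: warmup ramps linearly from 0.
        return PEAK_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - WARMUP_STEPS) / (DECAY_STEPS - WARMUP_STEPS))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return END_LR + (PEAK_LR - END_LR) * cosine


for step in (0, 1000, 18_500, 30_000):
    print(step, lr_at(step))
```

Under these assumptions, a run that stops at 18,500 steps ends with a learning rate between the floor and the peak, since the decay horizon is 30,000 steps.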
Usage
Load the policy in Python:
from lerobot.policies.pi05.modeling_pi05 import PI05Policy
policy = PI05Policy.from_pretrained("CoRL2026-CSI/pi05_teleop_sort_block").to("cuda").eval()
Evaluate from the command line, replacing <env> with your environment type:
lerobot-eval --policy.path=CoRL2026-CSI/pi05_teleop_sort_block --env.type=<env> --eval.n_episodes=20
Limitations
- Single task, single seed; no quantitative success rate reported here.
- Trained on a single-arm SO-101; the right-wrist camera slot is empty.
- 100 episodes only; sensitive to camera/lighting domain shift.
License
Apache 2.0 (inherits from lerobot/pi05_base).