pi05_real_pk_mixed
Fine-tuned pi0.5 vision-language-action (VLA) model for real robot manipulation.
Task
- Task: Pass Knife
- Training data: Mixed modes (left-hand + right-hand + sharp-end)
- Dataset: real_pass_knife_mixed
- Robot: Franka Panda (7-DOF)
- Cameras: Base RGB + Wrist RGB (256x256)
Training Configuration
| Parameter | Value |
|---|---|
| Base model | pi0.5 (PaliGemma 2B + Gemma 2B action expert) |
| Total parameters | ~3.35B |
| Action dimension | 32 |
| Action horizon | 10 |
| Batch size | 16 |
| Training steps | 5,000 |
| Learning rate | Cosine decay: warmup=500, peak=5e-5, end=5e-6 |
| Optimizer | AdamW (gradient clip norm=1.0) |
| Base weights | gs://openpi-assets/checkpoints/pi05_base/params |
| GPUs | 8x NVIDIA A100 |
| Normalization | Quantile normalization |
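The learning-rate row can be read as a linear warmup followed by a cosine decay. The following is a minimal sketch of that shape, assuming the warmup starts at 0 and the decay ends at the final training step (5,000); neither detail is stated in the table, and openpi's own schedule may differ, for example in the warmup start value. The function name `lr_at` is hypothetical.

```python
import math


def lr_at(step, warmup=500, peak=5e-5, end=5e-6, total=5000):
    """Learning rate at a given step (illustrative sketch, not openpi code).

    Assumptions: linear warmup from 0 to `peak` over `warmup` steps,
    then cosine decay from `peak` to `end` between `warmup` and `total`.
    """
    if step < warmup:
        # Linear warmup phase.
        return peak * step / warmup
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup) / (total - warmup))
    # Cosine factor goes from 1 (start of decay) to 0 (end of decay).
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return end + (peak - end) * cosine


if __name__ == "__main__":
    # Print the schedule at the steps used in the loss table.
    for s in (0, 500, 1000, 2500, 4500, 5000):
        print(s, lr_at(s))
```

Under these assumptions the rate reaches its peak of 5e-5 at step 500 and settles at 5e-6 by step 5,000.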
Included Checkpoints
- Step 4500: loss = 0.0037
- Step 4999: loss = 0.0035
Loss Curve
| Step | Loss |
|---|---|
| 0 | 0.0837 |
| 500 | 0.0155 |
| 1000 | 0.0133 |
| 1500 | 0.0107 |
| 2000 | 0.0084 |
| 2500 | 0.0078 |
| 3000 | 0.0068 |
| 3500 | 0.0055 |
| 4000 | 0.0043 |
| 4500 | 0.0037 |
| 4900 | 0.0035 |
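The logged values can also be checked programmatically. The sketch below copies the table as-is and computes the relative drop between the first and last logged steps; the helper names `relative_reduction` and `is_strictly_decreasing` are illustrative only and are not part of openpi.

```python
# Loss values copied from the table above (step -> loss).
loss_curve = {
    0: 0.0837,
    500: 0.0155,
    1000: 0.0133,
    1500: 0.0107,
    2000: 0.0084,
    2500: 0.0078,
    3000: 0.0068,
    3500: 0.0055,
    4000: 0.0043,
    4500: 0.0037,
    4900: 0.0035,
}


def relative_reduction(curve):
    """Fractional drop in loss from the first to the last logged step."""
    steps = sorted(curve)
    first, last = curve[steps[0]], curve[steps[-1]]
    return (first - last) / first


def is_strictly_decreasing(curve):
    """True if every logged loss is lower than the previous one."""
    values = [curve[s] for s in sorted(curve)]
    return all(b < a for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    print(f"relative reduction: {relative_reduction(loss_curve):.1%}")
    print(f"strictly decreasing: {is_strictly_decreasing(loss_curve)}")
```

For the values above, the loss falls by roughly 96% between step 0 and step 4900 and decreases at every logged step.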
Usage with openpi
# First register a config named "pi05_real_pk_mixed" in openpi's training config, then:
from openpi.training.config import get_config
config = get_config("pi05_real_pk_mixed")
# For inference, load the params checkpoint:
# checkpoint_path = "path/to/step_XXXX/params"
Part of Mode Editing Research
This checkpoint is part of the "Don't Filter Your Data, Edit Your Policy" project (CoRL 2026), investigating post-hoc behavior mode editing for robot policies using Classifier-Guided Distillation (CG-Distill).
- Mixed models are trained on demonstrations containing all behavioral modes
- Mode-specific models are trained on single-mode filtered data
- CG-Distill edited models (coming soon) use classifier gradients to steer mixed models toward specific modes at zero additional inference cost