# pi0.5 fine-tuned on zhuoKCL/prgvla_sorting
Single-task pi0.5 (JAX) fine-tune for vegetable sorting on a Franka.
Trained from `gs://openpi-assets/checkpoints/pi05_droid/params` for 60,000 steps using openpi's `pi05_droid_finetune` recipe with our own re-computed norm stats.
- Step: 59 999 (final)
- Final loss: ~0.001 (flow-matching MSE)
- Action: horizon=16, dim=32 (pi0.5 standard)
- State: 8-dim Franka (joint_position 7 + gripper 1)
- Cameras (from DROID layout):
  - `base_0_rgb` ← `ext_1`
  - `left_wrist_0_rgb` ← `wrist`
  - `right_wrist_0_rgb` ← zeros (mask=False)
- Prompt: per-episode natural-language synonym from `tasks.jsonl` (no fixed phrase)
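As a concrete illustration of the state layout above, here is a minimal numpy sketch (dummy values) of how the 7 joint angles plus the gripper scalar form the 8-dim state; the zero-padding to the model's 32-dim width reflects openpi's usual convention of padding state up to `action_dim`, and is an assumption here, not something this checkpoint documents:

```python
import numpy as np

# Illustrative only: assemble the 8-dim Franka state (7 joints + 1 gripper)
# and zero-pad it to the model's 32-dim state/action width.
joint_position = np.zeros(7, dtype=np.float32)   # 7 arm joint angles (dummy)
gripper_position = np.float32(0.5)               # scalar gripper opening (dummy)

state = np.concatenate([joint_position, [gripper_position]])  # shape (8,)
padded_state = np.pad(state, (0, 32 - state.shape[0]))        # shape (32,)
```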
## Files
| Path | Purpose |
|---|---|
| `params/` | orbax checkpoint, JAX params (12 GB) |
| `assets/zhuoKCL/prgvla_sorting/norm_stats.json` | q01/q99 quantile norm stats (pi05 standard) |
| `norm_stats.json` | same file copied to root for quick inspection |
`train_state/` (optimizer state, ~30 GB) is not included; inference does not need it.
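For intuition about what the norm stats contain, here is a hedged numpy sketch of q01/q99 quantile normalization. It assumes the common convention of mapping the [q01, q99] range to [-1, 1]; openpi's exact code path may differ, so treat this as illustrative only:

```python
import numpy as np

# Sketch of q01/q99 quantile normalization (assumed [-1, 1] mapping).
# Dummy action data standing in for the real dataset statistics.
actions = np.random.default_rng(0).normal(size=(1000, 8))

q01 = np.quantile(actions, 0.01, axis=0)  # 1st-percentile per dimension
q99 = np.quantile(actions, 0.99, axis=0)  # 99th-percentile per dimension

# Values at q01 map to -1, values at q99 map to +1.
normalized = 2.0 * (actions - q01) / (q99 - q01) - 1.0
```

Quantile (rather than mean/std) stats make the normalization robust to the occasional outlier action in teleop data.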
## Use it from openpi
In your local copy of openpi, edit `src/openpi/training/config.py` → the `pi05_droid_finetune` entry:
```python
TrainConfig(
    name="pi05_droid_finetune",
    model=pi0_config.Pi0Config(pi05=True, action_dim=32, action_horizon=16),
    data=LeRobotDROIDDataConfig(
        repo_id="zhuoKCL/prgvla_sorting",  # 1) was: lerobot's droid repo
        base_config=DataConfig(prompt_from_task=True),
        # 2) remove / comment out any AssetsConfig(asset_id="droid", ...) line
    ),
    weight_loader=weight_loaders.CheckpointWeightLoader(
        "<path-to-this-repo>/params"  # or huggingface-cli download
    ),
    num_train_steps=60_000,
),
```
Then run inference exactly as upstream does.
## Inference contract (observation dict)
```python
obs = {
    "observation/joint_position": np.ndarray,         # shape (7,)
    "observation/gripper_position": float,            # scalar, becomes 1-d
    "observation/exterior_image_1_left": np.ndarray,  # uint8 H×W×3 → base_0_rgb
    "observation/wrist_image_left": np.ndarray,       # uint8 H×W×3 → left_wrist_0_rgb
    "prompt": "<natural language sentence>",
}
```
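A minimal, runnable example of an observation dict satisfying this contract, with dummy data. The 224×224 resolution and the policy call in the trailing comment are assumptions for illustration, not something this checkpoint pins down:

```python
import numpy as np

# Dummy observation matching the inference contract above.
obs = {
    "observation/joint_position": np.zeros(7, dtype=np.float32),
    "observation/gripper_position": 0.0,
    "observation/exterior_image_1_left": np.zeros((224, 224, 3), dtype=np.uint8),
    "observation/wrist_image_left": np.zeros((224, 224, 3), dtype=np.uint8),
    "prompt": "put the carrot in the left bin",
}

# With a loaded openpi policy (e.g. via policy_config.create_trained_policy),
# inference would look roughly like:
#   actions = policy.infer(obs)["actions"]  # expected shape: (16, 32)
```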
## Training data
`zhuoKCL/prgvla_sorting` (LeRobot v2.1).