spikefly/pi05_prgvla_sorting

+---
+license: apache-2.0
+tags:
+  - robotics
+  - vla
+  - openpi
+  - pi0.5
+  - franka
+library_name: openpi
+---
+# pi0.5 fine-tuned on `zhuoKCL/prgvla_sorting`
+Single-task pi0.5 (JAX) fine-tune for **vegetable sorting** on a Franka arm.
+Fine-tuned from `gs://openpi-assets/checkpoints/pi05_droid/params` for 60 000 steps using `openpi`'s `pi05_droid_finetune` recipe, with norm stats re-computed for this dataset.
+- Step: **59 999** (final; steps are zero-indexed, so this is the end of the 60 000-step run)
+- Final loss: ~0.001 (flow-matching MSE)
+- Action: `(horizon=16, dim=32)`, pi0.5 standard
+- State: 8-dim (7 Franka joint positions + 1 gripper position)
+- Cameras (from DROID layout):
+  - `base_0_rgb` ← `ext_1`
+  - `left_wrist_0_rgb` ← `wrist`
+  - `right_wrist_0_rgb` ← zeros (mask=False)
+- Prompt: a per-episode natural-language paraphrase of the task, taken from `tasks.jsonl` (there is no single fixed phrase)
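+The layout above can be sketched in NumPy as follows. This is an illustration only, not `openpi`'s actual transform code: the helper name `to_model_inputs` and the `state` / `image` / `image_mask` output keys are assumptions, and both images are assumed to already share the same H×W. The input keys are the ones listed under the inference contract.
+```python
+import numpy as np
+
+def to_model_inputs(obs: dict) -> dict:
+    """Hypothetical helper: map a DROID-style observation onto the three camera slots."""
+    # 7 joint positions + 1 gripper scalar -> 8-dim state vector.
+    state = np.concatenate([
+        np.asarray(obs["observation/joint_position"], dtype=np.float32),
+        np.atleast_1d(np.asarray(obs["observation/gripper_position"], dtype=np.float32)),
+    ])
+    base = np.asarray(obs["observation/exterior_image_1_left"])
+    wrist = np.asarray(obs["observation/wrist_image_left"])
+    return {
+        "state": state,
+        "image": {
+            "base_0_rgb": base,                         # from ext_1
+            "left_wrist_0_rgb": wrist,                  # from wrist
+            "right_wrist_0_rgb": np.zeros_like(base),   # unused slot, all zeros
+        },
+        # The mask marks which camera slots carry real pixels.
+        "image_mask": {
+            "base_0_rgb": True,
+            "left_wrist_0_rgb": True,
+            "right_wrist_0_rgb": False,
+        },
+        "prompt": obs["prompt"],
+    }
+```
+The zero-filled right-wrist image paired with `mask=False` lets a two-camera setup occupy a three-slot input without the model treating the empty slot as a real view.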
+## Files
+| Path | Purpose |
+|---|---|
+| `params/` | Orbax checkpoint holding the JAX params (12 GB) |
+| `assets/zhuoKCL/prgvla_sorting/norm_stats.json` | q01/q99 quantile norm stats (pi05 standard) |
+| `norm_stats.json` | copy of the same file at the repo root, for quick inspection |
+`train_state/` (optimizer state, ~30 GB) is **not** included, since inference does not need it.
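+For orientation, quantile normalization with q01/q99 statistics is commonly a linear map that sends the `[q01, q99]` range to `[-1, 1]`. The sketch below assumes that convention and is meant to be run on made-up numbers; the function names and the `eps` guard are illustrative, so check `openpi`'s own transforms for the exact formula it applies.
+```python
+import numpy as np
+
+def quantile_normalize(x, q01, q99, eps=1e-6):
+    """Map values so that q01 -> -1 and q99 -> ~+1 (assumed convention)."""
+    x, q01, q99 = (np.asarray(a, dtype=np.float32) for a in (x, q01, q99))
+    return (x - q01) / (q99 - q01 + eps) * 2.0 - 1.0
+
+def quantile_unnormalize(y, q01, q99, eps=1e-6):
+    """Inverse of quantile_normalize, e.g. for turning model outputs back into raw actions."""
+    y, q01, q99 = (np.asarray(a, dtype=np.float32) for a in (y, q01, q99))
+    return (y + 1.0) / 2.0 * (q99 - q01 + eps) + q01
+```
+Values outside the 1st to 99th percentile range fall slightly outside `[-1, 1]`. Unlike min/max scaling, this keeps rare outliers from compressing the rest of the data.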
+## Use it from `openpi`
+In your local copy of `openpi`, edit the `pi05_droid_finetune` entry in `src/openpi/training/config.py`:
+```python
+TrainConfig(
+    name="pi05_droid_finetune",
+    model=pi0_config.Pi0Config(pi05=True, action_dim=32, action_horizon=16),
+    data=LeRobotDROIDDataConfig(
+        repo_id="zhuoKCL/prgvla_sorting",          # 1) replaces the DROID LeRobot repo id
+        base_config=DataConfig(prompt_from_task=True),
+        # 2) remove or comment out any AssetsConfig(asset_id="droid", ...) line,
+        #    so the DROID norm stats are not used
+    ),
+    weight_loader=weight_loaders.CheckpointWeightLoader(
+        "<path-to-this-repo>/params"               # local path, e.g. after `huggingface-cli download`
+    ),
+    num_train_steps=60_000,
+),
+```
+Then run inference the same way as you would with the upstream `pi05_droid_finetune` config.
+## Inference contract (observation dict)
+```python
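+import numpy as np
+
+H, W = 224, 224  # placeholder image size (an assumption); use your camera resolution
+obs = {
+    "observation/joint_position": np.zeros(7),                              # shape (7,)
+    "observation/gripper_position": 0.0,                                    # scalar float, becomes 1-D
+    "observation/exterior_image_1_left": np.zeros((H, W, 3), np.uint8),     # → base_0_rgb
+    "observation/wrist_image_left": np.zeros((H, W, 3), np.uint8),          # → left_wrist_0_rgb
+    "prompt": "<natural language sentence>",
+}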
+```
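+A mis-shaped observation usually fails deep inside the policy, so it can help to check the dict before sending it. A minimal sketch, assuming only the keys, shapes and dtypes listed above; `check_obs` and `IMAGE_KEYS` are hypothetical names, not part of `openpi`.
+```python
+import numpy as np
+
+IMAGE_KEYS = ("observation/exterior_image_1_left", "observation/wrist_image_left")
+
+def check_obs(obs: dict) -> None:
+    """Raise ValueError if obs does not match the contract above (hypothetical helper)."""
+    joints = np.asarray(obs["observation/joint_position"])
+    if joints.shape != (7,):
+        raise ValueError(f"joint_position must have shape (7,), got {joints.shape}")
+    gripper = np.asarray(obs["observation/gripper_position"])
+    if gripper.size != 1:
+        raise ValueError(f"gripper_position must be a scalar, got shape {gripper.shape}")
+    for key in IMAGE_KEYS:
+        img = np.asarray(obs[key])
+        if img.dtype != np.uint8 or img.ndim != 3 or img.shape[-1] != 3:
+            raise ValueError(f"{key} must be uint8 with shape (H, W, 3), got {img.dtype} {img.shape}")
+    prompt = obs["prompt"]
+    if not isinstance(prompt, str) or not prompt.strip():
+        raise ValueError("prompt must be a non-empty string")
+```
+Calling `check_obs(obs)` once per control step is cheap next to a forward pass and turns shape bugs into readable errors.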
+## Training data
+[`zhuoKCL/prgvla_sorting`](https://huggingface.co/datasets/zhuoKCL/prgvla_sorting) (LeRobot v2.1).