Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,73 @@
---
license: apache-2.0
tags:
- robotics
- vla
- openpi
- pi0.5
- franka
library_name: openpi
---

# pi0.5 fine-tuned on `zhuoKCL/prgvla_sorting`

Single-task pi0.5 (JAX) fine-tune for **vegetable sorting** on a Franka.
Trained from `gs://openpi-assets/checkpoints/pi05_droid/params` for 60 000 steps using `openpi`'s `pi05_droid_finetune` recipe, with normalization statistics that we recomputed ourselves.

- Step: **59 999** (final)
- Final loss: ~0.001 (flow-matching MSE)
- Action: `(horizon=16, dim=32)`, the pi0.5 standard
- State: 8-dim Franka (joint_position 7 + gripper 1)
- Cameras (from DROID layout):
  - `base_0_rgb` ← `ext_1`
  - `left_wrist_0_rgb` ← `wrist`
  - `right_wrist_0_rgb` ← zeros (mask=False)
- Prompt: per-episode natural-language synonym from `tasks.jsonl` (no fixed phrase)
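
The camera mapping in the list above can be sketched as follows. This is an illustration only, not the code `openpi` runs: the function name `to_model_images` and the 224×224 resolution are assumptions made for the example.

```python
import numpy as np


def to_model_images(ext_1: np.ndarray, wrist: np.ndarray):
    """Map the two available cameras onto the three pi0.5 image slots.

    Hypothetical helper for illustration; it follows the mapping listed above.
    """
    images = {
        "base_0_rgb": ext_1,
        "left_wrist_0_rgb": wrist,
        # There is no second wrist camera, so this slot is filled with zeros.
        "right_wrist_0_rgb": np.zeros_like(ext_1),
    }
    # The mask marks which slots carry a real image.
    image_masks = {
        "base_0_rgb": True,
        "left_wrist_0_rgb": True,
        "right_wrist_0_rgb": False,
    }
    return images, image_masks


# Placeholder frames; the resolution is an assumption for the example.
ext_1 = np.full((224, 224, 3), 127, dtype=np.uint8)
wrist = np.full((224, 224, 3), 64, dtype=np.uint8)
images, masks = to_model_images(ext_1, wrist)
```

The masked slot keeps the input layout identical to the three-camera pi0.5 format while signalling that the third image carries no information.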

## Files

| Path | Purpose |
|---|---|
| `params/` | Orbax checkpoint, JAX params (12 GB) |
| `assets/zhuoKCL/prgvla_sorting/norm_stats.json` | q01/q99 quantile norm stats (pi05 standard) |
| `norm_stats.json` | same file copied to root for quick inspection |

`train_state/` (optimizer state, ~30 GB) is **not** included, because inference does not need it.
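
For readers unfamiliar with quantile normalization, here is a minimal sketch of how q01/q99 statistics are typically applied. It assumes the common convention that the interval [q01, q99] is mapped to roughly [-1, 1]. The function names are hypothetical and the numbers are made up for illustration; they are not read from this repo's `norm_stats.json`.

```python
def normalize_quantile(x, q01, q99, eps=1e-6):
    """Scale each dimension so that [q01, q99] maps to roughly [-1, 1]."""
    return [
        (xi - lo) / (hi - lo + eps) * 2.0 - 1.0
        for xi, lo, hi in zip(x, q01, q99)
    ]


def unnormalize_quantile(y, q01, q99, eps=1e-6):
    """Inverse of normalize_quantile: map model outputs back to raw units."""
    return [
        (yi + 1.0) / 2.0 * (hi - lo + eps) + lo
        for yi, lo, hi in zip(y, q01, q99)
    ]


# Made-up two-dimensional statistics, for illustration only.
q01 = [-1.0, 0.0]
q99 = [1.0, 0.5]
x = [0.0, 0.5]

y = normalize_quantile(x, q01, q99)          # close to [0.0, 1.0]
x_back = unnormalize_quantile(y, q01, q99)   # recovers x up to rounding
```

Quantiles are less sensitive to outliers than mean/std scaling, at the cost that values outside [q01, q99] land slightly outside [-1, 1].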

## Use it from `openpi`

In your local copy of `openpi`, edit the `pi05_droid_finetune` entry in `src/openpi/training/config.py`:

```python
# Entry inside the list of configs in src/openpi/training/config.py;
# the names used below should already be imported or defined in that file.
TrainConfig(
    name="pi05_droid_finetune",
    model=pi0_config.Pi0Config(pi05=True, action_dim=32, action_horizon=16),
    data=LeRobotDROIDDataConfig(
        repo_id="zhuoKCL/prgvla_sorting",  # 1) was: lerobot's droid repo
        base_config=DataConfig(prompt_from_task=True),
        # 2) remove / comment out any AssetsConfig(asset_id="droid", ...) line
    ),
    weight_loader=weight_loaders.CheckpointWeightLoader(
        "<path-to-this-repo>/params"  # or huggingface-cli download
    ),
    num_train_steps=60_000,
),
```
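
As an alternative to `huggingface-cli download`, the files can be fetched from Python. The sketch below uses `snapshot_download` from `huggingface_hub`. The repo id is left as a placeholder because this README does not state it, and `download_checkpoint` and `ALLOW_PATTERNS` are hypothetical names introduced for the example.

```python
# Only the files listed in the "Files" table are requested.
ALLOW_PATTERNS = ["params/*", "assets/*", "norm_stats.json"]


def download_checkpoint(repo_id: str, local_dir: str) -> str:
    """Download the checkpoint files and return the local directory path."""
    # Imported lazily so this module can be imported without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        local_dir=local_dir,
        allow_patterns=ALLOW_PATTERNS,
    )


# Usage (placeholder repo id, replace with the id of this model repo):
# ckpt_dir = download_checkpoint("<this-repo-id>", "./prgvla_sorting_ckpt")
```

The returned directory then contains `params/`, which is the path the `CheckpointWeightLoader` above expects.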

Then run inference the same way as with an upstream `openpi` checkpoint.
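
For reference, a sketch of what upstream-style inference looks like, based on the policy-loading pattern shown in the `openpi` README (`get_config`, `create_trained_policy`, `infer`). It assumes that pattern applies unchanged to this checkpoint and that your `openpi` version matches; `load_policy` is a hypothetical wrapper, and the path is a placeholder.

```python
def load_policy(checkpoint_dir: str, config_name: str = "pi05_droid_finetune"):
    """Build a policy from a local checkpoint directory.

    checkpoint_dir is assumed to be the directory that contains `params/`
    and `assets/`, i.e. the root of a local copy of this repo.
    """
    # Imported lazily: openpi is only required when a policy is actually built.
    from openpi.policies import policy_config
    from openpi.training import config as _config

    train_config = _config.get_config(config_name)
    return policy_config.create_trained_policy(train_config, checkpoint_dir)


# Usage (placeholder path; `obs` follows the observation dict described below):
# policy = load_policy("<path-to-this-repo>")
# action_chunk = policy.infer(obs)["actions"]
```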

## Inference contract (observation dict)

```python
obs = {
    "observation/joint_position": np.ndarray(7,),
    "observation/gripper_position": float,  # scalar, becomes 1d
    "observation/exterior_image_1_left": uint8 H×W×3,  # → base_0_rgb
    "observation/wrist_image_left": uint8 H×W×3,  # → left_wrist_0_rgb
    "prompt": "<natural language sentence>",
}
```
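
The schema above is written as type annotations rather than runnable code. A concrete instance might look like the sketch below, which is useful for smoke-testing a policy before connecting real sensors. The `float32` dtype, the 224×224 resolution, the example prompt, and the helper names `make_dummy_obs` and `check_obs` are all assumptions for illustration.

```python
import numpy as np

IMAGE_KEYS = (
    "observation/exterior_image_1_left",
    "observation/wrist_image_left",
)


def make_dummy_obs(prompt: str, height: int = 224, width: int = 224) -> dict:
    """Build an all-zeros observation with the keys listed in the contract."""
    return {
        "observation/joint_position": np.zeros(7, dtype=np.float32),
        "observation/gripper_position": 0.0,
        "observation/exterior_image_1_left": np.zeros((height, width, 3), dtype=np.uint8),
        "observation/wrist_image_left": np.zeros((height, width, 3), dtype=np.uint8),
        "prompt": prompt,
    }


def check_obs(obs: dict) -> None:
    """Raise ValueError if the observation does not match the contract."""
    if np.shape(obs["observation/joint_position"]) != (7,):
        raise ValueError("joint_position must have shape (7,)")
    if np.ndim(obs["observation/gripper_position"]) != 0:
        raise ValueError("gripper_position must be a scalar")
    for key in IMAGE_KEYS:
        img = np.asarray(obs[key])
        if img.dtype != np.uint8 or img.ndim != 3 or img.shape[-1] != 3:
            raise ValueError(f"{key} must be a uint8 array of shape (H, W, 3)")
    if not isinstance(obs["prompt"], str):
        raise ValueError("prompt must be a string")


# The prompt text is a made-up example, not taken from tasks.jsonl.
obs = make_dummy_obs("sort the vegetables")
check_obs(obs)
```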

## Training data

[`zhuoKCL/prgvla_sorting`](https://huggingface.co/datasets/zhuoKCL/prgvla_sorting) (LeRobot v2.1).