Commit 6a98371 by Student Watery (verified) · 1 parent: c20eb85

Create README.md

Files changed (1)
  1. README.md +93 -0
README.md ADDED
@@ -0,0 +1,93 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ tags:
+ - robotics
+ - vla
+ - pi05
+ - subtask
+ - openpi
+ - lerobot
+ - orbax
+ datasets:
+ - physical-intelligence/libero
+ pipeline_tag: robotics
+ ---
+
+ # pi0.5 subtask fine-tune
+
+ A 100-step fine-tune of `pi05_base` for the subtask generation described in the original [pi0.5 paper](https://www.pi.website/download/pi05.pdf).
+ We followed the steps from an openpi community issue thread that studies this, [#701](https://github.com/Physical-Intelligence/openpi/issues/701).
+
+ ## TL;DR
+
+ - **Start weights**: `gs://openpi-assets/checkpoints/pi05_base/params`
+ - **Config**: `pi05_subtask_libero` (adds a `Pi05Subtask` head trained with a joint loss: flow matching on actions plus cross-entropy on subtask tokens)
+ - **Training**: 100 steps × batch size 8 on 30 LIBERO episodes, 1× H100 on Modal
+ - **Loss**: 3.04 → 0.23 (final)
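
As a rough illustration of what a joint objective of this kind looks like, the NumPy sketch below adds a flow-matching regression term to a cross-entropy term restricted to subtask tokens. The function names, the mask convention, and the `ce_weight` parameter are assumptions made for illustration and are not taken from the `Pi05Subtask` code; the interpolation convention in the docstring is one common flow-matching choice, assumed here.

```python
import numpy as np

def flow_matching_loss(pred_velocity, actions, noise):
    """MSE between predicted and target velocity.

    Assumed convention: x_t = t * noise + (1 - t) * actions, so the
    target velocity is d x_t / d t = noise - actions.
    """
    target_velocity = noise - actions
    return float(np.mean((pred_velocity - target_velocity) ** 2))

def masked_cross_entropy(logits, targets, mask):
    """Mean token-level CE over positions where mask == 1 (subtask tokens)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    token_nll = -np.take_along_axis(log_probs, targets[..., None], axis=-1)[..., 0]
    return float((token_nll * mask).sum() / np.maximum(mask.sum(), 1))

def joint_loss(pred_velocity, actions, noise, logits, targets, mask, ce_weight=1.0):
    """Hypothetical combination: flow-matching term + weighted CE term."""
    fm = flow_matching_loss(pred_velocity, actions, noise)
    ce = masked_cross_entropy(logits, targets, mask)
    return fm + ce_weight * ce

rng = np.random.default_rng(0)
actions = rng.normal(size=(2, 10, 7))    # (batch, horizon, action_dim)
noise = rng.normal(size=actions.shape)
pred_velocity = noise - actions          # a perfect prediction, so this term is 0
logits = rng.normal(size=(2, 5, 16))     # (batch, tokens, vocab)
targets = rng.integers(0, 16, size=(2, 5))
mask = np.array([[0, 0, 1, 1, 1], [0, 1, 1, 1, 0]], dtype=np.float64)

print(joint_loss(pred_velocity, actions, noise, logits, targets, mask))
```

Only the masked positions contribute to the cross-entropy term, so prompt tokens outside the subtask span do not affect the loss in this sketch.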
+
+ ## Loading
+
+ ```python
+ import tarfile
+ from pathlib import Path
+
+ import flax.nnx as nnx
+ import jax
+ import jax.numpy as jnp
+ from huggingface_hub import hf_hub_download
+
+ from openpi.models import model as _model
+ from openpi.models.pi0_config import Pi0Config
+
+ # 1. Download the checkpoint archive from the Hub and extract it in place.
+ tar_path = hf_hub_download("swatery/pi05-subtask", "jax/pi05_subtask.tar")
+ with tarfile.open(tar_path) as archive:
+     archive.extractall(".")
+ ckpt = Path("99")  # directory the archive extracts to (step 99)
+
+ # 2. Build a pi0.5 model, then overwrite its freshly initialized weights
+ #    with the restored bfloat16 parameters.
+ config = Pi0Config(pi05=True)
+ model = config.create(jax.random.key(0))
+ params = _model.restore_params(ckpt / "params", dtype=jnp.bfloat16)
+ nnx.update(model, nnx.State(params))
+ model.eval()  # switch to inference mode
+ ```
+
+ For end-to-end subtask generation (JIT-compiled autoregressive decoding with an ASCII vocabulary mask over PaliGemma's LM head), see the `SubtaskGenerator` implementation at `src/hosting/subtask_generator.py` in [openpi/hosting](https://github.com/Hebbian-Robotics/openpi).
+ That module loads a checkpoint like this one and generates subtasks via `.generate(prompt, images)`.
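
The idea of an ASCII vocabulary mask can be sketched independently of the model: logits for any token whose text contains non-ASCII characters are set to negative infinity before the argmax, so greedy decoding can only emit ASCII tokens. The toy vocabulary and the `next_token_logits` stand-in below are hypothetical, and this is not the `SubtaskGenerator` code.

```python
import numpy as np

# Toy vocabulary (hypothetical); two entries are deliberately non-ASCII.
vocab = ["pick", " up", " the", " bowl", "\u00e9", "\u2192", "<eos>"]
# True where the token may be emitted: ASCII-only text.
ascii_mask = np.array([tok.isascii() for tok in vocab])

def next_token_logits(prefix_ids):
    """Stand-in for the LM head: deterministic pseudo-logits per prefix length."""
    rng = np.random.default_rng(len(prefix_ids))
    return rng.normal(size=len(vocab))

def greedy_decode(max_new_tokens=8):
    """Greedy autoregressive decode that can never pick a masked token."""
    eos_id = vocab.index("<eos>")
    ids = []
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)
        masked = np.where(ascii_mask, logits, -np.inf)  # forbid non-ASCII tokens
        token_id = int(np.argmax(masked))
        if token_id == eos_id:
            break
        ids.append(token_id)
    return ids

decoded = greedy_decode()
print("".join(vocab[i] for i in decoded))
```

Only the masking step is the point here; a real decoder would apply the same mask to the model's full vocabulary at every step.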
+
+ ## Training details
+
+ | Setting | Value |
+ |---|---|
+ | Architecture | pi0.5: PaliGemma + Gemma action expert, with `Pi05Subtask` head |
+ | Loss | Flow matching (actions) + cross-entropy (subtask tokens) |
+ | Knowledge insulation | Yes: the LM backbone receives only CE gradients |
+ | Steps | 100 |
+ | Batch size | 8 (global, single device) |
+ | Optimizer | AdamW, cosine schedule, peak LR 5e-5, 10k warmup steps (only 100 steps were run, so training never leaves the warmup phase) |
+ | EMA decay | 0.999 |
+ | Precision | bfloat16 |
+ | Hardware | 1× NVIDIA H100 80GB (Modal) |
+ | Wall-clock | ~10 min training + ~5 min data/weight fetch |
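
The optimizer row is easier to read with numbers. The sketch below assumes a linear warmup from zero to the peak learning rate; the exact warmup shape and starting value used in training are not stated here, so `linear_warmup_lr` is a hypothetical helper rather than the training code.

```python
def linear_warmup_lr(step: int, peak_lr: float, warmup_steps: int) -> float:
    """Learning rate during a linear warmup from 0 to peak_lr (assumed shape)."""
    if step >= warmup_steps:
        return peak_lr  # cosine decay would take over from here; not modeled
    return peak_lr * step / warmup_steps

peak_lr = 5e-5         # peak LR from the table
warmup_steps = 10_000  # warmup length from the table
last_step = 99         # final step of the 100-step run

lr_at_end = linear_warmup_lr(last_step, peak_lr, warmup_steps)
print(f"LR at step {last_step}: {lr_at_end:.2e} ({lr_at_end / peak_lr:.2%} of peak)")
```

Under this assumption the learning rate stays below 1% of its peak for the whole run, which is worth keeping in mind when interpreting the loss drop.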
+
+ ### Data
+
+ - **Dataset**: first 30 episodes of `physical-intelligence/libero` chunk-000 (~8,294 frames)
+ - **Norm stats**: reused the precomputed full-dataset stats of `pi05_libero` from `gs://openpi-assets/checkpoints/pi05_libero/assets/`
+ - **Subtask annotation**: **identity**, i.e. `high_prompt = low_prompt = task_prompt`
+   (real hierarchical subtask annotations for LIBERO are not publicly available)
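
For scale, the numbers in this README imply that the run sees well under one pass over the 30-episode subset. The quick check below takes the approximate frame count at face value and assumes one training sample per frame, which may not match how the data loader actually samples.

```python
# Back-of-the-envelope coverage of the 30-episode subset.
steps = 100        # training steps (from this README)
batch_size = 8     # global batch size (from this README)
num_frames = 8294  # approximate frame count (from this README)

samples_seen = steps * batch_size
# Assumption: one training sample corresponds to one frame.
fraction_of_epoch = samples_seen / num_frames

print(f"samples seen: {samples_seen}")
print(f"fraction of one epoch: {fraction_of_epoch:.3f}")
```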
+
+ ## References
+
+ - https://www.pi.website/blog/pi05
+ - https://github.com/Physical-Intelligence/openpi (upstream pi0.5 implementation)
+ - https://github.com/Physical-Intelligence/openpi/issues/701 (community issue thread reproducing subtask generation)
+ - https://github.com/LisavilaLee/openpi_with_subtask (fork with training example)
+
+ ## License
+
+ - Code & fine-tuned weights: Apache 2.0 (inherited from openpi)
+ - Gemma dependency: this checkpoint is derived from Google's Gemma via PaliGemma. Usage is subject to the Gemma Terms of Use in addition to Apache 2.0.