| --- |
| license: apache-2.0 |
| tags: |
| - mode-classifier |
| - real-robot |
| - pi05 |
| - vla |
| - mode-editing |
| --- |
| |
| # real_robot_v4action_classifiers |
| |
| v3-style HiddenStateClassifier checkpoints (`hidden + action + progress -> P(target)`) |
| for real-robot pk (Pass Knife) and pb (Push Block) tasks. Used as the cg_loss target |
| in `action_v4` mode editing on pi0.5 VLA, and as a sidecar inference logger via |
| `scripts/serve_policy_with_classifier.py` (in [openpi](https://github.com/haohww/behavior-uncloning)). |
|
|
| All classifiers were trained with the v4action defaults (`lam_grad=1.0, lam_corr=2.0, |
| grad_min_thresh=1.5`), which makes them substantially more action-aware than the original v3 |
| classifiers. See `docs/real_robot_v2.md` §9.8–9.10 in `behavior-uncloning` for full |
| probe results. |
|
|
| ## Files |
|
|
| | file | task | semantics | a=0 ablation acc | grad ratio | |
| |---|---|---|---|---| |
| | `pk_left.pt` | pk | binary: 1 if mode==left, else 0 | 24.3% | 0.62 | |
| | `pk_right.pt` | pk | binary: 1 if mode==right, else 0 | 0.2% | 0.70 | |
| | `pk_sharp.pt` | pk | binary: 1 if mode==sharp, else 0 | 8.2% | 0.65 | |
| | `pk_keepLR.pt` | pk | binary: 1 if mode in {left,right}, else 0 (synthetic) | n/a | n/a | |
| | `pb_from_left.pt` | pb | binary: 1 if mode==from_left, else 0 | 15.4% | 0.50 | |
| | `pb_from_right.pt` | pb | binary: 1 if mode==from_right, else 0 | 75.2% | 0.53 | |
|
|
| (a=0 ablation acc and grad ratio both measure action sensitivity: a lower a=0 acc |
| and a higher grad ratio indicate a stronger steering signal.) |
|
|
| ## Architecture (mirrors `train_hidden_classifier_v4.py:V3Classifier`) |
| |
| ``` |
| hidden (2048-D) → hidden_proj (2048→256, LayerNorm, ReLU) ─┐ |
| action (8d × 10 steps = 80-D)                             ─┤ concat (400-D) |
| progress (sinusoidal embed, 64-D)                         ─┘ |
|   → backbone (400→512, LayerNorm, ReLU, →256, LayerNorm, ReLU) |
|   → dropout(0.5) (eval=identity) |
|   → head (256 → 1) → sigmoid → P(target | h, a, p) |
| ``` |
| |
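The input widths in the diagram can be sanity-checked without PyTorch. The sketch below is illustrative only: `sinusoidal_embed` and `build_backbone_input` are hypothetical names, and the frequency schedule is the common Transformer-style one, assumed here because this card does not specify the exact form used by `V3Classifier`.

```python
import math

HIDDEN_PROJ_DIM = 256              # output width of hidden_proj (from the diagram)
ACTION_STEPS, ACTION_DIM = 10, 8   # 10 steps x 8-D actions = 80-D
PROGRESS_DIM = 64                  # width of the progress embedding


def sinusoidal_embed(progress: float, dim: int = PROGRESS_DIM) -> list[float]:
    """Sin/cos embedding of a scalar (assumed form, not taken from the repo)."""
    half = dim // 2
    # Geometric frequency ladder starting at 1.0 (an assumption).
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    sines = [math.sin(progress * f) for f in freqs]
    cosines = [math.cos(progress * f) for f in freqs]
    return sines + cosines


def build_backbone_input(hidden_proj, action_chunk, progress):
    """Concatenate projected hidden state, flattened actions, and progress embedding."""
    flat_action = [x for step in action_chunk for x in step]  # 10 x 8 -> 80
    return list(hidden_proj) + flat_action + sinusoidal_embed(progress)


# Dummy inputs with the shapes given in the diagram.
hidden_proj = [0.0] * HIDDEN_PROJ_DIM
action_chunk = [[0.0] * ACTION_DIM for _ in range(ACTION_STEPS)]
features = build_backbone_input(hidden_proj, action_chunk, progress=0.5)
print(len(features))  # 400
```

The 400-D backbone input is 256 + 80 + 64, which is why `action_dim=80` is passed when constructing the classifier.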
| All checkpoints are PyTorch `state_dict`s. Load with: |
| |
| ```python |
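| import torch |
| from openpi_serve_policy_with_classifier_helper import HiddenStateClassifier |
| |
| # Checkpoints are plain state_dicts: build the module first, then load the weights. |
| sd = torch.load("pk_left.pt", map_location="cpu") |
| clf = HiddenStateClassifier(action_dim=80)  # 8-D action x 10 steps |
| clf.load_state_dict(sd) |
| clf.eval()  # eval mode turns dropout(0.5) into an identity |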
| ``` |
| |
| Alternatively, `scripts/serve_policy_with_classifier.py` loads them automatically. |
| |
| ## Usage in serve_policy_with_classifier.py |
| |
| ```bash |
| huggingface-cli download haohw/real_robot_v4action_classifiers \ |
| --local-dir ~/v4action_classifiers |
|
|
| uv run scripts/serve_policy_with_classifier.py \ |
| policy:checkpoint --policy.config pi05_real_pk_mixed --policy.dir <edited-ckpt> \ |
| --classifier-name=left --classifier-ckpt=$HOME/v4action_classifiers/pk_left.pt \ |
| --classifier-name=right --classifier-ckpt=$HOME/v4action_classifiers/pk_right.pt \ |
| --classifier-name=sharp --classifier-ckpt=$HOME/v4action_classifiers/pk_sharp.pt |
| ``` |
| |
| Each inference call prints one log entry: |
| ``` |
| [classifier @ infer #42] P(left)=0.034 | P(right)=0.812 | P(sharp)=0.154 |
| argmax=right margin=0.658 infer_ms=187.3 |
| ``` |
| |
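The `argmax` and `margin` fields in the sample entry are consistent with taking the highest-probability classifier and subtracting the runner-up probability (0.812 - 0.154 = 0.658). That reading is inferred from the numbers above, not from the script itself, and `summarize` is a hypothetical helper. A minimal sketch under that assumption:

```python
def summarize(probs: dict[str, float]) -> tuple[str, float]:
    """Return (top classifier name, top-1 minus top-2 probability).

    Assumes at least two classifiers are attached.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    (best, p1), (_, p2) = ranked[0], ranked[1]
    return best, p1 - p2


# Probabilities copied from the sample log entry.
name, margin = summarize({"left": 0.034, "right": 0.812, "sharp": 0.154})
print(f"argmax={name} margin={margin:.3f}")  # argmax=right margin=0.658
```

Under this reading, a small margin means the top two classifiers score the rollout step similarly, so the reported `argmax` is less decisive.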