real_robot_v4action_classifiers
v3-style HiddenStateClassifier checkpoints (hidden + action + progress -> P(target))
for real-robot pk (Pass Knife) and pb (Push Block) tasks. Used as the cg_loss target
in action_v4 mode editing on pi0.5 VLA, and as a sidecar inference logger via
scripts/serve_policy_with_classifier.py (in openpi).
All classifiers were trained with the v4action defaults (lam_grad=1.0, lam_corr=2.0, grad_min_thresh=1.5), which makes them substantially more action-aware than the original v3
classifiers. See docs/real_robot_v2.md §9.8–9.10 in behavior-uncloning for full
probe results.
Files
| file | task | semantics | a=0 ablation acc | grad ratio |
|---|---|---|---|---|
| pk_left.pt | pk | binary: 1 if mode==left, else 0 | 24.3% | 0.62 |
| pk_right.pt | pk | binary: 1 if mode==right, else 0 | 0.2% | 0.70 |
| pk_sharp.pt | pk | binary: 1 if mode==sharp, else 0 | 8.2% | 0.65 |
| pk_keepLR.pt | pk | binary: 1 if mode in {left,right}, else 0 (synthetic) | n/a | n/a |
| pb_from_left.pt | pb | binary: 1 if mode==from_left, else 0 | 15.4% | 0.50 |
| pb_from_right.pt | pb | binary: 1 if mode==from_right, else 0 | 75.2% | 0.53 |
(a=0 ablation acc and grad ratio are measures of action-sensitivity. Lower a=0 acc and higher grad ratio = stronger steering signal.)
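
As a rough illustration of the first metric, the sketch below computes accuracy after every action vector is replaced with zeros. It assumes that "a=0 ablation" means exactly that. `predict_fn`, its argument order, and the 0.5 decision threshold are hypothetical stand-ins, not the probe code from behavior-uncloning. The grad ratio is not sketched because its definition is not given here.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def a0_ablation_accuracy(
    predict_fn: Callable[[Vector, Vector, float], float],
    hiddens: Sequence[Vector],
    actions: Sequence[Vector],
    progresses: Sequence[float],
    labels: Sequence[int],
    threshold: float = 0.5,  # assumed decision threshold
) -> float:
    """Accuracy of predict_fn when every action vector is replaced by zeros."""
    correct = 0
    for h, a, p, y in zip(hiddens, actions, progresses, labels):
        zero_action = [0.0] * len(a)  # ablate the action input
        prob = predict_fn(h, zero_action, p)
        correct += int((prob >= threshold) == bool(y))
    return correct / len(labels)


# Toy classifier (hypothetical) that looks only at the action,
# so removing the action should hurt it.
def toy_predict(h: Vector, a: Vector, p: float) -> float:
    return 1.0 if sum(a) > 0 else 0.0


acc = a0_ablation_accuracy(
    toy_predict,
    hiddens=[[0.0], [0.0]],
    actions=[[1.0, 1.0], [-1.0, -1.0]],
    progresses=[0.1, 0.9],
    labels=[1, 0],
)
```

A classifier that ignores the action would score the same with and without the ablation, which is why a low a=0 accuracy is read as evidence of action sensitivity.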
Architecture (mirrors train_hidden_classifier_v4.py:V3Classifier)
    hidden (2048-D) → hidden_proj (2048→256, LayerNorm, ReLU) ─┐
    action (8d × 10 steps = 80-D) ─────────────────────────────┤ concat (400-D)
    progress (sinusoidal embed, 64-D) ─────────────────────────┘
        → backbone (400→512, LayerNorm, ReLU, →256, LayerNorm, ReLU)
        → dropout(0.5) (eval=identity)
        → head (256 → 1) → sigmoid → P(target | h, a, p)
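
The progress branch and the concat width can be sketched in plain Python. This is a minimal sketch, assuming a Transformer-style sinusoidal embedding (geometric frequencies, sine half followed by cosine half, `max_period=10000`). The actual frequency schedule in train_hidden_classifier_v4.py may differ, and `sinusoidal_embed` is a hypothetical name.

```python
import math
from typing import List


def sinusoidal_embed(progress: float, dim: int = 64, max_period: float = 10000.0) -> List[float]:
    """Embed a scalar as [sin(p * f_i)] + [cos(p * f_i)] for dim // 2 frequencies."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Geometric frequency schedule (assumed, as in Transformer position encodings).
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    sines = [math.sin(progress * f) for f in freqs]
    cosines = [math.cos(progress * f) for f in freqs]
    return sines + cosines


HIDDEN_PROJ_DIM = 256  # output width of hidden_proj
ACTION_DIM = 8 * 10    # 8-D action x 10 steps, flattened
PROGRESS_DIM = 64      # sinusoidal embedding width

emb = sinusoidal_embed(0.5, dim=PROGRESS_DIM)
# Width of the concatenated backbone input: 256 + 80 + 64 = 400.
concat_dim = HIDDEN_PROJ_DIM + ACTION_DIM + len(emb)
```

The three branch widths add up to the 400-D backbone input shown in the diagram.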
All checkpoints are PyTorch state_dicts. Load with:
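    import torch
    from openpi_serve_policy_with_classifier_helper import HiddenStateClassifier

    # Load the raw state_dict on CPU, then restore it into a fresh classifier.
    sd = torch.load("pk_left.pt", map_location="cpu")
    clf = HiddenStateClassifier(action_dim=80)  # 8-D action x 10 steps
    clf.load_state_dict(sd)
    clf.eval()  # disables dropout for inference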
(Alternatively, use scripts/serve_policy_with_classifier.py directly, which loads them automatically.)
Usage in serve_policy_with_classifier.py
    huggingface-cli download haohw/real_robot_v4action_classifiers \
        --local-dir ~/v4action_classifiers

    uv run scripts/serve_policy_with_classifier.py \
        policy:checkpoint --policy.config pi05_real_pk_mixed --policy.dir <edited-ckpt> \
        --classifier-name=left --classifier-ckpt=$HOME/v4action_classifiers/pk_left.pt \
        --classifier-name=right --classifier-ckpt=$HOME/v4action_classifiers/pk_right.pt \
        --classifier-name=sharp --classifier-ckpt=$HOME/v4action_classifiers/pk_sharp.pt
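
The command repeats one `--classifier-name` / `--classifier-ckpt` pair per classifier. If the launch is scripted, the argument list could be assembled as in the sketch below. `build_serve_command` is a hypothetical helper, not part of openpi. It only re-emits the flags shown above and assumes the script pairs each name with the checkpoint that follows it.

```python
import os
from typing import Dict, List


def build_serve_command(policy_config: str, policy_dir: str, classifiers: Dict[str, str]) -> List[str]:
    """Build the argv for serve_policy_with_classifier.py, one name/ckpt pair per classifier."""
    cmd = [
        "uv", "run", "scripts/serve_policy_with_classifier.py",
        "policy:checkpoint",
        "--policy.config", policy_config,
        "--policy.dir", policy_dir,
    ]
    for name, ckpt in classifiers.items():
        # Keep each name immediately followed by its checkpoint.
        cmd += [f"--classifier-name={name}", f"--classifier-ckpt={ckpt}"]
    return cmd


ckpt_dir = os.path.expanduser("~/v4action_classifiers")
cmd = build_serve_command(
    "pi05_real_pk_mixed",
    "<edited-ckpt>",  # placeholder, as in the command above
    {mode: os.path.join(ckpt_dir, f"pk_{mode}.pt") for mode in ("left", "right", "sharp")},
)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the openpi repository root.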
Each inference call will print one log line:
    [classifier @ infer #42] P(left)=0.034 | P(right)=0.812 | P(sharp)=0.154
    argmax=right margin=0.658 infer_ms=187.3
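
For offline analysis of these logs, the sketch below parses the `P(name)=value` fields and recomputes the summary. It assumes that `margin` is the top probability minus the runner-up. That is inferred from the sample line (0.812 - 0.154 = 0.658), not confirmed by the script. `parse_probs` and `argmax_and_margin` are hypothetical helper names.

```python
import re
from typing import Dict, Tuple

LOG = "[classifier @ infer #42] P(left)=0.034 | P(right)=0.812 | P(sharp)=0.154"


def parse_probs(line: str) -> Dict[str, float]:
    """Extract {name: probability} from the 'P(name)=value' fields of a log line."""
    return {name: float(val) for name, val in re.findall(r"P\((\w+)\)=([0-9.]+)", line)}


def argmax_and_margin(probs: Dict[str, float]) -> Tuple[str, float]:
    """Return the top-scoring name and its gap to the runner-up (assumed margin definition)."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    (top_name, top_p), (_, second_p) = ranked[0], ranked[1]
    return top_name, top_p - second_p


probs = parse_probs(LOG)
name, margin = argmax_and_margin(probs)
print(f"argmax={name} margin={margin:.3f}")  # prints: argmax=right margin=0.658
```

Because each classifier is an independent binary head, the probabilities need not sum to 1, so the margin is a relative ranking signal rather than a calibrated confidence.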