	---
	license: apache-2.0
	tags:
	- mode-classifier
	- real-robot
	- pi05
	- vla
	- mode-editing
	---

	# real_robot_v4action_classifiers

	v3-style HiddenStateClassifier checkpoints (`hidden + action + progress -> P(target)`)
	for the real-robot pk (Pass Knife) and pb (Push Block) tasks. They serve as the cg_loss
	target in `action_v4` mode editing on the pi0.5 VLA, and as a sidecar inference logger via
	`scripts/serve_policy_with_classifier.py` (in [openpi](https://github.com/haohww/behavior-uncloning)).

	All classifiers were trained with the v4action defaults (`lam_grad=1.0, lam_corr=2.0,
	grad_min_thresh=1.5`), which makes them substantially more action-aware than the original
	v3 classifiers. See `docs/real_robot_v2.md` §9.8–9.10 in `behavior-uncloning` for full
	probe results.

	## Files

	| file | task | semantics | a=0 ablation acc | grad ratio |
	|---|---|---|---|---|
	| `pk_left.pt` | pk | binary: 1 if mode==left, else 0 | 24.3% | 0.62 |
	| `pk_right.pt` | pk | binary: 1 if mode==right, else 0 | 0.2% | 0.70 |
	| `pk_sharp.pt` | pk | binary: 1 if mode==sharp, else 0 | 8.2% | 0.65 |
	| `pk_keepLR.pt` | pk | binary: 1 if mode in {left,right}, else 0 (synthetic) | n/a | n/a |
	| `pb_from_left.pt` | pb | binary: 1 if mode==from_left, else 0 | 15.4% | 0.50 |
	| `pb_from_right.pt` | pb | binary: 1 if mode==from_right, else 0 | 75.2% | 0.53 |

	`a=0 ablation acc` and `grad ratio` both measure action sensitivity: a lower a=0 accuracy
	and a higher grad ratio indicate a stronger steering signal.
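
This card does not define either metric. As a rough illustration only, the sketch below assumes that `a=0 ablation acc` is the classification accuracy obtained when every action input is replaced by zeros. The function `ablation_accuracy`, the two toy predictors, and the sample data are all hypothetical and are not taken from `behavior-uncloning`.

```python
from typing import Callable, List, Sequence, Tuple

# (hidden, action, progress, label) with label in {0, 1}; toy shapes only.
Sample = Tuple[Sequence[float], Sequence[float], float, int]
Predictor = Callable[[Sequence[float], Sequence[float], float], float]


def ablation_accuracy(predict_proba: Predictor, samples: List[Sample],
                      threshold: float = 0.5) -> float:
    """Accuracy with the action zeroed out (ASSUMED meaning of 'a=0 ablation')."""
    correct = 0
    for hidden, action, progress, label in samples:
        zero_action = [0.0] * len(action)  # ablate the action input
        pred = 1 if predict_proba(hidden, zero_action, progress) >= threshold else 0
        correct += int(pred == label)
    return correct / len(samples)


# Two hypothetical predictors: one reads only the action, one only the hidden state.
def action_driven(hidden, action, progress):
    return 1.0 if sum(action) > 0 else 0.0


def hidden_driven(hidden, action, progress):
    return 1.0 if sum(hidden) > 0 else 0.0


samples = [
    ([1.0], [0.5, 0.5], 0.1, 1),
    ([1.0], [0.2, 0.9], 0.6, 1),
    ([-1.0], [-0.5, -0.1], 0.3, 0),
    ([-1.0], [-0.7, -0.2], 0.8, 0),
]

acc_action = ablation_accuracy(action_driven, samples)  # loses its signal
acc_hidden = ablation_accuracy(hidden_driven, samples)  # unaffected by a=0
print(acc_action, acc_hidden)
```

Under this assumed definition, the action-driven toy predictor drops in accuracy once the action is zeroed, while the hidden-driven one is unaffected, which is the sense in which a lower a=0 accuracy suggests stronger reliance on the action.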

	## Architecture (mirrors `train_hidden_classifier_v4.py:V3Classifier`)

	```
	hidden   (2048-D) → hidden_proj (2048→256, LayerNorm, ReLU) ─┐
	action   (8d × 10 steps = 80-D)                              ├─ concat (400-D)
	progress (sinusoidal embed, 64-D)                           ─┘
	→ backbone (400→512, LayerNorm, ReLU, →256, LayerNorm, ReLU)
	→ dropout(0.5) (eval=identity)
	→ head (256 → 1) → sigmoid → P(target | h, a, p)
	```
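
The diagram above can be sketched as a small PyTorch module. This is a hypothetical re-creation for reading purposes, not the repository's `V3Classifier`: the class name `SketchClassifier`, its attribute names, and the exact form of the sinusoidal progress embedding are assumptions, so its `state_dict` keys will most likely not match the checkpoints in this repo. The concat width follows from the diagram: 256 (projected hidden) + 80 (action) + 64 (progress) = 400.

```python
import math

import torch
from torch import nn


class SketchClassifier(nn.Module):
    """Hypothetical sketch of the diagram; NOT the repository's implementation."""

    def __init__(self, hidden_dim: int = 2048, action_dim: int = 80,
                 progress_dim: int = 64):
        super().__init__()
        self.progress_dim = progress_dim
        self.hidden_proj = nn.Sequential(
            nn.Linear(hidden_dim, 256), nn.LayerNorm(256), nn.ReLU()
        )
        concat_dim = 256 + action_dim + progress_dim  # 400 with the defaults
        self.backbone = nn.Sequential(
            nn.Linear(concat_dim, 512), nn.LayerNorm(512), nn.ReLU(),
            nn.Linear(512, 256), nn.LayerNorm(256), nn.ReLU(),
        )
        self.dropout = nn.Dropout(0.5)  # identity in eval mode
        self.head = nn.Linear(256, 1)

    def embed_progress(self, progress: torch.Tensor) -> torch.Tensor:
        # ASSUMPTION: transformer-style embedding, half sine and half cosine.
        half = self.progress_dim // 2
        idx = torch.arange(half, dtype=progress.dtype, device=progress.device)
        freqs = torch.exp(-math.log(10000.0) * idx / half)
        angles = progress[:, None] * freqs[None, :]
        return torch.cat([angles.sin(), angles.cos()], dim=-1)

    def forward(self, hidden, action, progress):
        x = torch.cat(
            [self.hidden_proj(hidden), action, self.embed_progress(progress)],
            dim=-1,
        )
        x = self.dropout(self.backbone(x))
        return torch.sigmoid(self.head(x)).squeeze(-1)  # P(target | h, a, p)


# Shape check with random inputs: batch of 4, 8-D actions over 10 steps.
torch.manual_seed(0)
clf = SketchClassifier().eval()
with torch.no_grad():
    p = clf(torch.randn(4, 2048), torch.randn(4, 80), torch.rand(4))
print(p.shape)
```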

	All checkpoints are PyTorch `state_dict`s. Load with:

	```python
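	import torch
	from openpi_serve_policy_with_classifier_helper import HiddenStateClassifier

	# Each checkpoint is a plain state_dict; load it onto the CPU first.
	sd = torch.load("pk_left.pt", map_location="cpu")
	# action_dim=80 matches the 8-D × 10-step action chunk.
	clf = HiddenStateClassifier(action_dim=80)
	clf.load_state_dict(sd)
	clf.eval()  # disables dropout for inference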
	```

	Alternatively, run `scripts/serve_policy_with_classifier.py` directly, which loads them automatically.

	## Usage in serve_policy_with_classifier.py

	```bash
	huggingface-cli download haohw/real_robot_v4action_classifiers \
	--local-dir ~/v4action_classifiers

	uv run scripts/serve_policy_with_classifier.py \
	policy:checkpoint --policy.config pi05_real_pk_mixed --policy.dir <edited-ckpt> \
	--classifier-name=left --classifier-ckpt=$HOME/v4action_classifiers/pk_left.pt \
	--classifier-name=right --classifier-ckpt=$HOME/v4action_classifiers/pk_right.pt \
	--classifier-name=sharp --classifier-ckpt=$HOME/v4action_classifiers/pk_sharp.pt
	```

	Each inference call prints one log entry:
	```
	[classifier @ infer #42] P(left)=0.034 | P(right)=0.812 | P(sharp)=0.154
	argmax=right margin=0.658 infer_ms=187.3
	```
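
For offline analysis, a log entry in the format shown above can be parsed with a couple of regular expressions. The sketch below is an illustration rather than part of `serve_policy_with_classifier.py`. The helper names `parse_classifier_log` and `top_two_margin` are hypothetical. Reading `margin` as the top probability minus the runner-up is an inference from the sample numbers (0.812 - 0.154 = 0.658), not something the card documents.

```python
import re
from typing import Dict, Tuple


def parse_classifier_log(entry: str) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Split one log entry into per-classifier probabilities and summary fields."""
    probs = {
        name: float(value)
        for name, value in re.findall(r"P\((\w+)\)=([0-9.]+)", entry)
    }
    fields = dict(re.findall(r"(argmax|margin|infer_ms)=(\S+)", entry))
    return probs, fields


def top_two_margin(probs: Dict[str, float]) -> Tuple[str, float]:
    """Return the most likely name and (top-1 minus top-2), rounded to 3 places."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    (best, p1), (_, p2) = ranked[0], ranked[1]
    return best, round(p1 - p2, 3)


# The sample entry from this card, joined into a single string.
entry = (
    "[classifier @ infer #42] P(left)=0.034 | P(right)=0.812 | P(sharp)=0.154 "
    "argmax=right margin=0.658 infer_ms=187.3"
)
probs, fields = parse_classifier_log(entry)
best, margin = top_two_margin(probs)
print(best, margin, fields["infer_ms"])
```

Recomputing `argmax` and `margin` from the parsed probabilities gives a quick consistency check on logged entries.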