Behavior Uncloning — KL (step 20)

VLA unlearning checkpoint: pi0.5 model with KL unlearning applied.

Results

Metric	Value
Method	KL
Training Steps	20
Forget Task	"turn on the stove" (LIBERO-Goal T6)
Forget SR	40% (baseline: 100%)
Retain SR	95.6% (baseline: 97.8%)
HM	0.81
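
As a reading aid, the success rates above can be restated as drops from baseline in percentage points. The sketch below only re-derives those differences from the table; the dictionary layout and the `drop_in_points` helper are illustrative names, not part of the checkpoint or of openpi, and it does not attempt to reproduce the HM value.

```python
# Success rates in percent, copied from the results table.
results = {
    "forget": {"baseline": 100.0, "unlearned": 40.0},
    "retain": {"baseline": 97.8, "unlearned": 95.6},
}


def drop_in_points(baseline: float, unlearned: float) -> float:
    """Drop from baseline in percentage points, rounded to one decimal."""
    return round(baseline - unlearned, 1)


# Hypothetical summary: how far each success rate fell after unlearning.
drops = {name: drop_in_points(**sr) for name, sr in results.items()}
print(drops)
```

A large drop on the forget task paired with a small drop on the retain tasks is the intended outcome of unlearning.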

# Serve with openpi
uv run scripts/serve_policy.py --env LIBERO policy:checkpoint \
    --policy.config pi05_libero --policy.dir <path_to_checkpoint>

KL Minimization: L = -L_forget + γ·L_retain + γ·||L_cur - L_orig||². The last term anchors retain behavior to the original model.
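
To show how the three terms of the objective combine, here is a minimal sketch in plain Python. It rests on assumptions the card does not state: `l_cur` and `l_orig` are treated as equal-length sequences of retain-set quantities from the current and original models, the norm is the squared Euclidean distance, and the same γ weights both the retain and anchor terms, as the formula is written. The function names are hypothetical, and this is not the training code used to produce the checkpoint.

```python
from typing import Sequence


def squared_l2_distance(cur: Sequence[float], orig: Sequence[float]) -> float:
    """Squared Euclidean distance ||cur - orig||^2 between two equal-length vectors."""
    if len(cur) != len(orig):
        raise ValueError("cur and orig must have the same length")
    return sum((c - o) ** 2 for c, o in zip(cur, orig))


def kl_unlearning_loss(
    l_forget: float,
    l_retain: float,
    l_cur: Sequence[float],
    l_orig: Sequence[float],
    gamma: float,
) -> float:
    """L = -L_forget + gamma * L_retain + gamma * ||L_cur - L_orig||^2."""
    # Anchor term: penalizes drift of the current model from the original.
    anchor = squared_l2_distance(l_cur, l_orig)
    # The forget loss enters with a negative sign, so minimizing L raises it.
    return -l_forget + gamma * l_retain + gamma * anchor


# Toy numbers chosen only for illustration.
loss = kl_unlearning_loss(
    l_forget=2.0, l_retain=0.5, l_cur=[0.5, 1.0], l_orig=[0.5, 0.5], gamma=2.0
)
print(loss)
```

In this reading, the anchor term is zero whenever the current model matches the original on the retain data, so it only penalizes drift away from the original behavior.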

