Behavior Uncloning — KL (step 20)
VLA unlearning checkpoint: pi0.5 model with KL unlearning applied.
Results
| Metric | Value |
|---|---|
| Method | KL |
| Training Steps | 20 |
| Forget Task | "turn on the stove" (LIBERO-Goal T6) |
| Forget SR | 40% (baseline: 100%) |
| Retain SR | 95.6% (baseline: 97.8%) |
| HM | 0.81 |
Usage
# Serve with openpi
uv run scripts/serve_policy.py --env LIBERO policy:checkpoint \
--policy.config pi05_libero --policy.dir <path_to_checkpoint>
Method
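KL Minimization: L = -L_forget + γ·L_retain + γ·||L_cur - L_orig||². Minimizing L raises the loss on the forget task while keeping the retain loss low. The squared-difference term penalizes drift of the current model (L_cur) from the original model (L_orig), which anchors retain behavior to the original model.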
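
To make the arithmetic of the objective concrete, here is a minimal sketch in plain Python. It is not the training code behind this checkpoint. The function names, the treatment of L_cur and L_orig as per-sample values from the current and original models, and the example numbers (including the γ value) are all assumptions made for illustration.

```python
from typing import Sequence


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared L2 distance ||a - b||^2 between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def unlearning_objective(
    l_forget: float,
    l_retain: float,
    l_cur: Sequence[float],
    l_orig: Sequence[float],
    gamma: float,
) -> float:
    """Evaluate L = -L_forget + gamma * L_retain + gamma * ||L_cur - L_orig||^2.

    Assumption: l_cur and l_orig hold values produced by the current and the
    original (frozen) model on the same retain samples. The source does not
    specify this, so treat it as one plausible reading.
    """
    anchor = squared_l2(l_cur, l_orig)
    return -l_forget + gamma * l_retain + gamma * anchor


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show how the terms combine.
    value = unlearning_objective(
        l_forget=0.5,
        l_retain=0.25,
        l_cur=[0.5, 1.0],
        l_orig=[0.5, 0.5],
        gamma=2.0,
    )
    print(value)
```

When the current model matches the original on the retain samples, the anchor term is zero and the objective reduces to -L_forget + γ·L_retain. Any drift adds a positive penalty.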
Base model: pi0.5 LIBERO
See full report: experiment_report.md