Behavior Uncloning - GD (step 20)
VLA unlearning checkpoint: the pi0.5 model with Gradient Difference (GD) unlearning applied.
Results
| Metric | Value |
|---|---|
| Method | GD |
| Training Steps | 20 |
| Forget Task | "turn on the stove" (LIBERO-Goal T6) |
| Forget SR (success rate) | 0% (baseline: 100%) |
| Retain SR | 95.6% (baseline: 97.8%) |
| HM | 0.98 |
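
The HM row is not defined in this card. The reported value is consistent with a harmonic mean of forgetting efficacy (1 - Forget SR) and Retain SR, but that definition is an assumption here; the full report is the authoritative source. A minimal sketch under that assumption (the function name `harmonic_mean` is hypothetical):

```python
def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two non-negative scores; returns 0.0 if either is zero."""
    if a <= 0.0 or b <= 0.0:
        return 0.0
    return 2.0 * a * b / (a + b)


forget_sr = 0.0    # Forget SR from the table (0%)
retain_sr = 0.956  # Retain SR from the table (95.6%)

# Assumed definition: combine forgetting efficacy (1 - Forget SR) with Retain SR.
forget_efficacy = 1.0 - forget_sr
hm = harmonic_mean(forget_efficacy, retain_sr)
print(round(hm, 2))  # 0.98
```

Under this reading, a checkpoint scores well only if it both forgets the target task and keeps the remaining tasks working, since the harmonic mean is dominated by the smaller of the two terms.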
Usage
# Serve with openpi
uv run scripts/serve_policy.py --env LIBERO policy:checkpoint \
--policy.config pi05_libero --policy.dir <path_to_checkpoint>
Method
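Gradient Difference (GD): L = -L_forget + L_retain. The negated forget term performs gradient ascent on the forget task, while the retain term regularizes the update so that the remaining tasks do not suffer catastrophic forgetting.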
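
To illustrate how the two terms of the objective interact, here is a toy sketch. It is not the training code used for this checkpoint: the quadratic losses, the two-element parameter vector, and the function names are all hypothetical stand-ins for the policy's losses on the forget and retain data.

```python
def forget_loss(theta):
    # Toy stand-in for the loss on the forget task (depends on theta[0] only).
    return (theta[0] - 1.0) ** 2


def retain_loss(theta):
    # Toy stand-in for the loss on the retain tasks (depends on theta[1] only).
    return (theta[1] - 2.0) ** 2


def gd_loss(theta):
    # Gradient Difference objective: L = -L_forget + L_retain.
    return -forget_loss(theta) + retain_loss(theta)


def gd_grad(theta):
    # Analytic gradient of gd_loss with respect to theta.
    return [-2.0 * (theta[0] - 1.0), 2.0 * (theta[1] - 2.0)]


def step(theta, lr=0.1):
    # One plain gradient-descent step on the combined objective.
    grad = gd_grad(theta)
    return [t - lr * g for t, g in zip(theta, grad)]


theta = [1.5, 1.0]
new_theta = step(theta)

# Descending on L raises the forget loss and lowers the retain loss.
print(forget_loss(new_theta) > forget_loss(theta))  # True
print(retain_loss(new_theta) < retain_loss(theta))  # True
```

In this toy the two losses depend on separate parameters, so they never conflict. In a real policy they share weights, which is why the retain term is needed to limit collateral damage from the ascent on the forget task.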
Base model: pi0.5 LIBERO
See full report: experiment_report.md