Paper: Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge (arXiv:2512.06951)
Five 30k-step, LoRA-free fine-tunes of pi05_base on LIBERO (4 task suites, 10 tasks each), trained on 8×A800 with FSDP. These are the companion checkpoints for the ablation study in github.com/Xuewei-Huang/BehaPi.
| Config | Path | Spatial | Object | Goal | LIB10 | Mean | Δ vs BL |
|---|---|---|---|---|---|---|---|
| Baseline (vanilla pi0.5) | `libero_baseline_a800_train/30000/` | 91.8 | 94.6 | 90.8 | 86.2 | 90.85 | n/a |
| T1: Per-timestamp Normalization | `t1_per_ts_norm_a800_train/30000/` | 94.0 | 96.0 | 92.4 | 88.2 | 92.65 | +1.80 |
| T2: Correlated FM + K=8 Multi-sample | `t2_corr_multi_a800_train/30000/` | 91.4 | 97.0 | 92.0 | 87.8 | 92.05 | +1.20 |
| T3: KV Transform | `t3_kv_transform_a800_train/30000/` | 90.4 | 97.8 | 92.6 | 85.0 | 91.45 | +0.60 |
| Combined (T1+T2+T3) | `combined_all_a800_train/30000/` | 93.0 | 96.0 | 92.6 | 87.8 | 92.35 | +1.50 |
Full per-suite analysis + debug stories in the code repo's RESULTS.md.
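The Mean and Δ vs BL columns are consistent with an unweighted average over the four suites and a difference against the baseline mean. The sketch below recomputes them from the per-suite numbers in the table; the names `SCORES`, `suite_mean`, and `summarize` are illustrative and not part of the BehaPi repo.

```python
# Per-suite scores copied from the table above, in the order
# (Spatial, Object, Goal, LIB10). The short config keys are illustrative.
SCORES = {
    "Baseline": (91.8, 94.6, 90.8, 86.2),
    "T1": (94.0, 96.0, 92.4, 88.2),
    "T2": (91.4, 97.0, 92.0, 87.8),
    "T3": (90.4, 97.8, 92.6, 85.0),
    "Combined": (93.0, 96.0, 92.6, 87.8),
}


def suite_mean(scores):
    """Unweighted mean over the suites, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


def summarize(scores_by_config, baseline="Baseline"):
    """Return {config: (mean, delta_vs_baseline)} for every config."""
    base = suite_mean(scores_by_config[baseline])
    summary = {}
    for name, scores in scores_by_config.items():
        mean = suite_mean(scores)
        summary[name] = (mean, round(mean - base, 2))
    return summary


if __name__ == "__main__":
    for name, (mean, delta) in summarize(SCORES).items():
        print(f"{name:<9} mean={mean:.2f}  delta={delta:+.2f}")
```

Because the mean is unweighted, a config can lead on one suite (T3 on Object, for instance) and still trail on the overall figure.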
```python
from huggingface_hub import snapshot_download

# Download only the T1 checkpoint directory from the Hub repo.
snapshot_download(
    repo_id="Xuewei-Huang/BehaPi-ckpts",
    allow_patterns="t1_per_ts_norm_a800_train/*",
    local_dir="./behapi_ckpts",
)

from openpi.policies import policy_config as pc
from openpi.training import config as cfg

# Load the matching training config and build the policy from the step-30000 checkpoint.
train_cfg = cfg.get_config("t1_per_ts_norm_a800_train")
policy = pc.create_trained_policy(train_cfg, "./behapi_ckpts/t1_per_ts_norm_a800_train/30000")
# policy.infer({"observation/image": ..., "observation/wrist_image": ..., "observation/state": ..., "prompt": ...})
```
You'll need the matching code from the corresponding branch:
- `t1_per_ts_norm_a800_train` → branch `dev/trick/per-ts-norm`
- `t2_corr_multi_a800_train` → branch `dev/trick/correlated-and-multi`
- `t3_kv_transform_a800_train` → branch `dev/trick/kv-transform`
- `combined_all_a800_train` → branch `dev/trick/combined-all`
- `libero_baseline_a800_train` → branch `dev/port-b1k-tricks`

Base checkpoint: `gs://openpi-assets/checkpoints/pi05_base/params` (Physical Intelligence)

Dataset: LIBERO (`xuewei-huang/libero` HF dataset)
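Since each checkpoint only loads with its own branch, it can help to keep the config-to-branch mapping in one place. The following is a minimal sketch: the mapping values come from the list above, while `CONFIG_TO_BRANCH` and `checkpoint_plan` are hypothetical helper names, and the default `local_dir` and `step` simply mirror the download example.

```python
# Config name -> code branch, as listed above.
CONFIG_TO_BRANCH = {
    "t1_per_ts_norm_a800_train": "dev/trick/per-ts-norm",
    "t2_corr_multi_a800_train": "dev/trick/correlated-and-multi",
    "t3_kv_transform_a800_train": "dev/trick/kv-transform",
    "combined_all_a800_train": "dev/trick/combined-all",
    "libero_baseline_a800_train": "dev/port-b1k-tricks",
}


def checkpoint_plan(config_name, local_dir="./behapi_ckpts", step=30000):
    """Collect the branch, download pattern, and local path for one config.

    Hypothetical helper: it only assembles strings and performs no
    download or git operation itself.
    """
    if config_name not in CONFIG_TO_BRANCH:
        raise KeyError(f"unknown config: {config_name!r}")
    return {
        "branch": CONFIG_TO_BRANCH[config_name],
        # Value for snapshot_download(allow_patterns=...).
        "allow_patterns": f"{config_name}/*",
        # Directory to pass to create_trained_policy after downloading.
        "checkpoint_dir": f"{local_dir}/{config_name}/{step}",
    }


if __name__ == "__main__":
    plan = checkpoint_plan("t3_kv_transform_a800_train")
    print(f"git checkout {plan['branch']}")
    print(plan["allow_patterns"], plan["checkpoint_dir"])
```

Looking up an unlisted config raises `KeyError` rather than silently pairing a checkpoint with the wrong branch.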