license: apache-2.0
pipeline_tag: robotics

CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models

CapVector is a training recipe for vision-language-action (VLA) models. It extracts a transferable capability vector from the parameter difference between a model trained with auxiliary-objective supervised finetuning (SFT) and a model trained with standard SFT. This vector is merged into a pretrained VLA to form a stronger initialization, and downstream adaptation then uses standard SFT with a lightweight orthogonal regularization loss to preserve the injected capability.
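The extract-and-merge step described above can be illustrated with a minimal sketch. It assumes model weights are available as name-to-array dictionaries and uses simple task-arithmetic-style addition. The function names and the scaling coefficient `alpha` are hypothetical, introduced only for illustration, and are not taken from the CapVector codebase.

```python
import numpy as np

def extract_capability_vector(theta_aux, theta_sft):
    """Per-parameter difference: auxiliary-objective SFT weights minus standard SFT weights."""
    return {name: theta_aux[name] - theta_sft[name] for name in theta_sft}

def merge_capability_vector(theta_pre, cap_vector, alpha=1.0):
    """Add the capability vector to pretrained weights (alpha is an assumed scaling knob)."""
    return {name: theta_pre[name] + alpha * cap_vector[name] for name in theta_pre}

# Toy weights standing in for real VLA checkpoints.
theta_pre = {"w": np.array([1.0, 2.0, 3.0])}   # pretrained
theta_sft = {"w": np.array([1.5, 2.0, 2.5])}   # after standard SFT
theta_aux = {"w": np.array([1.5, 3.0, 2.5])}   # after auxiliary-objective SFT

cap = extract_capability_vector(theta_aux, theta_sft)
theta_init = merge_capability_vector(theta_pre, cap, alpha=1.0)
```

In this toy case the two finetuned models differ only in the second coordinate, so the capability vector is nonzero only there, and the merged initialization shifts the pretrained weights along that single direction.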

Summary

CapVector addresses a common limitation: under standard supervised finetuning (SFT), pretrained VLA models often fail to improve downstream performance or reduce adaptation cost. Auxiliary-objective SFT pursues two core objectives at once: enhancing general capabilities and fitting task-specific action distributions. CapVector decouples these two objectives in parameter space to obtain a "capability vector." When this vector is merged into the pretrained parameters and downstream finetuning is augmented with a lightweight orthogonal regularization loss, the model achieves performance comparable to auxiliary-finetuned baselines with significantly reduced computational overhead.
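This card does not spell out the exact form of the orthogonal regularization loss. As a rough sketch of the idea, one plausible form (an assumption, not the authors' definition) penalizes the squared cosine between the downstream parameter update and the capability vector, so that finetuning moves mostly in directions orthogonal to the injected capability. The function name `orthogonal_penalty` and the weight `lam` are hypothetical.

```python
import numpy as np

def orthogonal_penalty(theta, theta_init, cap_vector, eps=1e-12):
    """Assumed form: squared cosine between the update (theta - theta_init) and the capability vector."""
    delta = np.concatenate([(theta[k] - theta_init[k]).ravel() for k in cap_vector])
    tau = np.concatenate([cap_vector[k].ravel() for k in cap_vector])
    cos = delta @ tau / (np.linalg.norm(delta) * np.linalg.norm(tau) + eps)
    return float(cos ** 2)

cap_vector = {"w": np.array([0.0, 1.0, 0.0])}
theta_init = {"w": np.array([1.0, 3.0, 3.0])}

# An update orthogonal to the capability vector incurs (near) zero penalty.
theta_orth = {"w": np.array([1.5, 3.0, 2.5])}
p_orth = orthogonal_penalty(theta_orth, theta_init, cap_vector)

# An update that undoes the capability vector is penalized maximally.
theta_undo = {"w": np.array([1.0, 2.0, 3.0])}
p_undo = orthogonal_penalty(theta_undo, theta_init, cap_vector)

# Hypothetical total objective: standard SFT loss plus a small weighted penalty.
lam = 0.1
sft_loss = 0.42  # placeholder value for illustration only
total_loss = sft_loss + lam * p_undo
```

Because the penalty is a single inner product and two norms over the parameters, a term of this kind adds little cost on top of standard SFT, which is consistent with the "lightweight" description above.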

🌟 Key Features

  • Efficient downstream adaptation: CapVector recovers much of the benefit of auxiliary-objective SFT methods while keeping downstream overhead close to that of standard SFT.
  • Versatility: CapVector works with OpenVLA-based, OpenPI-based, and StarVLA-based backbones.
  • Generalization: CapVector is designed to transfer across tasks, environments, and robot embodiments.

Citation

If you find this work useful, please cite:

@article{song2026capvector,
  title   = {CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models},
  author  = {Song, Wenxuan and Zhao, Han and Li, Fuhao and Zhou, Ziyang and Wang, Xi and Lyu, Jing and Ding, Pengxiang and Wang, Yan and Wang, Donglin and Li, Haoang},
  journal = {arXiv preprint arXiv:2605.10903},
  year    = {2026}
}

Acknowledgments

CapVector builds on and interfaces with several open-source projects, including OpenVLA-OFT and OpenPI.