Improve model card structure and description

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +31 -4
README.md CHANGED
@@ -2,10 +2,37 @@
  license: apache-2.0
  pipeline_tag: robotics
  ---
- This repository contains the CapVector official checkpoints.
-
- Paper: [CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models](https://arxiv.org/abs/2605.10903)
-
- Project page:https://capvector.github.io
-
- Code:https://github.com/OpenHelix-Team/CapVector
+
+ # CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models
+
+ [CapVector](https://capvector.github.io/) is a training recipe for vision-language-action (VLA) models that extracts a transferable capability vector from the parameter difference between a model finetuned with auxiliary-objective SFT and one finetuned with standard SFT. This vector is merged into a pretrained VLA to form a stronger initialization, and downstream adaptation then uses standard SFT with a lightweight orthogonal regularization loss to preserve the injected capability.
+
+ - **Paper:** [CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models](https://arxiv.org/abs/2605.10903)
+ - **Project Page:** [https://capvector.github.io](https://capvector.github.io)
+ - **Code:** [https://github.com/OpenHelix-Team/CapVector](https://github.com/OpenHelix-Team/CapVector)
+
+ ## Summary
+
+ CapVector addresses the problem that standard supervised finetuning of pretrained VLA models often fails to improve performance or reduce adaptation cost effectively. It decouples, in parameter space, the two core objectives of auxiliary-objective SFT (enhancing general capabilities and fitting task-specific action distributions) to isolate a "capability vector." Merging this vector into the pretrained parameters and finetuning with a lightweight orthogonal regularization loss yields performance comparable to auxiliary-finetuned baselines with significantly reduced computational overhead.
+
+ ## 🌟 Key Features
+ - **Efficient downstream adaptation**: CapVector recovers much of the benefit of auxiliary-objective SFT while keeping downstream overhead close to that of standard SFT.
+ - **Versatility**: CapVector works with OpenVLA-based, OpenPi-based, and StarVLA-based backbones.
+ - **Generalization**: CapVector is designed to transfer across tasks, environments, and robot embodiments.
+
+ ## Citation
+
+ If you find this work useful, please cite:
+
+ ```bibtex
+ @article{song2026capvector,
+   title   = {CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models},
+   author  = {Song, Wenxuan and Zhao, Han and Li, Fuhao and Zhou, Ziyang and Wang, Xi and Lyu, Jing and Ding, Pengxiang and Wang, Yan and Wang, Donglin and Li, Haoang},
+   journal = {arXiv preprint arXiv:2605.10903},
+   year    = {2026}
+ }
+ ```
+
+ ## Acknowledgments
+
+ CapVector builds on and interfaces with several open-source projects, including [OpenVLA-OFT](https://github.com/moojink/openvla-oft) and [OpenPI](https://github.com/Physical-Intelligence/openpi).
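The parameter-space recipe the card describes (extract a capability vector as a checkpoint difference, merge it into the pretrained weights, then regularize the downstream update to stay orthogonal to it) can be sketched as below. This is a minimal illustrative sketch, not the official implementation: the function names, the flat per-tensor representation, and the cosine-overlap form of the regularizer are all assumptions.

```python
# Hypothetical sketch of a CapVector-style recipe. Checkpoints are
# modeled as dicts mapping tensor names to flat lists of floats.
import math


def capability_vector(aux_sft, std_sft):
    """Capability vector: per-parameter difference between an
    auxiliary-objective SFT checkpoint and a standard SFT one."""
    return {k: [a - s for a, s in zip(aux_sft[k], std_sft[k])]
            for k in aux_sft}


def merge(pretrained, cap_vec, alpha=1.0):
    """Add the scaled capability vector to the pretrained weights
    to build a stronger initialization for downstream SFT."""
    return {k: [p + alpha * c for p, c in zip(pretrained[k], cap_vec[k])]
            for k in pretrained}


def orthogonal_reg(update, cap_vec, eps=1e-8):
    """Lightweight regularizer: mean squared cosine overlap between
    the downstream SFT update and the injected capability direction.
    Driving this toward zero keeps the update orthogonal to it."""
    total = 0.0
    for k in update:
        u, v = update[k], cap_vec[k]
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        total += (dot / (nu * nv + eps)) ** 2
    return total / len(update)
```

In practice the vector would live on GPU tensors and `alpha` would be tuned per backbone; the dict-of-lists form here just makes the arithmetic of the three steps explicit.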