---
license: apache-2.0
language:
- en
- zh
library_name: transformers
tags:
- robotics
- vision-language-action
- reinforcement-learning
- embodied-ai
- openpi
- rlinf
pipeline_tag: reinforcement-learning
---

# SA-VLA: Spatially-Aware Reinforcement Learning for Flow-Matching VLA Models

SA-VLA is a spatially-aware reinforcement learning approach for flow-matching Vision-Language-Action (VLA) models.
It is developed on top of the RLinf framework and targets robust embodied manipulation with stronger spatial generalization.

- 📄 Paper: https://arxiv.org/abs/2602.00743
- 🌐 Project Page: https://xupan.top/Projects/savla
- 🧩 Codebase: https://github.com/TwSphinx54/SA-VLA
- 🏗️ RL Framework: https://github.com/RLinf/RLinf
---

## Model Summary

SA-VLA fuses visual tokens and spatial tokens into geometry-aware embeddings, then optimizes the policy via:
1. **Step-level dense rewards**
2. **Spatially-conditioned exploration (SCAN)**
3. **RL fine-tuning on embodied benchmarks**

This repository provides the model weights used in the SA-VLA experiments.
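
The exact reward shaping and SCAN exploration terms are defined in the paper and codebase. Purely as an illustration of what step-level dense rewards buy over a sparse terminal reward, here is a minimal, hypothetical sketch of turning per-step rewards into discounted returns for policy optimization (the reward values and discount factor are invented for the example and are not SA-VLA's actual reward function):

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return at each timestep from per-step rewards.

    With step-level dense rewards, every timestep carries a learning signal.
    With a sparse terminal reward, all but the last entry of `rewards` are
    zero, so early steps receive credit only through heavy discounting.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Dense: small progress rewards at every step (values are illustrative).
dense = discounted_returns([0.1, 0.2, 0.3, 1.0], gamma=0.9)
# Sparse: only a terminal success reward.
sparse = discounted_returns([0.0, 0.0, 0.0, 1.0], gamma=0.9)
```

Comparing `dense` and `sparse` shows the dense variant giving every step a distinct, informative return, which is the property the step-level reward design exploits during RL fine-tuning.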

---

## Intended Use

- RL fine-tuning and evaluation for embodied manipulation tasks
- Experiments on LIBERO / LIBERO-PLUS style benchmarks
- Research on spatial reasoning in VLA post-training

> For complete environment setup, training scripts, and benchmark integration, use the full code repository:
> https://github.com/TwSphinx54/SA-VLA

---

## Quick Start (with the SA-VLA codebase)

### 1) Clone the project
```bash
git clone https://github.com/TwSphinx54/SA-VLA.git
cd SA-VLA
```

### 2) Set up the environment
Follow the RLinf setup in:
- `README.RLinf.md` (framework/environment)
- `scripts/setup_container.sh` (extra container setup)

### 3) Place weights
Put the downloaded checkpoints under:
```text
weights/
```
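
If the checkpoints are hosted on the Hugging Face Hub, they can be fetched programmatically with `huggingface_hub.snapshot_download`. A minimal sketch, assuming a *placeholder* repo id (`YOUR_ORG/SA-VLA-weights` is not a real repository — substitute the actual one) and checkpoint names taken from the weight layout below:

```python
from pathlib import Path

# Checkpoint directory names, mirroring the "Recommended Weight Layout" section.
CHECKPOINTS = ["Pi05-LIBERO", "Pi05-VGGT-LIBERO-FUSER-SFT_BF16", "RLinf-Pi05-SFT"]


def target_dir(name: str, root: str = "weights") -> Path:
    """Map a checkpoint name to its expected location under weights/."""
    return Path(root) / name


def fetch_all(repo_id: str, root: str = "weights") -> None:
    """Download each checkpoint subfolder from the Hub into `root`."""
    # Imported lazily so the path helpers above stay dependency-free.
    from huggingface_hub import snapshot_download

    for name in CHECKPOINTS:
        snapshot_download(
            repo_id=repo_id,
            allow_patterns=[f"{name}/*"],  # fetch one subfolder at a time
            local_dir=root,
        )


if __name__ == "__main__":
    for name in CHECKPOINTS:
        print(target_dir(name))
    # To actually download (needs `pip install huggingface_hub` and a real repo id):
    # fetch_all("YOUR_ORG/SA-VLA-weights")
```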

### 4) Run training / evaluation
```bash
# RL training
bash examples/embodiment/run_embodiment.sh libero_spatial_ppo_openpi_pi05

# Evaluation
bash examples/embodiment/eval_embodiment.sh libero_spatial_ppo_openpi_pi05_eval
```

---

## Recommended Weight Layout

```text
weights
|-- Pi05-LIBERO
|-- Pi05-VGGT-LIBERO-FUSER-SFT_BF16
`-- RLinf-Pi05-SFT
```
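
Before launching training, it can help to confirm the layout is in place. A small stdlib-only sketch (the directory names come from the tree above; the helper is ours, not part of the codebase):

```python
from pathlib import Path

# Expected checkpoint directories, as listed in the weight layout tree.
EXPECTED = [
    "Pi05-LIBERO",
    "Pi05-VGGT-LIBERO-FUSER-SFT_BF16",
    "RLinf-Pi05-SFT",
]


def missing_checkpoints(root: str = "weights") -> list:
    """Return the expected checkpoint directories absent under `root`."""
    base = Path(root)
    return [name for name in EXPECTED if not (base / name).is_dir()]


if __name__ == "__main__":
    gone = missing_checkpoints()
    if gone:
        print("Missing checkpoint directories:", ", ".join(gone))
    else:
        print("Weight layout looks complete.")
```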

---

## Dataset Notes

The SA-VLA experiments rely on LIBERO-family data and benchmark configs.
To switch between subset and full-set benchmarks, modify the benchmark mapping in your OpenPi LIBERO installation as documented in the main repository.

---

## Limitations

- Requires a non-trivial robotics simulation setup
- Performance depends on environment/version consistency
- Not intended for safety-critical real-world deployment without additional validation

---
106
+
107
+ ## Citation
108
+
109
+ ```bibtex
110
+ @misc{pan2026savlaspatiallyawareflowmatchingvisionlanguageaction,
111
+ title={SA-VLA: Spatially-Aware Flow-Matching for Vision-Language-Action Reinforcement Learning},
112
+ author={Xu Pan and Zhenglin Wan and Xingrui Yu and Xianwei Zheng and Youkai Ke and Ming Sun and Rui Wang and Ziwei Wang and Ivor Tsang},
113
+ year={2026},
114
+ eprint={2602.00743},
115
+ archivePrefix={arXiv},
116
+ primaryClass={cs.RO},
117
+ url={https://arxiv.org/abs/2602.00743}
118
+ }
119
+ ```
120
+
121
+ ---
122
+
123
+ ## License
124
+
125
+ Apache-2.0
126
+
127
+ ---
128
+
129
+ ## Acknowledgments
130
+
131
+ Built upon:
132
+ - RLinf: https://github.com/RLinf/RLinf
133
+ - OpenPi: https://github.com/Physical-Intelligence/openpi
134
+ - LIBERO: https://github.com/Lifelong-Robot-Learning/LIBERO
135
+ - LIBERO-PLUS: https://github.com/sylvestf/LIBERO-plus
136
+ - VGGT: https://github.com/facebookresearch/vggt