Robotics · Safetensors · beingh · vla

zawnpn committed · Commit bb31ffc · verified · 1 Parent(s): 9c4a868

Update README.md

Files changed (1):
  1. README.md +32 -18
README.md CHANGED
@@ -10,23 +10,33 @@ pipeline_tag: robotics
 
 # Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization
 
 <div align="center">
 
 [![Blog](https://img.shields.io/badge/Blog-Being--H05-green)](https://research.beingbeyond.com/being-h05)
- [![arXiv](https://img.shields.io/badge/arXiv-2601.xxxxx-b31b1b.svg)](https://research.beingbeyond.com/projects/being-h05/being-h05.pdf)
 [![Models](https://img.shields.io/badge/🤗%20Hugging%20Face-Models-yellow)](https://huggingface.co/collections/BeingBeyond/being-h05)
 
 </div>
 
 Being-H0.5 is a foundational VLA model that scales human-centric learning with UniHand-2.0 and a unified action space to enable robust cross-embodiment robot control.
 
 *(For our previous Being-H0 version, please visit the [being-h0](https://github.com/BeingBeyond/Being-H/tree/being-h0) branch.)*
 
 ## News
 
- - **[2026-01-20]**: We release the **Being-H0.5** codebase! Check our [Hugging Face Model Collections](https://huggingface.co/collections/BeingBeyond/being-h05) for pretrained and post-trained models. 🔥🔥🔥
 - **[2025-08-02]**: We release the **Being-H0** codebase and pretrained models! Check our [Hugging Face Model Collections](https://huggingface.co/collections/BeingBeyond/being-h0) for more details. 🔥🔥🔥
- - **[2025-07-21]**: We publish **Being-H0**! Check our paper [here](https://arxiv.org/abs/2507.15597). 🌟🌟🌟
 
 ## Model Checkpoints
 
@@ -34,11 +44,12 @@ Download models from Hugging Face:
 
 | Model Type | Model Name | Parameters | Description |
 |------------|------------|------------|-------------|
- | **VLA Pretrained** | [Being-H05-2B](https://huggingface.co/BeingBeyond/Being-H05-2B) | 2B | Base vision-language-action model |
 | **VLA Specialist** | [Being-H05-2B_libero](https://huggingface.co/BeingBeyond/Being-H05-2B_libero) | 2B | Post-trained on LIBERO benchmark |
 | **VLA Specialist** | [Being-H05-2B_robocasa](https://huggingface.co/BeingBeyond/Being-H05-2B_robocasa) | 2B | Post-trained on RoboCasa kitchen tasks |
 | **VLA Generalist** | [Being-H05-2B_libero_robocasa](https://huggingface.co/BeingBeyond/Being-H05-2B_libero_robocasa) | 2B | Post-trained on both LIBERO and RoboCasa |
 
 ## Setup
 
@@ -137,16 +148,6 @@ torchrun --nproc_per_node=8 BeingH/train/train.py \
   --action_chunk_length 16
  ```
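The `--action_chunk_length 16` flag above sets how many future actions the policy predicts as one block. As an illustrative sketch only (not Being-H's actual implementation; `chunk_actions` and the stand-in episode are hypothetical), fixed-length chunking of an action sequence looks like:

```python
def chunk_actions(actions, chunk_length=16):
    """Split a flat per-timestep action sequence into fixed-length chunks.

    The final chunk may be shorter when the episode length is not a
    multiple of chunk_length.
    """
    return [actions[i:i + chunk_length]
            for i in range(0, len(actions), chunk_length)]

# Stand-in episode: 50 timesteps of actions (placeholder ints for brevity).
episode = list(range(50))
chunks = chunk_actions(episode)
print(len(chunks), len(chunks[0]), len(chunks[-1]))  # 4 16 2
```

A policy trained this way can emit one 16-step chunk per inference call and execute it before re-planning, which amortizes the cost of each forward pass.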
 
- ## TODO
- 
- The following features are planned for future implementation:
- 
- - [ ] Complete pretraining scripts and documentation
- - [ ] Complete post-training scripts for all benchmarks
- - [ ] Detailed training and data documentation
- - [ ] Out-of-the-box real robot pretrained checkpoints
- - [ ] Benchmark evaluation scripts for all supported tasks
- 
 ## Contributing and Building on Being-H05
 
 We encourage researchers and practitioners to leverage Being-H05 as a foundation for their own experiments and applications. Whether you're adapting Being-H05 to new robotic platforms, exploring novel manipulation tasks, or extending the model to new domains, our modular codebase is designed to support your innovations. We welcome contributions of all kinds - from bug fixes and documentation improvements to new features and model architectures. By building on Being-H05 together, we can advance the field of vision-language-action modeling and enable robots to perform more complex and diverse manipulation tasks. Join us in making robotic manipulation more capable, robust, and accessible to all.
@@ -156,6 +157,7 @@ We encourage researchers and practitioners to leverage Being-H05 as a foundation
 
 Being-H05 builds on the following excellent open-source projects:
 
 - [InternVL](https://github.com/OpenGVLab/InternVL): Vision-Language model backbone
 - [Qwen](https://github.com/QwenLM/Qwen): Language model and MoE expert
 - [LIBERO](https://github.com/Lifelong-Robot-Learning/LIBERO): Benchmark for lifelong robot learning
 - [RoboCasa](https://github.com/robocasa/robocasa): Large-scale simulation benchmark for everyday tasks
@@ -175,9 +177,21 @@ If you find our work useful, please consider citing us and give a star to our re
 
 **Being-H05**
 
 ```bibtex
- @misc{beingbeyond2026beingh05,
-   title={Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization},
-   author={BeingBeyond Team},
   year={2026}
 }
- ```
 
 # Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization
 
+ <p align="center">
+ <img src="https://raw.githubusercontent.com/BeingBeyond/Being-H/refs/heads/main/assets/being-h05.png" width="300"/>
+ </p>
+
 <div align="center">
 
 [![Blog](https://img.shields.io/badge/Blog-Being--H05-green)](https://research.beingbeyond.com/being-h05)
+ [![Paper](https://img.shields.io/badge/arXiv-Paper-b31b1b.svg)](https://arxiv.org/pdf/2601.12993)
 [![Models](https://img.shields.io/badge/🤗%20Hugging%20Face-Models-yellow)](https://huggingface.co/collections/BeingBeyond/being-h05)
+ [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](./LICENSE)
 
 </div>
 
 Being-H0.5 is a foundational VLA model that scales human-centric learning with UniHand-2.0 and a unified action space to enable robust cross-embodiment robot control.
 
+ <div align="center">
+ <video src="https://github.com/user-attachments/assets/36714389-e737-4b11-8dcf-9076cc9f1d69" controls>
+ </video>
+ </div>
+
 *(For our previous Being-H0 version, please visit the [being-h0](https://github.com/BeingBeyond/Being-H/tree/being-h0) branch.)*
 
 ## News
 
+ - **[2026-01-20]**: We publish **Being-H0.5**! Check our [Paper](https://arxiv.org/pdf/2601.12993) for technical details and our [Hugging Face Model Collections](https://huggingface.co/collections/BeingBeyond/being-h05) for pretrained and post-trained models. 🔥🔥🔥
 - **[2025-08-02]**: We release the **Being-H0** codebase and pretrained models! Check our [Hugging Face Model Collections](https://huggingface.co/collections/BeingBeyond/being-h0) for more details. 🔥🔥🔥
+ - **[2025-07-21]**: We publish **Being-H0**! Check our paper [here](https://arxiv.org/pdf/2507.15597). 🌟🌟🌟
 
 ## Model Checkpoints
 
 | Model Type | Model Name | Parameters | Description |
 |------------|------------|------------|-------------|
+ | **VLA Pretrained** | [Being-H05-2B](https://huggingface.co/BeingBeyond/Being-H05-2B) | 2B | Base vision-language-action model (preview) |
 | **VLA Specialist** | [Being-H05-2B_libero](https://huggingface.co/BeingBeyond/Being-H05-2B_libero) | 2B | Post-trained on LIBERO benchmark |
 | **VLA Specialist** | [Being-H05-2B_robocasa](https://huggingface.co/BeingBeyond/Being-H05-2B_robocasa) | 2B | Post-trained on RoboCasa kitchen tasks |
 | **VLA Generalist** | [Being-H05-2B_libero_robocasa](https://huggingface.co/BeingBeyond/Being-H05-2B_libero_robocasa) | 2B | Post-trained on both LIBERO and RoboCasa |
 
+ Note: the vision encoder uses 224×224 px input images by default.
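The 224px note means camera frames are resized to 224×224 before entering the vision tower. As a minimal, hypothetical sketch (a pure-Python nearest-neighbour resize; the project's actual pipeline is not shown here and presumably uses its own image transforms), the resizing step could look like:

```python
def resize_nearest(img, out_h=224, out_w=224):
    """Nearest-neighbour resize of an image given as a list of pixel rows."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

# Stand-in 480x640 camera frame; pixels are placeholder (row, col) tuples.
frame = [[(y, x) for x in range(640)] for y in range(480)]
obs = resize_nearest(frame)
print(len(obs), len(obs[0]))  # 224 224
```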
 
 ## Setup
 
   --action_chunk_length 16
 ```
 ## Contributing and Building on Being-H05
 
 We encourage researchers and practitioners to leverage Being-H05 as a foundation for their own experiments and applications. Whether you're adapting Being-H05 to new robotic platforms, exploring novel manipulation tasks, or extending the model to new domains, our modular codebase is designed to support your innovations. We welcome contributions of all kinds - from bug fixes and documentation improvements to new features and model architectures. By building on Being-H05 together, we can advance the field of vision-language-action modeling and enable robots to perform more complex and diverse manipulation tasks. Join us in making robotic manipulation more capable, robust, and accessible to all.
 
 Being-H05 builds on the following excellent open-source projects:
 
 - [InternVL](https://github.com/OpenGVLab/InternVL): Vision-Language model backbone
+ - [Bagel](https://github.com/ByteDance-Seed/Bagel): Training framework
 - [Qwen](https://github.com/QwenLM/Qwen): Language model and MoE expert
 - [LIBERO](https://github.com/Lifelong-Robot-Learning/LIBERO): Benchmark for lifelong robot learning
 - [RoboCasa](https://github.com/robocasa/robocasa): Large-scale simulation benchmark for everyday tasks
 
 **Being-H05**
 
 ```bibtex
+ @article{beingbeyond2026beingh05,
+   title={Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization},
+   author={Luo, Hao and Wang, Ye and Zhang, Wanpeng and Zheng, Sipeng and Xi, Ziheng and Xu, Chaoyi and Xu, Haiweng and Yuan, Haoqi and Zhang, Chi and Wang, Yiqing and Feng, Yicheng and Lu, Zongqing},
+   journal={arXiv preprint arXiv:2601.12993},
   year={2026}
 }
+ ```
+
+ **Being-H0**
+
+ ```bibtex
+ @article{beingbeyond2025beingh0,
+   title={Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos},
+   author={Luo, Hao and Feng, Yicheng and Zhang, Wanpeng and Zheng, Sipeng and Wang, Ye and Yuan, Haoqi and Liu, Jiazheng and Xu, Chaoyi and Jin, Qin and Lu, Zongqing},
+   journal={arXiv preprint arXiv:2507.15597},
+   year={2025}
+ }
+ ```