---
license: cc-by-nc-sa-4.0
language:
- en
- zh
size_categories:
- n>1T
tags:
- robotics
- real-world
- dual-arm
- whole body control
- manipulation
datasets:
- OpenGalaxea/Galaxea-Open-World-Dataset
---
# 🚀 Galaxea Open-World Dataset and G0 Dual-System VLA Model
[Project Page](https://opengalaxea.github.io/G0/) | [arXiv](https://arxiv.org/abs/2509.00576) | [Dataset Visualizer](https://opengalaxea.github.io/G0/visualizer/index.html) | [ModelScope](https://www.modelscope.cn/organization/Galaxea) | [X](https://x.com/Galaxea_x) | [LinkedIn](https://www.linkedin.com/company/galaxeadynamics/posts/?feedView=all&viewAsMember=true)
**G0-VLA architecture and training pipeline.** Stage 1 pre-trains a vision-language model on cross-embodiment data in an autoregressive manner. Stage 2 and post-training share the same model structure: both are trained on Galaxea open-world data with embodiment-specific views and with both high-level and subtask instructions, supervising the Action Transformer's action reconstruction with a flow-matching loss.
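The flow-matching supervision described above can be sketched roughly as follows. This is an illustrative sketch only, not the G0 implementation: the names `action_transformer`, `obs_tokens`, and `actions`, and the linear interpolation schedule, are all assumptions.

```python
import torch
import torch.nn.functional as F

def flow_matching_loss(action_transformer, obs_tokens, actions):
    """Illustrative flow-matching loss for action reconstruction.

    `action_transformer(obs_tokens, x_t, t)` is a hypothetical interface
    that predicts a velocity field for noisy actions x_t at time t.
    """
    # Sample an interpolation time t in [0, 1] for each batch element.
    t = torch.rand(actions.shape[0], 1, 1)
    noise = torch.randn_like(actions)
    # Linearly interpolate between pure noise (t=0) and the data (t=1).
    x_t = (1.0 - t) * noise + t * actions
    # The target velocity field points from the noise toward the data.
    target = actions - noise
    pred = action_transformer(obs_tokens, x_t, t)
    return F.mse_loss(pred, target)
```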

In this repo, you can find:
- [x] `G0_3B_base.pt`: **Model weights after Stage 2 pretraining**
- [x] `G0_3B_base_dataset_statistics`: **Statistics of the dataset used in pretraining**
## 📜 Citation
All data and code in this repo are released under the [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license. If you use our dataset or models, please cite:
```bibtex
@article{galaxea2025,
  title={Galaxea G0: Open-World Dataset and Dual-System VLA Model},
  author={Galaxea Team},
  journal={arXiv preprint arXiv:2509.00576},
  year={2025}
}
```