Update README.md
README.md
CHANGED
|
@@ -1,16 +1,19 @@
|
|
| 1 |
-
#
|
| 2 |
|
| 3 |
-
|
| 4 |
|
| 5 |
-
|
| 6 |
|
| 7 |
-
##
|
| 8 |
|
| 9 |
-
|
| 10 |
-
-
|
| 11 |
-
-
|
| 12 |
-
- **Mixture of Experts**: Utilizes MoE architecture for efficient computation
|
| 13 |
-
- **LeRobot Compatible**: Designed to work with LeRobot datasets and frameworks
|
| 14 |
|
| 15 |
## Quick Start
|
| 16 |
|
|
@@ -148,3 +151,17 @@ The repository contains:
|
|
| 148 |
- **Inference Examples**: Multiple inference scripts and evaluation tools
|
| 149 |
- **Configuration Templates**: Ready-to-use configs for different robot setups
|
| 150 |
- **Troubleshooting Guide**: Common issues and solutions
|
| 1 |
+
# WALL-OSS: Igniting VLMs toward the Embodied Space
|
| 2 |
|
| 3 |
+
<div align="left">
|
| 4 |
+
|
| 5 |
+
[](https://x2-robot.feishu.cn/file/FurYbuThcofkOqxrsy7cnzUbndd)
|
| 6 |
+
[](https://huggingface.co/x-square-robot)
|
| 7 |
+
[](https://github.com/X-Square-Robot/wall-x)
|
| 8 |
+
[](https://x2robot.com/en/research/68bc2cde8497d7f238dde690)
|
| 9 |
|
| 10 |
+
</div>
|
| 11 |
|
| 12 |
+
## Model Description
|
| 13 |
|
| 14 |
+
We introduce **WALL-OSS**, an end-to-end embodied foundation model that leverages large-scale multimodal pretraining to achieve (1) embodiment-aware vision-language understanding, (2) strong language-action association, and (3) robust manipulation capability.
|
| 15 |
+
Our approach employs a tightly coupled architecture and a multi-strategy training curriculum that enable **Unified Cross-Level CoT**, which unifies instruction reasoning, subgoal decomposition, and fine-grained action synthesis within a single differentiable framework.
|
| 16 |
+
Our results show that WALL-OSS attains high success rates on complex long-horizon manipulation tasks, demonstrates strong instruction following along with complex understanding and reasoning, and outperforms strong baselines, providing a reliable and scalable path from VLMs to embodied foundation models.
|
|
| 17 |
|
| 18 |
## Quick Start
|
| 19 |
|
|
|
|
| 151 |
- **Inference Examples**: Multiple inference scripts and evaluation tools
|
| 152 |
- **Configuration Templates**: Ready-to-use configs for different robot setups
|
| 153 |
- **Troubleshooting Guide**: Common issues and solutions
|
| 154 |
+
|
| 155 |
+
## 📚 Cite Us
|
| 156 |
+
|
| 157 |
+
If you find WALL-X or our WALL-OSS models useful, please cite:
|
| 158 |
+
|
| 159 |
+
```bibtex
|
| 160 |
+
@misc{walloss_paper_2025,
|
| 161 |
+
title = {WALL-OSS: Igniting VLMs toward the Embodied Space},
|
| 162 |
+
author = {X Square Robot},
|
| 163 |
+
year = {2025},
|
| 164 |
+
howpublished = {\url{https://x2-robot.feishu.cn/file/FurYbuThcofkOqxrsy7cnzUbndd}},
|
| 165 |
+
note = {White paper}
|
| 166 |
+
}
|
| 167 |
+
```
|