Update README.md (#1), opened by Shalfunnn
README.md
CHANGED
@@ -12,7 +12,7 @@
 ## Model Description
 
 We introduce **WALL-OSS**, an end-to-end embodied foundation model that leverages large-scale multimodal pretraining to achieve (1) embodiment-aware vision--language understanding, (2) strong language--action association, and (3) robust manipulation capability.
-Our approach employs a tightly coupled architecture and multi-strategies training curriculum that enables
+Our approach employs a tightly coupled architecture and multi-strategies training curriculum that enables Unified Cross-Level CoT—seamlessly unifying instruction reasoning, subgoal decomposition, and fine-grained action synthesis within a single differentiable framework.
 Our results show that WALL-OSS attains high success on complex long-horizon manipulations, demonstrates strong instruction-following capabilities, complex understanding and reasoning, and outperforms strong baselines, thereby providing a reliable and scalable path from VLMs to embodied foundation models.
 
 ## Quick Start