Shalfunnn committed on
Commit 294d753 · verified · 1 Parent(s): ad96a88

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -1,4 +1,4 @@
- # WALL-OSS: Igniting VLMs toward the Embodied Space
+ # WALL-OSS
 
  <div align="left">
 
@@ -13,13 +13,13 @@
 
  </div>
 
- ## 🤖 Model Description
+ ## [WALL-OSS: Igniting VLMs toward the Embodied Space](https://x2robot.cn-wlcb.ufileos.com/wall_oss.pdf)
 
  We introduce **WALL-OSS**, an end-to-end embodied foundation model that leverages large-scale multimodal pretraining to achieve (1) embodiment-aware vision--language understanding, (2) strong language--action association, and (3) robust manipulation capability.
  Our approach employs a tightly coupled architecture and multi-strategies training curriculum that enables Unified Cross-Level CoT—seamlessly unifying instruction reasoning, subgoal decomposition, and fine-grained action synthesis within a single differentiable framework.
  Our results show that WALL-OSS attains high success on complex long-horizon manipulations, demonstrates strong instruction-following capabilities, complex understanding and reasoning, and outperforms strong baselines, thereby providing a reliable and scalable path from VLMs to embodied foundation models.
 
- <!-- ## 🎬 Video Demos
+ ## 🎬 Video Demos
 
  <div align="center">
  <video width="80%" controls>
@@ -28,7 +28,7 @@ Our results show that WALL-OSS attains high success on complex long-horizon mani
  </video>
  <p><strong>WALL-OSS in Action: Demonstrating advanced manipulation capabilities and embodied AI performance</strong></p>
  </div>
- -->
+
 
 
  ## 🚀 Quick Start