ACE-Brain committed on
Commit f756d24 · verified · 1 Parent(s): d09e08b

Upload 9 files

.gitattributes CHANGED
@@ -34,3 +34,10 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
  tokenizer.json filter=lfs diff=lfs merge=lfs -text
37
+ assets/fig2.png filter=lfs diff=lfs merge=lfs -text
38
+ assets/radarchart.png filter=lfs diff=lfs merge=lfs -text
39
+ assets/table1.png filter=lfs diff=lfs merge=lfs -text
40
+ assets/table2.png filter=lfs diff=lfs merge=lfs -text
41
+ assets/table3.png filter=lfs diff=lfs merge=lfs -text
42
+ assets/table4.png filter=lfs diff=lfs merge=lfs -text
43
+ assets/teaser.png filter=lfs diff=lfs merge=lfs -text
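The seven added `.gitattributes` entries above route individual files under `assets/` through Git LFS. As a rough illustration, the Python sketch below checks which repository paths such rules would cover. It is an approximation rather than Git's own matcher: it relies on `fnmatchcase` from the standard library, and the helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical.

```python
from fnmatch import fnmatchcase

# Rules copied from the .gitattributes hunk shown in this commit.
GITATTRIBUTES_LINES = [
    "*.zst filter=lfs diff=lfs merge=lfs -text",
    "*tfevents* filter=lfs diff=lfs merge=lfs -text",
    "tokenizer.json filter=lfs diff=lfs merge=lfs -text",
    "assets/fig2.png filter=lfs diff=lfs merge=lfs -text",
    "assets/radarchart.png filter=lfs diff=lfs merge=lfs -text",
    "assets/table1.png filter=lfs diff=lfs merge=lfs -text",
    "assets/table2.png filter=lfs diff=lfs merge=lfs -text",
    "assets/table3.png filter=lfs diff=lfs merge=lfs -text",
    "assets/table4.png filter=lfs diff=lfs merge=lfs -text",
    "assets/teaser.png filter=lfs diff=lfs merge=lfs -text",
]


def lfs_patterns(lines):
    """Return the path pattern of every rule that sets filter=lfs."""
    patterns = []
    for line in lines:
        fields = line.split()
        if len(fields) > 1 and "filter=lfs" in fields[1:]:
            patterns.append(fields[0])
    return patterns


def is_lfs_tracked(path, patterns):
    """Approximate check: does any LFS pattern cover this repo-relative path?

    Assumption: patterns without a slash are compared against the file name
    only, patterns with a slash against the whole path. Git's real matching
    rules are richer than this.
    """
    name = path.rsplit("/", 1)[-1]
    return any(
        fnmatchcase(path if "/" in pattern else name, pattern)
        for pattern in patterns
    )


PATTERNS = lfs_patterns(GITATTRIBUTES_LINES)
for asset in ("assets/teaser.png", "assets/acebrain.png"):
    print(asset, is_lfs_tracked(asset, PATTERNS))
```

Under these assumptions `assets/acebrain.png` is not matched by any rule, which is consistent with it being the only added asset listed without a Git LFS Details entry.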
README.md CHANGED
@@ -1,3 +1,113 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ base_model:
3
+ - ACE-Brain/ACE-Brain-8B
4
+ library_name: transformers
5
+ license: mit
6
+ ---
7
+
8
+ <div align="center">
9
+ <img src="./assets/acebrain.png" width=600>
10
+ </div>
11
+
12
+ <br/>
13
+
14
+ <div align="center" style="line-height: 1;">
15
+ |
16
+ <a href="https://huggingface.co/ACE-Brain/ACE-Brain-8B" target="_blank">🤗 HuggingFace</a>
17
+ &nbsp;|
18
+ <a href="https://ace-brain.github.io/" target="_blank"> 📁 Project Page</a>
19
+ &nbsp;|
20
+ <a href="" target="_blank">📔 Technical Report</a>
21
+ &nbsp;|
22
+ <a href="https://github.com/ACE-BRAIN/ACE-Brain" target="_blank"> 🤖 GitHub</a>
23
+ &nbsp;|
24
+ <br/>
25
+ </div>
26
+
27
+ ## Overview
28
+
29
+ **ACE-Brain** is a spatial-centric multimodal foundation model designed to unify perception, reasoning, and decision-making across diverse embodied domains, including **spatial intelligence**, **embodied interaction**, **autonomous driving**, and **low-altitude sensing**. Built upon a unified multimodal large language model (MLLM) architecture, ACE-Brain learns a shared spatial reasoning substrate that enables generalization across heterogeneous physical environments and agent embodiments.
30
+
31
+ Extensive evaluation across **24** benchmarks demonstrates that ACE-Brain achieves state-of-the-art or competitive performance across multiple domains, validating its effectiveness as a unified embodied intelligence model.
32
+
33
+
34
+ <div align="center">
35
+ <img src="./assets/teaser.png" width=800>
36
+ </div>
37
+
38
+
39
+ ## Key Features
40
+
41
+
42
+ - Unified multimodal foundation model for embodied intelligence
43
+ - Strong spatial reasoning as a universal intelligence scaffold
44
+ - Supports diverse embodiment platforms:
45
+   - Spatial Intelligence
46
+   - Autonomous Driving
47
+   - UAV and Aerial Perception
48
+   - Embodied Interaction
49
+ - Cross-domain generalization across perception, reasoning, and planning
50
+ - Evaluated on 24 real-world embodied intelligence benchmarks
51
+
52
+ ## Core Capabilities
53
+
54
+ <div align="center">
55
+ <img src="./assets/fig2.png" width=800>
56
+ </div>
57
+
58
+ ## Performance Highlights
59
+
60
+ <div align="center">
61
+ <img src="./assets/radarchart.png" width=800>
62
+ </div>
63
+
64
+ ACE-Brain achieves strong performance across **24 benchmarks covering Spatial Intelligence, Embodied Interaction, Autonomous Driving, and Low-Altitude Sensing**, consistently outperforming existing open-source embodied VLMs and remaining competitive with closed-source models.
65
+
66
+ The model shows robust capability in **spatial reasoning, physical interaction understanding, task-oriented decision-making, and dynamic scene interpretation**, enabling reliable performance across diverse real-world embodiment scenarios.
67
+
68
+ In driving and aerial domains, ACE-Brain demonstrates excellent performance in **environment understanding, motion reasoning, and planning-aware prediction**, highlighting its effectiveness in complex, large-scale, and safety-critical environments.
69
+
70
+ Despite its domain specialization, ACE-Brain maintains strong general multimodal reasoning ability, confirming that spatial-centric training enhances overall vision-language intelligence rather than limiting generalization.
71
+
72
+ ### Spatial Benchmarks
73
+
74
+ <div align="center">
75
+ <img src="./assets/table1.png" width=800>
76
+ </div>
77
+
78
+
79
+ ### Autonomous Driving Benchmarks
80
+
81
+ <div align="center">
82
+ <img src="./assets/table2.png" width=800>
83
+ </div>
84
+
85
+
86
+ ### Low-Altitude Benchmarks
87
+
88
+ <div align="center">
89
+ <img src="./assets/table3.png" width=800>
90
+ </div>
91
+
92
+
93
+ ### Embodied Benchmarks
94
+
95
+ <div align="center">
96
+ <img src="./assets/table4.png" width=800>
97
+ </div>
98
+
99
+
100
+
101
+ > **Bold** numbers indicate the best results, <u>underlined</u> numbers indicate the second-best results, and results marked with \* are obtained using our evaluation framework.
102
+
103
+
104
+ ## Citation
105
+
106
+ ```bibtex
107
+ @article{gong2026acebrain,
108
+ title={ACE-Brain: A Spatial-Centric Foundation Brain for Universal Embodiments},
109
+ author={Gong, Ziyang and Luo, Zehang and Tang, Anke and Liu, Zhe and others},
110
+ journal={arXiv preprint arXiv:2502.xxxxx},
111
+ year={2026}
112
+ }
113
+ ```
assets/acebrain.png ADDED
assets/fig2.png ADDED

Git LFS Details

  • SHA256: 9f5b859c70acc499cd3ca79e2aa0d49b9e1af192589b9462c04539a306fb501c
  • Pointer size: 132 Bytes
  • Size of remote file: 1.82 MB
assets/radarchart.png ADDED

Git LFS Details

  • SHA256: 170f04c0206e8d415bcff3646fb63f8af55902bce0863dc0b122a5684a6eb600
  • Pointer size: 131 Bytes
  • Size of remote file: 940 kB
assets/table1.png ADDED

Git LFS Details

  • SHA256: c0e18e854d410594ff772d579ffda7f0484e1b52f873e1f0aded85b25006d7cc
  • Pointer size: 131 Bytes
  • Size of remote file: 208 kB
assets/table2.png ADDED

Git LFS Details

  • SHA256: c0aa61f8923f4dca1e00154271bb50e3d9e1f4a9edba455972fc4220a61427a8
  • Pointer size: 131 Bytes
  • Size of remote file: 202 kB
assets/table3.png ADDED

Git LFS Details

  • SHA256: 29b59f0f3003028f539dfbfcf2195846257ae589c3684c6240c3add104b11cf8
  • Pointer size: 131 Bytes
  • Size of remote file: 176 kB
assets/table4.png ADDED

Git LFS Details

  • SHA256: cd7ded01e3fdb35b0f2f9c9f833d0856ff743d96ad494ce27cc7be96d18aa36a
  • Pointer size: 131 Bytes
  • Size of remote file: 179 kB
assets/teaser.png ADDED

Git LFS Details

  • SHA256: db33fab4f49efa0e9efbc4bf54e59fad1b22c1b299d2ddd342e81ae7e617c877
  • Pointer size: 132 Bytes
  • Size of remote file: 1.16 MB
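
Each Git LFS Details entry above reports a SHA256, a pointer size, and the size of the remote file. The pointer is the small text file that Git stores in place of the image. The following is a minimal Python sketch of the pointer layout described in the Git LFS specification, showing why the pointer sizes listed here are 131 or 132 bytes. The exact byte counts of the images are not shown on this page, so the sizes passed in below are assumptions chosen only to have the right number of digits, and the helper name `lfs_pointer` is hypothetical.

```python
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def lfs_pointer(oid_hex: str, size_bytes: int) -> bytes:
    """Build the three-line pointer text for a file with the given SHA256 and size."""
    if len(oid_hex) != 64:
        raise ValueError("a SHA256 digest is 64 hexadecimal characters")
    text = (
        f"version {LFS_SPEC_URL}\n"
        f"oid sha256:{oid_hex}\n"
        f"size {size_bytes}\n"
    )
    return text.encode("utf-8")


# SHA256 values are copied from the entries above; the sizes are assumed.
fig2 = lfs_pointer(
    "9f5b859c70acc499cd3ca79e2aa0d49b9e1af192589b9462c04539a306fb501c",
    1_820_000,  # assumed: any 7-digit byte count, matching "1.82 MB"
)
radar = lfs_pointer(
    "170f04c0206e8d415bcff3646fb63f8af55902bce0863dc0b122a5684a6eb600",
    940_000,  # assumed: any 6-digit byte count, matching "940 kB"
)

print(len(fig2))   # 132
print(len(radar))  # 131
```

The one-byte difference comes only from the number of digits on the `size` line, which is why the two megabyte-scale images have 132-byte pointers and the kilobyte-scale ones have 131-byte pointers.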