Overview

ACE-Brain-0 is a generalist multimodal foundation model designed to unify perception, reasoning, and decision-making across diverse embodied domains, including spatial cognition, autonomous driving, low-altitude sensing, and embodied interaction. Built upon a unified multimodal large language model (MLLM) architecture, ACE-Brain-0 learns a shared spatial reasoning substrate that enables generalization across heterogeneous physical environments and agent embodiments.

Extensive evaluation across 24 benchmarks demonstrates that ACE-Brain achieves state-of-the-art or competitive performance across multiple domains, validating its effectiveness as a unified embodied intelligence model.

Key Features

  • Unified multimodal foundation model for embodied intelligence
  • Strong spatial reasoning as a universal intelligence scaffold
  • Supports diverse embodiment platforms:
    • Spatial Cognition
    • Autonomous Driving
    • Low-Altitude Sensing
    • Embodied Interaction
  • Cross-domain generalization across perception, reasoning, and planning

Performance Highlights

ACE-Brain achieves strong performance across 24 benchmarks covering Spatial Cognition, Autonomous Driving, Low-Altitude Sensing, and Embodied Interaction, consistently outperforming existing open-source embodied VLMs and remaining competitive with closed-source models.

The model shows robust capability in spatial reasoning, physical interaction understanding, task-oriented decision-making, and dynamic scene interpretation, enabling reliable performance across diverse real-world embodiment scenarios.

In driving and aerial domains, ACE-Brain demonstrates excellent performance in environment understanding, motion reasoning, and planning-aware prediction, highlighting its effectiveness in complex, large-scale, and safety-critical environments.

Despite this embodied-domain specialization, ACE-Brain maintains strong general multimodal reasoning ability, confirming that spatial-intelligence-centered training enhances overall visual-language capability rather than limiting generalization.

Spatial Benchmarks

Autonomous Driving Benchmarks

Low-Altitude Benchmarks

Embodied Benchmarks

Bold numbers indicate the best results, underlined numbers indicate the second-best results, and results marked with * are obtained using our evaluation framework.

Inference Example

from transformers import Qwen3VLForConditionalGeneration, AutoProcessor

# default: Load the model on the available device(s)
model = Qwen3VLForConditionalGeneration.from_pretrained(
    "ACE-Brain/ACE-Brain-0-8B", dtype="auto", device_map="auto"
)

processor = AutoProcessor.from_pretrained("ACE-Brain/ACE-Brain-0-8B")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# Preparation for inference
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
)
inputs = inputs.to(model.device)

# Inference: Generation of the output
generated_ids = model.generate(**inputs, max_new_tokens=128)
generated_ids_trimmed = [
    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
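The list comprehension that builds generated_ids_trimmed above removes the prompt tokens from the front of each generated sequence, so that only newly generated tokens are decoded. A minimal, self-contained sketch of that trimming step, using illustrative dummy token ids in place of real tensors:

```python
# Each output sequence from generate() begins with the prompt tokens;
# slicing off len(prompt) leaves only the newly generated tokens.
prompt_ids = [[1, 2, 3], [4, 5]]                 # dummy per-example input ids
full_outputs = [[1, 2, 3, 10, 11], [4, 5, 20]]   # prompt + generated ids

trimmed = [out[len(inp):] for inp, out in zip(prompt_ids, full_outputs)]
print(trimmed)  # → [[10, 11], [20]]
```

With real tensors the same slice works row by row, which is why the snippet zips inputs.input_ids with generated_ids before calling batch_decode.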

Citation

@misc{gong2026acebrain0spatialintelligenceshared,
      title={ACE-Brain-0: Spatial Intelligence as a Shared Scaffold for Universal Embodiments}, 
      author={Ziyang Gong and Zehang Luo and Anke Tang and Zhe Liu and Shi Fu and Zhi Hou and Ganlin Yang and Weiyun Wang and Xiaofeng Wang and Jianbo Liu and Gen Luo and Haolan Kang and Shuang Luo and Yue Zhou and Yong Luo and Li Shen and Xiaosong Jia and Yao Mu and Xue Yang and Chunxiao Liu and Junchi Yan and Hengshuang Zhao and Dacheng Tao and Xiaogang Wang},
      year={2026},
      eprint={2603.03198},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2603.03198}, 
}
Model size: 9B parameters (Safetensors, BF16)