Instructions for using Alibaba-DAMO-Academy/RynnBrain-Nav-8B with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use Alibaba-DAMO-Academy/RynnBrain-Nav-8B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="Alibaba-DAMO-Academy/RynnBrain-Nav-8B", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Alibaba-DAMO-Academy/RynnBrain-Nav-8B", trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained("Alibaba-DAMO-Academy/RynnBrain-Nav-8B", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
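For local use, 8B-class weights are commonly loaded in half precision with automatic device placement; a minimal sketch where the dtype and device map are illustrative choices, not requirements from this card:

```python
import torch
from transformers import AutoModelForImageTextToText

# Illustrative loading options: bfloat16 roughly halves memory vs. fp32, and
# device_map="auto" lets Accelerate place the weights on the available GPU(s).
model = AutoModelForImageTextToText.from_pretrained(
    "Alibaba-DAMO-Academy/RynnBrain-Nav-8B",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the `accelerate` package
)
```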
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Alibaba-DAMO-Academy/RynnBrain-Nav-8B with vLLM:
Install from pip and serve the model:
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Alibaba-DAMO-Academy/RynnBrain-Nav-8B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Alibaba-DAMO-Academy/RynnBrain-Nav-8B",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image in one sentence."},
          {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}}
        ]
      }
    ]
  }'
```
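Because the server speaks the OpenAI-compatible API, the official `openai` Python client works in place of curl; a minimal sketch, assuming the vLLM server above is running on localhost:8000 (the `api_key` value is a placeholder, since vLLM does not require one by default):

```python
from openai import OpenAI

# Point the client at the local vLLM server started above.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Alibaba-DAMO-Academy/RynnBrain-Nav-8B",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```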
Use Docker
```sh
docker model run hf.co/Alibaba-DAMO-Academy/RynnBrain-Nav-8B
```
- SGLang
How to use Alibaba-DAMO-Academy/RynnBrain-Nav-8B with SGLang:
Install from pip and serve the model:
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Alibaba-DAMO-Academy/RynnBrain-Nav-8B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Alibaba-DAMO-Academy/RynnBrain-Nav-8B",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image in one sentence."},
          {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}}
        ]
      }
    ]
  }'
```
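The same OpenAI-compatible endpoint also supports streaming responses; a minimal sketch, assuming the SGLang server above is running on localhost:30000 (the prompt is an illustrative placeholder):

```python
from openai import OpenAI

# SGLang exposes the same OpenAI-compatible API, here on port 30000.
client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

# Stream tokens as they are generated (text-only request for brevity).
stream = client.chat.completions.create(
    model="Alibaba-DAMO-Academy/RynnBrain-Nav-8B",
    messages=[{"role": "user", "content": "Describe the Statue of Liberty in one sentence."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```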
Use Docker images
```sh
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Alibaba-DAMO-Academy/RynnBrain-Nav-8B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Alibaba-DAMO-Academy/RynnBrain-Nav-8B",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image in one sentence."},
          {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}}
        ]
      }
    ]
  }'
```
- Docker Model Runner
How to use Alibaba-DAMO-Academy/RynnBrain-Nav-8B with Docker Model Runner:
```sh
docker model run hf.co/Alibaba-DAMO-Academy/RynnBrain-Nav-8B
```
RynnBrain: Open Embodied Foundation Models
If you like our project, please give us a star ⭐ on GitHub to stay up to date with the latest updates.
📰 News
- [2026.02.02] Released the RynnBrain family weights and inference code.
- [2026.02.02] Added cookbooks for cognition, localization, reasoning, and planning.
✨ Introduction
RynnBrain aims to serve as a physics-aware embodied brain: it observes egocentric scenes, grounds language to physical space and time, and supports downstream robotic systems with reliable localization and planning outputs.
Key Highlights
- **Comprehensive egocentric understanding**: Strong spatial comprehension and egocentric cognition across embodied QA, counting, OCR, and fine-grained video understanding (a minimal embodied-QA query sketch follows this list).
- **Diverse spatiotemporal localization**: Locates objects and target areas and predicts trajectories across long episodic context, enabling global spatial awareness.
- **Physical-space grounded reasoning (RynnBrain family)**: The broader RynnBrain family includes "Thinking" variants that interleave textual reasoning with spatial grounding to anchor reasoning in reality.
- **Physics-aware precise planning (RynnBrain family)**: Integrates localized affordances, areas, and objects into planning outputs to provide downstream VLA models with precise instructions.
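As a concrete illustration of the embodied-QA use case, the Transformers pipeline shown earlier can be queried with an egocentric frame; in this sketch the image URL and question are illustrative placeholders, not from the model card:

```python
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="Alibaba-DAMO-Academy/RynnBrain-Nav-8B", trust_remote_code=True)

# Hypothetical egocentric frame and counting question, for illustration only.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/egocentric_frame.jpg"},  # placeholder URL
            {"type": "text", "text": "How many chairs are visible in this room?"},
        ],
    },
]
print(pipe(text=messages))
```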
🌎 Model Zoo
| Model | Base Model | Hugging Face | ModelScope |
|---|---|---|---|
| RynnBrain-2B | Qwen3-VL-2B-Instruct | Link | Link |
| RynnBrain-4B | Qwen3-VL-4B-Instruct | Link | Link |
| RynnBrain-8B | Qwen3-VL-8B-Instruct | Link | Link |
| RynnBrain-30B-A3B | Qwen3-VL-30B-A3B-Instruct | Link | Link |
| RynnBrain-CoP-8B | RynnBrain-8B | Link | Link |
| RynnBrain-Plan-8B | RynnBrain-8B | Link | Link |
| RynnBrain-Plan-30B-A3B | RynnBrain-30B-A3B | Link | Link |
| RynnBrain-Nav-8B (This Checkpoint) | RynnBrain-8B | Link | Link |
🚀 Main Results
🤖 Quick Start
Minimal dependencies:
```sh
pip install transformers==4.57.1
```
Run text generation:
```python
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "Alibaba-DAMO-Academy/RynnBrain-Nav-8B"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(model_id, trust_remote_code=True)
# ... build messages and generate (completed in the sketch below)
```
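A hedged completion of the snippet above, mirroring the chat-template flow from the Transformers section (the prompt text is an illustrative placeholder):

```python
# Continues from the snippet above: build a text-only chat and generate.
messages = [
    {"role": "user", "content": [{"type": "text", "text": "What should a navigation agent do at a dead end?"}]},
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```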
Cookbooks
Check out the cookbooks that showcase RynnBrain's capabilities in cognition, localization, reasoning, and planning.
| Category | Cookbook name | Description |
|---|---|---|
| Planning | 11_visual_language_navigation.ipynb | Combines vision and language instructions to perform navigation and path planning. |
📑 Citation
If you find RynnBrain useful for your research and applications, please cite using this BibTeX:
```bibtex
@article{damo2026rynnbrain,
  title={RynnBrain: Open Embodied Foundation Models},
  author={Ronghao Dang and Jiayan Guo and Bohan Hou and Sicong Leng and Kehan Li and Xin Li and Jiangpin Liu and Yunxuan Mao and Zhikai Wang and Yuqian Yuan and Minghao Zhu and Xiao Lin and Yang Bai and Qian Jiang and Yaxi Zhao and Minghua Zeng and Junlong Gao and Yuming Jiang and Jun Cen and Siteng Huang and Liuyi Wang and Wenqiao Zhang and Chengju Liu and Jianfei Yang and Shijian Lu and Deli Zhao},
  journal={arXiv preprint arXiv:2602.14979v1},
  year={2026},
  url={https://arxiv.org/abs/2602.14979v1}
}
```