File size: 3,989 Bytes
b83d1de
 
 
 
 
 
 
 
f4e0c96
b83d1de
 
 
 
 
 
 
 
 
 
 
 
 
 
 
ab43a6a
b83d1de
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
a2c3b07
 
 
 
 
 
 
 
 
b83d1de
 
 
 
f4e0c96
b83d1de
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-VL-3B-Instruct
---

# SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

<a href="https://arxiv.org/abs/2604.14144" target="_blank">
    <img alt="Paper" src="https://img.shields.io/badge/arXiv-SpatialEvo-red?logo=arxiv" height="20" />
</a>
<a href="https://github.com/ZJU-REAL/SpatialEvo" target="_blank">
    <img alt="Code" src="https://img.shields.io/badge/Code-SpatialEvo-white?logo=github" height="20" />
</a>
<a href="https://huggingface.co/lidingm/SpatialEvo-7B" target="_blank">
    <img alt="Model" src="https://img.shields.io/badge/%F0%9F%A4%97%20_Model-SpatialEvo_7B-ffc107?color=ffc107&logoColor=white" height="20" />
</a>
<a href="https://huggingface.co/datasets/lidingm/SpatialEvo-160K" target="_blank">
    <img alt="Data" src="https://img.shields.io/badge/%F0%9F%A4%97%20_Data-SpatialEvo_160k-ffc107?color=ffc107&logoColor=white" height="20" />
</a>

## SpatialEvo-3B


This repository contains **SpatialEvo-3B**, introduced in [SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments](https://arxiv.org/abs/2604.14144).

## Model Description

SpatialEvo-3B is fine-tuned from **Qwen2.5-VL-3B-Instruct** using the SpatialEvo self-evolving framework. Instead of relying on manually annotated datasets or model voting to construct pseudo-labels, SpatialEvo leverages a **Deterministic Geometric Environment (DGE)** that programmatically computes exact ground truth from 3D point clouds and camera poses, enabling zero-noise online reinforcement learning across 16 spatial reasoning task categories.

A single shared-parameter policy co-evolves as both a **Questioner** and a **Solver** under GRPO optimization, while a lightweight **Task Scheduler** drives adaptive curriculum learning based on historical accuracy — without any manual stage design or human annotation.

## Performance

| Benchmark | Baseline | SpatialLadder | SpaceR | SpatialSSRL | **SpatialEvo (Ours)** |
|-----------|----------|---------------|--------|-------------|----------------------|
| VSI-Bench | 28.1 | **45.7** | 36.0 | 28.0 | 39.2 |
| RealWorldQA | 63.4 | 57.1 | 61.4 | 65.4 | **66.5** |
| EmbSpatial | 55.9 | 57.6 | 55.6 | 59.8 | **61.2** |
| SpatialViz | 24.2 | 28.6 | **31.9** | 25.9 | 25.4 |
| STARE | 33.1 | 26.4 | 36.8 | 36.8 | **36.9** |
| CoreCognition | 56.8 | **58.3** | 29.1 | 57.6 | 57.4 |
| ViewSpatial | 36.2 | 43.0 | 35.9 | 38.4 | **42.3** |
| V-STAR | 74.9 | 36.7 | 75.4 | **77.0** | 75.4 |
| MMStar | 54.6 | 45.8 | 44.9 | **56.5** | 55.2 |
| **AVG** | 47.5 | 44.4 | 45.2 | 49.5 | **51.1** |

All baselines are evaluated on Qwen2.5-VL-3B. **Bold** denotes the best result per benchmark.

## Usage

```python
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model = Qwen2_5_VLForConditionalGeneration.from_pretrained("lidingm/SpatialEvo-3B")
processor = AutoProcessor.from_pretrained("lidingm/SpatialEvo-3B")
```

## Citation

If you find SpatialEvo useful, please consider citing our work:

```bibtex
@misc{li2026spatialevoselfevolvingspatialintelligence,
      title={SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments}, 
      author={Dinging Li and Yingxiu Zhao and Xinrui Cheng and Kangheng Lin and Hongbo Peng and Hongxing Li and Zixuan Wang and Yuhong Dai and Haodong Li and Jia Wang and Yukang Shi and Liang Zhao and Jianjian Sun and Zheng Ge and Xiangyu Zhang and Weiming Lu and Jun Xiao and Yueting Zhuang and Yongliang Shen},
      year={2026},
      eprint={2604.14144},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2604.14144}, 
}
```

## Related Resources

- 📄 [Paper](https://arxiv.org/abs/2604.14144)
- 💻 [GitHub Repository](https://github.com/ZJU-REAL/SpatialEvo)
- 🤗 [SpatialEvo-7B](https://huggingface.co/lidingm/SpatialEvo-7B)
- 🤗 [SpatialEvo-160K Dataset](https://huggingface.co/datasets/lidingm/SpatialEvo-160K)