yyuncong committed · Commit 060dcf7 · verified · 1 Parent(s): da0409c

Create README.md

Files changed (1)
  1. README.md +35 -0
README.md ADDED
@@ -0,0 +1,35 @@
+ ---
+ license: mit
+ pipeline_tag: image-to-video
+ library_name: diffusers
+ ---
+
+ <br/>
+ <h1 align="center" style="font-size: 1.7rem">MindJourney: Test-Time Scaling with World Models for Spatial Reasoning</h1>
+ <p align="center">
+ NeurIPS 2025
+ </p>
+ <p align="center">
+ <a href="https://yyuncong.github.io/">Yuncong Yang</a>,
+ <a href="https://jiagengliu02.github.io/">Jiageng Liu</a>,
+ <a href="https://cozheyuanzhangde.github.io/">Zheyuan Zhang</a>,
+ <a href="https://rainbow979.github.io/">Siyuan Zhou</a>,
+ <a href="https://cs-people.bu.edu/rxtan/">Reuben Tan</a>,
+ <a href="https://jwyang.github.io/">Jianwei Yang</a>,
+ <a href="https://yilundu.github.io/">Yilun Du</a>,
+ <a href="https://people.csail.mit.edu/ganchuang">Chuang Gan</a>
+ </p>
+ <p align="center">
+ <a href="https://arxiv.org/abs/2507.12508">
+ <img src='https://img.shields.io/badge/Paper-PDF-red?style=flat&logo=arXiv&logoColor=red' alt='Paper PDF'>
+ </a>
+ <a href='https://umass-embodied-agi.github.io/MindJourney/' style='padding-left: 0.5rem;'>
+ <img src='https://img.shields.io/badge/Project-Page-blue?style=flat&logo=Google%20chrome&logoColor=blue' alt='Project Page'>
+ </a>
+ </p>
+
+
+ </p>
+
+
+ MindJourney is a test-time scaling framework that leverages the 3D imagination capability of World Models to strengthen spatial reasoning in Vision-Language Models (VLMs). We evaluate on the SAT dataset and provide three pipelines: a baseline, a spatial beam search based on Stable Virtual Camera (SVC), and a spatial beam search based on a Search World Model (SWM).
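
The README names a "spatial beam search" but does not describe it. As a rough illustration only, the sketch below shows a generic beam search over sequences of camera actions. It is not the authors' implementation: the action names, the `spatial_beam_search` function, and the `toy_score` scorer are all hypothetical. In a real pipeline the scorer would presumably render an imagined view with a world model and ask a VLM to rate it, which is an assumption not confirmed by this commit.

```python
from typing import Callable, List, Sequence, Tuple

Action = str
Trajectory = Tuple[Action, ...]
Scored = Tuple[float, Trajectory]


def spatial_beam_search(
    actions: Sequence[Action],
    score: Callable[[Trajectory], float],
    beam_width: int = 2,
    max_steps: int = 3,
) -> Scored:
    """Generic beam search over camera-action sequences (hypothetical sketch).

    Returns the best (score, trajectory) pair seen at any depth.
    """
    # Start from the empty trajectory, i.e. the original viewpoint.
    beam: List[Scored] = [(score(()), ())]
    explored: List[Scored] = list(beam)

    for _ in range(max_steps):
        # Expand every trajectory in the beam by one more action.
        candidates: List[Scored] = [
            (score(traj + (a,)), traj + (a,))
            for _, traj in beam
            for a in actions
        ]
        # Keep only the highest-scoring `beam_width` expansions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
        explored.extend(beam)

    return max(explored, key=lambda c: c[0])


def toy_score(traj: Trajectory) -> float:
    """Stand-in scorer (assumption): reward forward moves, penalize turns."""
    forward = sum(1 for a in traj if a == "forward")
    turns = len(traj) - forward
    return forward - 0.5 * turns


# Hypothetical action set; the real pipelines may use different primitives.
ACTIONS = ("forward", "turn_left", "turn_right")
best = spatial_beam_search(ACTIONS, toy_score, beam_width=2, max_steps=3)
print(best)
```

Widening the beam or allowing more steps explores more imagined viewpoints at the cost of more scorer calls, which is the test-time compute trade-off the framework's name refers to.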