---
title: SpatialBench
emoji: 🧩
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "5.23.3"
app_file: app.py
pinned: true
short_description: Do LLMs Build Spatial World Models? Evidence from Maze Tasks
---
# SpatialBench
Evaluation platform for **"Do LLMs Build Spatial World Models? Evidence from Grid-World Maze Tasks"** (ICLR 2026 Workshop).

Three tasks probe whether LLMs construct internal spatial representations:

| Task | Type | Description |
|------|------|-------------|
| **Maze Navigation** | Planning | Find shortest path from start to goal |
| **Sequential Point Reuse** | Reasoning | Q3 repeats Q0: do models reuse earlier computation? |
| **Compositional Distance** | Reasoning | Compose corner→center distances for Q2 |

Models evaluated: Gemini 2.5 Flash, GPT-5 Mini, Claude Haiku 4.5, DeepSeek Chat.
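
For readers who want a reference point for the Maze Navigation task, a shortest path on a grid maze can be computed with breadth-first search. The sketch below is illustrative only: the grid encoding (`#` for walls, any other character for open cells), the function name `shortest_path_length`, and 4-directional movement are assumptions, not the benchmark's actual format or scoring code.

```python
from collections import deque


def shortest_path_length(grid, start, goal):
    """Return the step count of a shortest path, or None if the goal is unreachable.

    Assumed (hypothetical) encoding: `grid` is a list of equal-length strings,
    '#' marks a wall, and moves are up/down/left/right.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])  # (cell, distance from start)
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and grid[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


# Toy maze, not taken from the benchmark: S = start, G = goal.
maze = [
    "S.#",
    ".##",
    "..G",
]
print(shortest_path_length(maze, (0, 0), (2, 2)))
```

Because breadth-first search visits cells in order of distance from the start, the first time it reaches the goal is guaranteed to be along a shortest path, which makes it a convenient ground truth to compare a model's answer against.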