| model_name | score | VSI [66] | SITE [57] | MMSI [68] | OmniSpatial [23] | MindCube ∗ [69] | STARE [32] | CoreCognition [33] | SpatialViz [55] | source_title | source_url | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Human | 79.2 | 79.2 | 67.5 | 97.2 | 92.63 | 94.55 | 96.5 | 86.98 | 82.46 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| Qwen3-8B-Instruct [65] | 57.9 | 57.9 | 45.83 | 31.1 | 45.73 | 29.42 | 39.76 | 69.67 | 17.54 † | EASI | https://arxiv.org/pdf/2508.13142.pdf | † indicates cases where generations were truncated due to overlong chains of thought, yielding no final answer; such instances are counted as incorrect, which depresses the score. |
| InternVL3.5-8B [56] | 56.05 | 56.05 | 43.79 | 27.3 | 46.71 | 42.5 | 40.18 | 66.4 | 23.98 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| GPT-5-2025-08-07 [45] | 55.03 | 55.03 | 61.88 | 41.8 | 59.9 | 56.3 | 54.59 | 84.37 | 51.27 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| Gemini-2.5-pro-2025-06 [52] | 53.57 | 53.57 | 57.06 | 38 | 55.38 | 57.6 | 49.14 | 76.7 | 42.71 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| Seed-1.6-2025-06-15 [51] | 49.91 | 49.91 | 54.61 | 38.3 | 49.32 | 48.75 | 46.06 | 77.17 | 34.58 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| GPT-5-mini-2025-08-07 [45] | 48.67 | 48.67 | 52.47 | 34.1 | 55.52 | 56.69 | 52.51 | 77.77 | 44.66 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| Grok-4-2025-07-09 [62] | 47.92 | 47.92 | 47.01 | 37.8 | 46.84 | 63.56 | 26.9 | 79.27 | 19.40 † | EASI | https://arxiv.org/pdf/2508.13142.pdf | † indicates cases where generations were truncated due to overlong chains of thought, yielding no final answer; such instances are counted as incorrect, which depresses the score. |
| InternVL3-78B [79] | 47.55 | 47.55 | 52.72 | 30.5 | 50.95 | 49.52 | 42 | 71.16 | 31.10 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| GPT-5-nano-2025-08-07 [45] | 43.22 | 43.22 | 35.81 | 28.9 | 47.81 | 41.48 | 46.05 | 67.92 | 35.59 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| InternVL3-8B [79] | 42.14 | 42.14 | 41.15 | 28 | 46.25 | 41.54 | 41.36 | 60.92 | 30.00 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| Qwen2.5-VL-72B-Instruct [1] | 35.77 | 35.77 | 47.41 | 32.5 | 47.81 | 42.4 | 38.37 | 69.22 | 32.54 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| Random Choice | 34 | 34 | 0 | 25 | 24.98 | 32.35 | 34.8 | 33.93 | 25.08 | EASI | https://arxiv.org/pdf/2508.13142.pdf | For VSI, the Random Choice value is the Chance Level (Frequency) baseline. |
| Qwen2.5-VL-7B-Instruct [1] | 32.3 | 32.3 | 37.64 | 26.8 | 39.07 | 36.05 | 35.03 | 62.16 | 26.78 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
| Qwen2.5-VL-3B-Instruct [1] | 27 | 27 | 33.14 | 28.6 | 42.47 | 37.6 | 37.83 | 60.19 | 21.86 | EASI | https://arxiv.org/pdf/2508.13142.pdf | None |
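
The † note in the table describes a scoring rule: a generation that is truncated before producing a final answer is counted as incorrect, which lowers the reported score. The sketch below illustrates that rule only. The record format (`prediction`, `answer`), the function name, and the sample data are hypothetical and are not taken from the EASI evaluation code.

```python
def accuracy_with_truncation(records):
    """Percent accuracy; a record with no final answer counts as incorrect.

    Each record is a dict with a hypothetical layout:
      "prediction": the model's final answer, or None if generation was truncated
      "answer":     the ground-truth answer
    """
    if not records:
        raise ValueError("no records to score")
    correct = sum(
        1
        for r in records
        if r["prediction"] is not None and r["prediction"] == r["answer"]
    )
    # Truncated records stay in the denominator, so they depress the score.
    return 100.0 * correct / len(records)


# Hypothetical sample: four questions, one truncated generation.
records = [
    {"prediction": "A", "answer": "A"},
    {"prediction": "B", "answer": "C"},
    {"prediction": None, "answer": "D"},  # truncated, no final answer
    {"prediction": "D", "answer": "D"},
]

score_all = accuracy_with_truncation(records)

# For comparison only: accuracy over the questions that did get an answer.
answered = [r for r in records if r["prediction"] is not None]
score_answered = accuracy_with_truncation(answered)

print(f"truncated counted as incorrect: {score_all:.2f}")
print(f"answered questions only:        {score_answered:.2f}")
```

The gap between the two numbers shows why a †-marked cell may understate what a model scores on the questions it actually finishes.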