---
title: "LudoBench: Board Game Reasoning Benchmark"
emoji: "\U0001F3B2"
colorFrom: blue
colorTo: purple
sdk: static
pinned: false
license: mit
---

# LudoBench

A multimodal board-game reasoning benchmark that evaluates LLMs and VLMs across 5 strategy games and 3 difficulty tiers.

- 638 annotated QA pairs
- 5 games: Kingdomino, Res Arcana, Pax Renaissance, Carcassonne, Catan
- 3 tiers: Environment Perception, Rules Integration, Short-Horizon Optimization
- 9 models benchmarked across 3 modalities (None, Text, Image)
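
The games, tiers, and modalities listed above can be read as the axes of an evaluation grid. The snippet below is a minimal sketch of that grid in Python. The constant and function names (`GAMES`, `TIERS`, `MODALITIES`, `evaluation_grid`) are hypothetical and not part of any released LudoBench code, and the sketch does not assume that every cell contains questions or that every model was run on every cell.

```python
from itertools import product

# Axis values copied from the list above; the variable names are illustrative only.
GAMES = ["Kingdomino", "Res Arcana", "Pax Renaissance", "Carcassonne", "Catan"]
TIERS = ["Environment Perception", "Rules Integration", "Short-Horizon Optimization"]
MODALITIES = ["None", "Text", "Image"]


def evaluation_grid():
    """Return every (game, tier, modality) combination as a tuple."""
    return list(product(GAMES, TIERS, MODALITIES))


grid = evaluation_grid()
print(len(grid))  # 5 games * 3 tiers * 3 modalities = 45
```

Enumerating the cells this way makes it easy to group per-question results by game, tier, or modality when reporting scores.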