---
title: LudoBench
short_description: Multimodal Game Reasoning Benchmark [ICLR 2026]
emoji: 🎲
colorFrom: blue
colorTo: purple
sdk: static
pinned: false
license: mit
---

LudoBench

A multimodal board-game benchmark that evaluates LLM and VLM reasoning across 5 strategy games and 3 difficulty tiers.

  • 638 annotated QA pairs
  • 5 games: Kingdomino, Res Arcana, Pax Renaissance, Carcassonne, Catan
  • 3 tiers: Environment Perception, Rules Integration, Short-Horizon Optimization
  • 9 models benchmarked across 3 modalities (None, Text, Image)
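
The breakdown above suggests scores are most naturally read per game, tier, and modality. The following is a minimal sketch of that aggregation, not the benchmark's actual evaluation code: the `Result` record layout and the `accuracy_by_cell` helper are hypothetical names introduced here for illustration, and the toy data is made up. Only the game, tier, and modality names come from the lists above.

```python
from collections import defaultdict
from dataclasses import dataclass

# Names copied from the lists above.
GAMES = ["Kingdomino", "Res Arcana", "Pax Renaissance", "Carcassonne", "Catan"]
TIERS = ["Environment Perception", "Rules Integration", "Short-Horizon Optimization"]
MODALITIES = ["None", "Text", "Image"]


@dataclass(frozen=True)
class Result:
    """One graded answer. Hypothetical layout, not the benchmark's schema."""
    game: str
    tier: str
    modality: str
    correct: bool


def accuracy_by_cell(results):
    """Return {(game, tier, modality): fraction correct} for cells with data."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r in results:
        key = (r.game, r.tier, r.modality)
        totals[key] += 1
        hits[key] += int(r.correct)
    return {key: hits[key] / totals[key] for key in totals}


# Toy data for illustration only; these are not benchmark results.
toy = [
    Result("Catan", "Rules Integration", "Text", True),
    Result("Catan", "Rules Integration", "Text", False),
    Result("Kingdomino", "Environment Perception", "Image", True),
]
scores = accuracy_by_cell(toy)
```

Keying on the full `(game, tier, modality)` tuple keeps the finest granularity; coarser views (per tier, per game) can be derived by summing counts over the remaining fields.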