The PokeAgent Challenge: Competitive and Long-Context Learning at Scale Paper • 2603.15563 • Published 27 days ago • 10
Ego4D: Around the World in 3,000 Hours of Egocentric Video Paper • 2110.07058 • Published Oct 13, 2021 • 1
ICONS: Influence Consensus for Vision-Language Data Selection Paper • 2501.00654 • Published Dec 31, 2024
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains? Paper • 2410.03859 • Published Oct 4, 2024 • 1
Explain Before You Answer: A Survey on Compositional Visual Reasoning Paper • 2508.17298 • Published Aug 24, 2025 • 4
Beyond Objects: Contextual Synthetic Data Generation for Fine-Grained Classification Paper • 2510.24078 • Published Oct 28, 2025 • 3
GameDevBench: Evaluating Agentic Capabilities Through Game Development Paper • 2602.11103 • Published Feb 11 • 15
LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra Paper • 2507.15815 • Published Jul 21, 2025 • 7
COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning Paper • 2504.21850 • Published Apr 30, 2025 • 27
FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning Paper • 2406.02081 • Published Jun 4, 2024
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs Paper • 2406.18521 • Published Jun 26, 2024 • 30