Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning • Paper • 2511.16043 • Published Nov 20, 2025 • 109
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought • Paper • 2511.02779 • Published Nov 4, 2025 • 59
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models • Paper • 2410.10139 • Published Oct 14, 2024 • 51
RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models • Paper • 2407.05131 • Published Jul 6, 2024 • 26