DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 254
Scaling Language-Centric Omnimodal Representation Learning Paper • 2510.11693 • Published Oct 13, 2025 • 102
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources Paper • 2509.21268 • Published Sep 25, 2025 • 104
GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning Paper • 2509.17437 • Published Sep 22, 2025 • 17
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6, 2025 • 129
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning Paper • 2507.22607 • Published Jul 30, 2025 • 47
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM Paper • 2501.00599 • Published Dec 31, 2024 • 46
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World? Paper • 2506.05287 • Published Jun 5, 2025 • 14
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8, 2025 • 113
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning Paper • 2506.09513 • Published Jun 11, 2025 • 101