Learning from the Self-future: On-policy Self-distillation for dLLMs Paper • 2606.18195 • Published 15 days ago • 76
EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments Paper • 2606.13681 • Published 20 days ago • 142
HRBench: Benchmarking and Understanding Thinking-Mode Switch Strategies in Hybrid-Reasoning LLMs Paper • 2605.28398 • Published May 27 • 15
WorldReasonBench: Human-Aligned Stress Testing of Video Generators as Future World-State Predictors Paper • 2605.10434 • Published May 11 • 29
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper • 2604.28185 • Published Apr 30 • 92
MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome Paper • 2603.28407 • Published Mar 30 • 72
MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier Paper • 2603.03756 • Published Mar 4 • 90
Enhancing Spatial Understanding in Image Generation via Reward Modeling Paper • 2602.24233 • Published Feb 27 • 60
VTC-R1: Vision-Text Compression for Efficient Long-Context Reasoning Paper • 2601.22069 • Published Jan 29 • 7
Language-based Trial and Error Falls Behind in the Era of Experience Paper • 2601.21754 • Published Jan 29 • 16
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published Jan 13 • 150
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning Paper • 2601.09667 • Published Jan 14 • 92
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation Paper • 2601.09688 • Published Jan 14 • 128
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published Dec 3, 2025 • 160
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20, 2025 • 96
AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization Paper • 2504.21659 • Published Apr 30, 2025 • 14
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published Dec 24, 2024 • 39
WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents Paper • 2410.07484 • Published Oct 9, 2024 • 51