The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning Paper • 2604.06427 • Published 3 days ago • 4
Efficient RLVR Training via Weighted Mutual Information Data Selection Paper • 2603.01907 • Published Mar 2 • 14
Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning Paper • 2601.21037 • Published Jan 28 • 15
Agentic Policy Optimization via Instruction-Policy Co-Evolution Paper • 2512.01945 • Published Dec 1, 2025 • 4
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems Paper • 2508.07407 • Published Aug 10, 2025 • 99