Rethinking RL for LLM Reasoning: It's Sparse Policy Selection, Not Capability Learning • Paper • 2605.06241 • Published 7 days ago • 3
EvoClaw: Evaluating AI Agents on Continuous Software Evolution • Paper • 2603.13428 • Published Mar 13 • 21
LYNX: Learning Dynamic Exits for Confidence-Controlled Reasoning • Paper • 2512.05325 • Published Dec 5, 2025 • 5