\$OneMillion-Bench: How Far are Language Agents from Human Experts? Paper • 2603.07980 • Published 1 day ago • 20
Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability Paper • 2602.02477 • Published Feb 2 • 10