Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories Paper • 2606.02060 • Published 3 days ago • 34 • 2
FineVerify: Scaling Test-Time Compute with Fine-Grained Self-Verification for Agentic Search Paper • 2606.00660 • Published 5 days ago • 8 • 2
Silent Failures in Physical AI: A Literature Review of Runtime Action Authorization for Autonomous Systems Paper • 2606.00090 • Published 12 days ago • 6 • 3