QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks Paper • 2605.24218 • Published May 22 • 46
When Benign Inputs Lead to Severe Harms: Eliciting Unsafe Unintended Behaviors of Computer-Use Agents Paper • 2602.08235 • Published Feb 9 • 1
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published Mar 25 • 99
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model Paper • 2509.00676 • Published Aug 31, 2025 • 85