T^2PO: Uncertainty-Guided Exploration Control for Stable Multi-Turn Agentic Reinforcement Learning Paper • 2605.02178 • Published 11 days ago • 10
The Amazing Agent Race: Strong Tool Users, Weak Navigators Paper • 2604.10261 • Published 28 days ago • 7
Abstain-R1: Calibrated Abstention and Post-Refusal Clarification via Verifiable RL Paper • 2604.17073 • Published 27 days ago • 9
Abstain-R1: Calibrated Abstention and Post-Refusal Clarification via Verifiable RL Paper • 2604.17073 • Published 27 days ago • 9
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16, 2025 • 276
The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use Agents Paper • 2604.10577 • Published Apr 12 • 25
The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use Agents Paper • 2604.10577 • Published Apr 12 • 25