ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning Paper • 2605.20342 • Published May 19 • 34
SSL: Sweet Spot Learning for Differentiated Guidance in Agentic Optimization Paper • 2601.22491 • Published Jan 30 • 12