Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR • Paper • 2605.15726 • Published 15 days ago • 34
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources • Paper • 2605.29250 • Published 2 days ago • 52
Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents • Paper • 2605.28775 • Published 3 days ago • 34
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning • Paper • 2605.28774 • Published 3 days ago • 76
T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search • Paper • 2603.22341 • Published Mar 21 • 37
ACON: Optimizing Context Compression for Long-horizon LLM Agents • Paper • 2510.00615 • Published Oct 1, 2025 • 35
Rethinking Reward Models for Multi-Domain Test-Time Scaling • Paper • 2510.00492 • Published Oct 1, 2025 • 28