Reinforcement Learning Foundations for Deep Research Systems: A Survey • Paper • 2509.06733 • Published Sep 8, 2025 • 32
DivMerge: A divergence-based model merging method for multi-tasking • Paper • 2509.02108 • Published Sep 2, 2025 • 25
MolmoAct: Action Reasoning Models that can Reason in Space • Paper • 2508.07917 • Published Aug 11, 2025 • 44
WebSailor: Navigating Super-human Reasoning for Web Agent • Paper • 2507.02592 • Published Jul 3, 2025 • 123
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development • Paper • 2506.05010 • Published Jun 5, 2025 • 80
Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published May 6, 2025 • 188