TACO: Tool-Augmented Credit Optimization for Agentic Tool Use • Paper • 2606.30251 • Published 5 days ago • 20
OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning • Paper • 2606.26790 • Published 9 days ago • 54
Maestro: Reinforcement Learning to Orchestrate Hierarchical Model-Skill Ensembles • Paper • 2605.22177 • Published May 21 • 21
Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning • Paper • 2601.20209 • Published Jan 28 • 24
Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning • Paper • 2601.03872 • Published Jan 7 • 45