GD^2PO: Mitigating Multi-Reward Conflicts via Group-Dynamic reward-Decoupled Policy Optimization Paper • 2606.16771 • Published 14 days ago • 13
Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill Paper • 2606.03980 • Published 27 days ago • 13
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills Paper • 2603.25158 • Published Mar 26 • 56