A^2TGPO: Agentic Turn-Group Policy Optimization with Adaptive Turn-level Clipping Paper • 2605.06200 • Published 3 days ago • 10
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 7 days ago • 145
AT^2PO: Agentic Turn-based Policy Optimization via Tree Search Paper • 2601.04767 • Published Jan 8 • 28
LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction Paper • 2509.07403 • Published Sep 9, 2025 • 35
ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents Paper • 2505.23923 • Published May 29, 2025 • 8