Asymmetric On-Policy Distillation: Bridging Exploitation and Imitation at the Token Level Paper • 2605.06387 • Published 6 days ago • 2
TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas Paper • 2603.16448 • Published Mar 17 • 58
DoVer: Intervention-Driven Auto Debugging for LLM Multi-Agent Systems Paper • 2512.06749 • Published Dec 7, 2025 • 28