Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation Paper • 2605.11739 • Published 8 days ago • 54
On Predictability of Reinforcement Learning Dynamics for Large Language Models Paper • 2510.00553 • Published Oct 1, 2025 • 9