PACED: Distillation at the Frontier of Student Competence • Paper • 2603.11178 • Published 2 days ago • 3
Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning • Paper • 2602.21420 • Published 17 days ago • 6