Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling Paper • 2605.13301 • Published May 13 • 165
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published Apr 29 • 112
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published Apr 14 • 113
Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision Paper • 2604.12002 • Published Apr 13 • 12