Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation Paper • 2606.06712 • Published 5 days ago