Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 11 days ago • 186
Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation Paper • 2605.11739 • Published 10 days ago • 55
Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 16 days ago • 186
Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design Paper • 2603.28376 • Published Mar 30 • 24