One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining Paper • 2606.30634 • Published 2 days ago • 18
Reasoning Shift: How Context Silently Shortens LLM Reasoning Paper • 2604.01161 • Published Apr 1 • 32