view article Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective LinkedIn • Jan 27 • 74
view article Article Ulysses Sequence Parallelism: Training with Million-Token Contexts kashif, stas • Mar 9 • 28
Running on CPU Upgrade Featured 3.17k The Smol Training Playbook 📚 3.17k The secrets to building world-class LLMs
view article Article Continuous batching from first principles +1 ror, ArthurZ, mcpotato • Nov 25, 2025 • 382
Running Featured 1.34k FineWeb: decanting the web for the finest text data at scale 🍷 1.34k Explore and download the FineWeb web‑text dataset
view article Article Vision Language Models (Better, faster, stronger) +3 merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 611
view article Article Mixture of Experts Explained +4 osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k
Running 3.84k The Ultra-Scale Playbook 🌌 3.84k The ultimate guide to training LLM on large GPU Clusters