The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 • 233 • Explore synthetic data experiments on a virtual bookshelf
Article • Mixture of Experts (MoEs) in Transformers • ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 160
The Ultra-Scale Playbook 🌌 • 3.84k • The ultimate guide to training LLMs on large GPU clusters
The Smol Training Playbook 📚 • 3.17k • The secrets to building world-class LLMs
Article • DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Training Background • NormalUhr • Feb 28, 2025 • 18