Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand • Dec 4, 2025 • 69
DeepSeek-V4: a million-token context that agents can actually use • 12 days ago • 42
Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries • Mar 10 • 142
The Smol Training Playbook 📚 • The secrets to building world-class LLMs • 3.15k
How to Use Multiple GPUs in Hugging Face Transformers: Device Map vs Tensor Parallelism • Feb 12 • 20
The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+ • Feb 3 • 53