Running • 3.67k • The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
Runtime error • Featured • 2.95k • The Smol Training Playbook • The secrets to building world-class LLMs
deepseek-ai/DeepSeek-R1-0528-Qwen3-8B • Text Generation • 8B • Updated May 29, 2025 • 133k • 1.03k