The Ultra-Scale Playbook 🌌 • 3.86k • The ultimate guide to training LLM on large GPU Clusters
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated Apr 6 • 184k • 2.86k
argilla/ultrafeedback-binarized-preferences-cleaned • Updated Dec 11, 2023 • 60.9k • 14.7k • 162
Article • Efficient LLM Pretraining: Packed Sequences and Masked Attention • sirluk • Oct 7, 2024 • 71
The Smol Training Playbook 📚 • 3.19k • The secrets to building world-class LLMs