FreedomIntelligence/TCM-Instruction-Tuning-ShizhenGPT • Updated Aug 25, 2025 • 246k • 168 • 13
The Smol Training Playbook 📚 • 3.22k • The secrets to building world-class LLMs
Article: You could have designed state of the art positional encoding • FL33TW00D-HF • Nov 25, 2024 • 488
Unlocking On-Policy Distillation for Any Model Family 📝 • 115 • Explore on-policy distillation visualization for any model
Article: Mixture of Experts Explained • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.15k