The Smol Training Playbook 📚 • The secrets to building world-class LLMs • 2.93k
moonshotai/Kimi-Linear-48B-A3B-Instruct Text Generation • 49B • Updated Dec 16, 2025 • 37.6k • 530
OpenGVLab/InternVL3_5-241B-A28B Image-Text-to-Text • 241B • Updated Aug 29, 2025 • 1.45k • 134
Article • You could have designed state of the art positional encoding • Nov 25, 2024 • 442
Article • A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons • Feb 4, 2025 • 28
moonshotai/Kimi-VL-A3B-Thinking-2506 Image-Text-to-Text • 16B • Updated Aug 18, 2025 • 166k • 339
The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLM on large GPU Clusters • 3.66k
Article • nanoVLM: The simplest repository to train your VLM in pure PyTorch • May 21, 2025 • 251