GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent Paper • 2603.13875 • Published Mar 14 • 36
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 • Running on CPU Upgrade • 233 • Explore synthetic data experiments on a virtual bookshelf
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale Paper • 2602.23866 • Published Feb 27 • 89
Article • Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek • huggingface • Jan 27 • 45
Article • From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels • drbh, danieldk • Aug 18, 2025 • 98
unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF Text Generation • 31B • Updated Jan 30 • 172k • 657
The Ultra-Scale Playbook 🌌 • Running • 3.84k • The ultimate guide to training LLM on large GPU Clusters