Unlocking On-Policy Distillation for Any Model Family • Space • 78 likes • Improve model performance by transferring knowledge between different model families
The Smol Training Playbook • Space • 2.93k likes • The secrets to building world-class LLMs
Article: ChatML vs Harmony: Understanding the new Format from OpenAI • Aug 9, 2025 • 53 likes
meituan-longcat/LongCat-Flash-Chat • Text Generation • 562B • Updated Sep 24, 2025 • 28.3k downloads • 520 likes
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 214k downloads • 1.56k likes
yentinglin/Mistral-Small-24B-Instruct-2501-reasoning • Text Generation • 24B • Updated Apr 20, 2025 • 18 downloads • 59 likes
bartowski/DeepSeek-R1-Distill-Qwen-32B-abliterated-GGUF • Text Generation • Updated Jan 25, 2025 • 9.17k downloads • 129 likes