Unlocking On-Policy Distillation for Any Model Family 📝 • Running • 79 • Improve model performance by transferring knowledge between different model families
The Smol Training Playbook 📚 • Featured • Running on CPU Upgrade • 2.94k • The secrets to building world-class LLMs
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 224k • 1.56k
yentinglin/Mistral-Small-24B-Instruct-2501-reasoning Text Generation • 24B • Updated Apr 20, 2025 • 16 • 59
bartowski/DeepSeek-R1-Distill-Qwen-32B-abliterated-GGUF Text Generation • Updated Jan 25, 2025 • 9.02k • 129