Unlocking On-Policy Distillation for Any Model Family: Improve model performance by transferring knowledge between different model families
Article: Exploring Direct Tensor Manipulation in Language Models: A Case Study in Binary-Level Model Enhancement (Nov 7, 2025)