Import HTML embeds
app/src/content/article.mdx
CHANGED
@@ -48,6 +48,8 @@ tableOfContentsAutoCollapse: true
 pdfProOnly: false
 ---
 
+import HtmlEmbed from '../components/HtmlEmbed.astro'
+
 
 On-policy distillation is a highly effective strategy for compressing LLMs, as recently highlighted by [Thinking Machines' excellent blog post.](https://thinkingmachines.ai/blog/on-policy-distillation/) The technique trains a small "student" model by transferring knowledge from a high-performing "teacher" model's probability distribution. This allows the student to emulate the teacher's task performance, while significantly reducing size and latency.
 
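The article paragraph in this hunk describes on-policy distillation at a high level. As a minimal sketch (not part of the article or this commit), the core objective is commonly a reverse KL divergence between the student and teacher distributions, evaluated on tokens the student itself generated. The function name and toy distributions below are invented for illustration; real training operates on full-vocabulary logits.

```python
import math

def reverse_kl(student_probs, teacher_probs):
    """Per-token reverse KL divergence KL(student || teacher).

    On-policy distillation samples tokens from the *student* and scores
    them under the teacher, so the expectation is taken over the
    student's own distribution (hence "reverse" KL).
    Toy distributions stand in for per-token vocabulary probabilities.
    """
    return sum(
        s * math.log(s / t)
        for s, t in zip(student_probs, teacher_probs)
        if s > 0  # terms with zero student mass contribute nothing
    )

teacher = [0.7, 0.2, 0.1]
aligned_student = [0.7, 0.2, 0.1]   # matches the teacher exactly
drifted_student = [0.1, 0.2, 0.7]   # disagrees with the teacher

print(reverse_kl(aligned_student, teacher))      # 0.0
print(reverse_kl(drifted_student, teacher) > 0)  # True
```

A matching student incurs zero loss, while a drifted one is penalized in proportion to how much probability it puts where the teacher puts little.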