Add links to TRL docs
app/src/content/article.mdx
@@ -76,7 +76,7 @@ Building on Universal Logit Distillation (ULD) [@boizard2025crosstokenizerdistil
 
 Our key contributions are:
 
-- Providing an open-source implementation of on-policy distillation methods and proving they work for multiple model combinations.
+- Providing an open-source implementation of on-policy distillation methods in TRL ([GKD](https://huggingface.co/docs/trl/en/gkd_trainer) and [GOLD](https://huggingface.co/docs/trl/main/en/gold_trainer)) and proving they work for multiple model combinations.
- Extending ULD to the on-policy setting, where we sample completions from the student and align them to the teacher's distribution.
- Implementing new sequence and vocabulary alignment methods that improve distillation performance when the student and the teacher have different tokenizers.