Cross-link: DistilQwen collection spotlight — 2026-03-29
README.md
CHANGED
@@ -166,3 +166,19 @@ If you use TRL in your work, please cite the library:
 *Last updated: 2026-03-28 12:56 UTC*
+
+<!-- CIX-CROSSLINK-START -->
+
+---
+
+## From the Convergent Intelligence Portfolio
+
+**[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** — Proof-weighted distillation from Qwen3-30B-A3B → 1.7B and 0.6B. Three teacher variants (Instruct, Thinking, Coder), nine models, 2,788 combined downloads. Structure beats scale.
+
+Top model: [Qwen3-1.7B-Coder-Distilled-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT) — 508 downloads
+
+Full methodology: [Structure Over Scale (DOI: 10.57967/hf/8165)](https://doi.org/10.57967/hf/8165)
+
+*Convergent Intelligence LLC: Research Division*
+
+<!-- CIX-CROSSLINK-END -->