Cross-link: DistilQwen collection spotlight — 2026-03-29
README.md
@@ -115,7 +115,7 @@ This model was designed and built from Discrepancy Analysis, paper to be publish
 
 ## From the Convergent Intelligence Portfolio
 
-**[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** — Proof-weighted distillation from Qwen3-30B-A3B → 1.7B and 0.6B. Three teacher variants (Instruct, Thinking, Coder), nine models, 2,788 combined downloads.
+**[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** — Our only BF16 series. Proof-weighted distillation from Qwen3-30B-A3B → 1.7B and 0.6B on H100. Three teacher variants (Instruct, Thinking, Coder), nine models, 2,788 combined downloads. The rest of the portfolio proves structure beats scale on CPU. This collection shows what happens when you give the methodology real hardware.
 
 Top model: [Qwen3-1.7B-Coder-Distilled-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT) — 508 downloads
 