reaperdoesntknow committed
Commit 02f9663 · verified · 1 Parent(s): a4f9fbf

Cross-link: DistilQwen collection spotlight — 2026-03-29

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -173,7 +173,7 @@ If you use TRL in your work, please cite the library:

  ## From the Convergent Intelligence Portfolio

- **[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** — Proof-weighted distillation from Qwen3-30B-A3B → 1.7B and 0.6B. Three teacher variants (Instruct, Thinking, Coder), nine models, 2,788 combined downloads. Structure beats scale.
+ **[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)** — Our only BF16 series. Proof-weighted distillation from Qwen3-30B-A3B → 1.7B and 0.6B on H100. Three teacher variants (Instruct, Thinking, Coder), nine models, 2,788 combined downloads. The rest of the portfolio proves structure beats scale on CPU. This collection shows what happens when you give the methodology real hardware.

  Top model: [Qwen3-1.7B-Coder-Distilled-SFT](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Coder-Distilled-SFT) — 508 downloads