Update README.md
README.md CHANGED

@@ -15,7 +15,7 @@ base_model:
 This model is a compressed version of [Qwen/Qwen3-Coder-Next](https://huggingface.co/Qwen/Qwen3-Coder-Next).
 It is obtained by reducing the number of experts in each MoE layer from 512 to 384 using the REAP baseline method as described in https://bknyaz.github.io/blog/2026/moe/.
 
-**Compared to other models obtained in this collection, more coding
+**Compared to other models obtained in this collection, more coding data is used in the calibration data during pruning/merging
 to better preserve the original model's coding abilities. Specifically, the ratio between c4, math and coding data (see https://bknyaz.github.io/blog/2026/moe/) is 0.0, 0.7, 0.3.
 The calibration data used here is the same as in our [Qwen3-Coder-Next-REAM](https://huggingface.co/SamsungSAILMontreal/Qwen3-Coder-Next-REAM).**
 
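For illustration, the expert-reduction step the README describes (keeping 384 of 512 experts per MoE layer) can be sketched as below. This is a minimal, hypothetical sketch: the function name `prune_experts` and the random saliency scores are stand-ins, not the REAP implementation — REAP's actual calibration-based criterion is described in the linked blog post.

```python
import numpy as np

def prune_experts(expert_weights, saliency, keep=384):
    """Keep the `keep` highest-saliency experts from an MoE layer.

    expert_weights: array of shape (num_experts, ...) with per-expert parameters
    saliency: array of shape (num_experts,) of per-expert importance scores
              (stand-in for a calibration-based criterion such as REAP's)
    Returns the pruned weights and the indices of the retained experts.
    """
    keep_idx = np.argsort(saliency)[-keep:]  # indices of the top-`keep` experts
    keep_idx = np.sort(keep_idx)             # preserve the original expert order
    return expert_weights[keep_idx], keep_idx

# Toy example: 512 experts with 8-dim parameter vectors, pruned to 384.
rng = np.random.default_rng(0)
weights = rng.normal(size=(512, 8))
scores = rng.random(512)
pruned, kept = prune_experts(weights, scores, keep=384)
print(pruned.shape)  # (384, 8)
```

In the real pipeline the saliency scores would come from running the calibration mixture (here, 70% math / 30% coding data) through the model and measuring each expert's contribution, rather than from random numbers.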