Update README.md
README.md CHANGED

@@ -15,7 +15,7 @@ base_model:
 This model is a compressed version of [Qwen/Qwen3-Coder-Next](https://huggingface.co/Qwen/Qwen3-Coder-Next).
 It is obtained by reducing the number of experts in each MoE layer from 512 to 384 using the REAP baseline method as described in https://bknyaz.github.io/blog/2026/moe/.
 
-**Compared to other models obtained in this collection, more coding
+**Compared to other models obtained in this collection, more coding data is used in the calibration data during pruning/merging
 to better preserve the original model's coding abilities. Specifically, the ratio between c4, math and coding data (see https://bknyaz.github.io/blog/2026/moe/) is 0.0, 0.7, 0.3.
 The calibration data used here is the same as in our [Qwen3-Coder-Next-REAM](https://huggingface.co/SamsungSAILMontreal/Qwen3-Coder-Next-REAM).**
 
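For illustration, the expert-reduction step the README describes (keeping 384 of 512 experts per MoE layer) can be sketched as below. This is a minimal, hypothetical sketch: the function name `prune_experts` and the random saliency scores are stand-ins, not the REAP implementation — REAP's actual calibration-based criterion is described in the linked blog post.

```python
import numpy as np

def prune_experts(expert_weights, saliency, keep=384):
    """Keep the `keep` highest-saliency experts from an MoE layer.

    expert_weights: array of shape (num_experts, ...) with per-expert parameters
    saliency: array of shape (num_experts,) of per-expert importance scores
              (stand-in for a calibration-based criterion such as REAP's)
    Returns the pruned weights and the indices of the retained experts.
    """
    keep_idx = np.argsort(saliency)[-keep:]  # indices of the top-`keep` experts
    keep_idx = np.sort(keep_idx)             # preserve the original expert order
    return expert_weights[keep_idx], keep_idx

# Toy example: 512 experts with 8-dim parameter vectors, pruned to 384.
rng = np.random.default_rng(0)
weights = rng.normal(size=(512, 8))
scores = rng.random(512)
pruned, kept = prune_experts(weights, scores, keep=384)
print(pruned.shape)  # (384, 8)
```

In the real pipeline the saliency scores would come from running the calibration mixture (here, 70% math / 30% coding data) through the model and measuring each expert's contribution, rather than from random numbers.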