Create model card
README.md CHANGED

```diff
@@ -25,15 +25,15 @@ tags:
 
 <!-- Provide a longer summary of what this model is. -->
 
-#
+# Model Card for Qwen3-Coder-Next-8bit-g128
 
-
+Quantized Qwen/Qwen3-Coder-Next using mlx-lm to 8-bit with group_size 128 for main weights and fine-grained group_size 64 for MoE weights, with the aim of maximum accuracy for 8-bit quantization.
 
-##
+## Updated Evaluation Results (February 13, 2026)
 
-
+Comprehensive evaluation results from testing with mlx_lm.evaluate on mmlu_pro (200 questions per domain, num_shots=1, temp=1.0, top_p=0.95, top_k=40, seed=123):
 
-### Direct Comparison Summary
+### Direct Comparison Summary (8-bit g64 vs g128)
 
 | Domain | 8-bit g64 | 8-bit g128 (this model) | Difference |
 |--------|-----------|-------------------------|------------|
```
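The group_size knob in the card above controls how many consecutive weights share one quantization scale: smaller groups fit their scale to a narrower value range and so reconstruct weights more closely at the same bit width, which is why the MoE weights get the finer g64 treatment. A minimal NumPy round-trip sketch of that effect (illustrative only — a simplified affine scheme, not mlx-lm's actual quantization kernel):

```python
import numpy as np

def dequantize_roundtrip(w, group_size, bits=8):
    """Affine per-group quantization round-trip: every `group_size`
    consecutive weights share one scale and offset (a simplified stand-in
    for grouped quantization, not mlx-lm's real kernel)."""
    g = w.reshape(-1, group_size)
    lo = g.min(axis=1, keepdims=True)
    hi = g.max(axis=1, keepdims=True)
    scale = np.maximum((hi - lo) / (2**bits - 1), 1e-12)  # guard zero range
    q = np.round((g - lo) / scale)  # integer codes in [0, 2**bits - 1]
    return (q * scale + lo).reshape(w.shape)

rng = np.random.default_rng(0)
w = rng.normal(size=4096)
err_g64 = np.mean(np.abs(dequantize_roundtrip(w, 64) - w))
err_g128 = np.mean(np.abs(dequantize_roundtrip(w, 128) - w))
# Each g64 group's range is a subset of its enclosing g128 group's range,
# so its scale is at least as fine and the mean error is lower.
assert err_g64 < err_g128
```

The same trade-off explains the comparison table: g64 spends more memory on scales in exchange for accuracy, and the per-domain differences quantify how much that matters at 8 bits.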
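The evaluation settings quoted in the card map onto mlx-lm's evaluate entry point roughly as follows. This is a sketch, not the authors' exact command: the flag names are assumptions based on typical mlx-lm releases, the model path is a placeholder, and the sampling settings (temp, top_p, top_k) may be configured elsewhere rather than as CLI flags — check `mlx_lm.evaluate --help` for your installed version.

```shell
# Hypothetical invocation matching the card's stated settings;
# verify flag names against your mlx-lm version before running.
mlx_lm.evaluate \
  --model <path-to-this-model> \
  --tasks mmlu_pro \
  --num-shots 1 \
  --limit 200 \
  --seed 123
```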
|