Qwen3-Coder-Next-mxfp4-mlx

Qwen3-Coder-Next comfortably outperforms the previous Next models.

The mxfp4 is head and shoulders above the old Next q8, establishing itself as the highest-performing quant so far.

Brainwaves

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
qx86n-hi 0.518  0.710  0.882  0.626  0.416  0.745  0.601
qx86n    0.515  0.712  0.881  0.627  0.414  0.744  0.590
mxfp8    0.514  0.709  0.884  0.639  0.420  0.748  0.611
mxfp4    0.528  0.713  0.880  0.630  0.428  0.744  0.619
qx64n-hi 0.527  0.707  0.880  0.631  0.426  0.744  0.580
qx64n    0.511  0.703  0.881  0.631  0.420  0.746  0.598
qx53n    0.520  0.714  0.872  0.630  0.438  0.744  0.599

Qwen3-Next-80B-A3B-Instruct
q8       0.402  0.494  0.896  0.540  0.420  0.754  0.554

Qwen3-Next-80B-A3B-Thinking
q8       0.409  0.459  0.648  0.655  0.376  0.783  0.692

         Size  Perplexity
qx86n-hi 82G   4.484 ± 0.033
qx86n    73G   4.487 ± 0.033
mxfp8    82G   4.537 ± 0.033
mxfp4    42G   4.676 ± 0.035
qx64n-hi 54G   4.528 ± 0.033
qx64n    53G   4.525 ± 0.033
qx53n    43G   4.750 ± 0.036
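For reference, the Perplexity column is the exponential of the mean negative log-likelihood per token over the eval set. A minimal sketch of the arithmetic, using made-up log-probabilities (the actual eval tokens are not reproduced here):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over evaluated tokens."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Example with hypothetical per-token log-probabilities (natural log):
print(round(perplexity([-1.2, -0.8, -2.0, -1.5]), 3))  # → 3.955
```

Lower is better: a smaller mean NLL means the model assigns higher probability to the reference text, which is why the 42G mxfp4 at 4.676 sits so close to the 82G quants.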

The Deckard (qx) formula for Next was used unchanged from the previous Next series.

Group size 32 brings very little benefit at high quants for Coder-Next: it did not add much to qx86n-hi (which is not being uploaded, due to space constraints).

At low quants, however, mxfp4 and qx64n-hi show the highest combined arc, openbookqa, hellaswag, and winogrande scores, even against the larger quants.

The qx53n still holds up: it has slightly better openbookqa than the mxfp4, enough to matter in some use cases.

The mxfp8 seems unbeatable in speed, and its metrics are excellent: the highest boolq (logic), hellaswag, and piqa of the set.
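The "combined" reading above is easy to reproduce from the Brainwaves table by averaging the four columns in question (column order in the table: arc, arc/e, boolq, hswag, obkqa, piqa, wino). A quick sketch:

```python
# Rows copied from the Brainwaves table above
scores = {
    "mxfp4":    [0.528, 0.713, 0.880, 0.630, 0.428, 0.744, 0.619],
    "qx64n-hi": [0.527, 0.707, 0.880, 0.631, 0.426, 0.744, 0.580],
    "mxfp8":    [0.514, 0.709, 0.884, 0.639, 0.420, 0.748, 0.611],
}

# Combined score over arc, hellaswag, openbookqa, winogrande
# (indices 0, 3, 4, 6 in the column order above)
def combined(row, idx=(0, 3, 4, 6)):
    return sum(row[i] for i in idx) / len(idx)

for name, row in scores.items():
    print(f"{name}: {combined(row):.4f}")
```

An unweighted mean is an assumption here; any similar aggregate tells the same story, with mxfp4 on top despite being the smallest of the three.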


Abliterated and REAP models

I benchmarked mxfp4/mxfp8 because they are the most stable quants, with very little loss from full precision.

       arc    arc/e  boolq  hswag  obkqa  piqa   wino

Qwen3-Coder-Next
mxfp8  0.514  0.709  0.884  0.639  0.420  0.748  0.611

Huihui-Qwen3-Coder-Next-abliterated
mxfp8  0.488  0.681  0.871  0.628  0.404  0.753  0.581

lovedheart/Qwen3-Coder-Next-REAP-40B-A3B
mxfp8  0.390  0.508  0.610  0.532  0.354  0.665  0.577

Perplexity

Huihui    mxfp8  4.817 ± 0.036
Huihui    mxfp4  4.946 ± 0.037

REAP-40B  mxfp8  11.127 ± 0.103
REAP-40B  mxfp4  11.479 ± 0.107

REAP-48B  mxfp8  9.489 ± 0.085
REAP-48B  mxfp4  9.676 ± 0.087

The REAP models seem much more cheerful than the original, but they lose a lot of arc and boolq, which shows up as heavy hallucination in the output.


Nightmedia models

Here are some Brainwaves at qx86-hi for the Nightmedia 30B-A3B Elements, to give an idea of how much better Next could get.

These tests have nothing to do with what the model knows, but with how well it thinks with what it knows.

           arc    arc/e  boolq  hswag  obkqa  piqa   wino
Element4   0.514  0.617  0.846  0.769  0.442  0.801  0.731
Element5   0.560  0.709  0.883  0.756  0.448  0.807  0.713
Element6   0.568  0.737  0.880  0.760  0.450  0.803  0.714
Element7   0.578  0.750  0.883  0.742  0.478  0.804  0.684

So cognitively, Next has risen to roughly Element5 level.

And then there is the curious case of the Qwen3-4B-Engineer3x-qx86-hi-mlx.

           arc    arc/e  boolq  hswag  obkqa  piqa   wino
qx86-hi    0.615  0.835  0.852  0.745  0.420  0.780  0.704

We have other models in this range :)

-G

This model Qwen3-Coder-Next-mxfp4-mlx was converted to MLX format from Qwen/Qwen3-Coder-Next using mlx-lm version 0.30.6.

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

# Load the quantized model from the Hugging Face hub
model, tokenizer = load("nightmedia/Qwen3-Coder-Next-mxfp4-mlx")

prompt = "hello"

# Wrap the prompt in the model's chat template, if it has one
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)