---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
---

# Qwen3-Coder-30B-A3B-Instruct_MXFP4

This checkpoint is a variant of [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct) in which the expert weights have been quantized to the [MXFP4 format](https://huggingface.co/blog/faster-transformers#what-is-mxfp4), similar to [gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b) and [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b). The weights were quantized with the `downcast_to_mxfp` function from [triton-kernels](https://github.com/triton-lang/triton/blob/main/python/triton_kernels/triton_kernels/numerics_details/mxfp.py). The checkpoint may show a small drop in accuracy, but is **~68% smaller** than the original BF16 checkpoint.

## Accuracy Comparison

| Model | GSM8K (strict-match) | GSM8K (flexible-extract) |
|-------|----------------------|--------------------------|
| **Qwen3-Coder-30B-A3B-Instruct (BF16)** | 90.67% ± 0.80% | 89.92% ± 0.83% |
| **Qwen3-Coder-30B-A3B-Instruct_MXFP4** | 89.76% ± 0.83% | 88.70% ± 0.87% |

## Checkpoint Size

| Model | Size | Reduction |
|-------|------|-----------|
| **Qwen3-Coder-30B-A3B-Instruct (BF16)** | 57 GB | - |
| **Qwen3-Coder-30B-A3B-Instruct_MXFP4** | 18 GB | **68% smaller** |
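
## How MXFP4 works

The sketch below illustrates the idea behind the MXFP4 format: each group of 32 values is stored as 4-bit e2m1 elements (representable magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6) plus one shared power-of-two scale. This is a simplified, pure-Python illustration of the format, not the actual `downcast_to_mxfp` kernel from triton-kernels; the scale-selection rule here (round up so the largest value fits without clipping) is one reasonable choice, and real implementations may differ.

```python
import math

# Representable magnitudes of the 4-bit e2m1 element type.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block_mxfp4(block):
    """Quantize one block of (up to) 32 floats: returns a shared
    power-of-two exponent plus one signed e2m1 value per input."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    # Pick a power-of-two scale so the largest magnitude maps to at
    # most 6.0, the top of the e2m1 range (illustrative choice).
    exp = math.ceil(math.log2(amax / 6.0))
    scale = 2.0 ** exp
    codes = []
    for x in block:
        mag = min(abs(x) / scale, 6.0)
        # Round to the nearest representable e2m1 magnitude.
        q = min(E2M1_GRID, key=lambda g: abs(g - mag))
        codes.append(math.copysign(q, x))
    return exp, codes

def dequantize_block_mxfp4(exp, codes):
    scale = 2.0 ** exp
    return [c * scale for c in codes]
```

This storage scheme costs 4 bits per value plus 8 bits of shared scale per 32-value block, i.e. about 4.25 bits per weight versus 16 bits for BF16; since only the expert weights are quantized and the rest of the model stays in BF16, the whole checkpoint ends up roughly 68% smaller rather than ~73%.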