--- license: apache-2.0 base_model: Qwen/Qwen3-Coder-30B-A3B-Instruct base_model_relation: quantized language: - en tags: - qwen3 - qwen3-coder - code - gguf - quantized - q4_k_m pipeline_tag: text-generation --- # Qwen3-Coder-30B-A3B-Instruct ยท Q4_K_M GGUF This is a **Q4_K_M GGUF quantization** of [Qwen/Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct), produced from the f16 base. | Property | Value | |---|---| | Base model | Qwen/Qwen3-Coder-30B-A3B-Instruct | | Quantization | Q4_K_M | | Format | GGUF | | Parameters | 30B (MoE, ~3B active) | ## About the base model Qwen3-Coder-30B-A3B-Instruct is a Mixture-of-Experts (MoE) code-focused instruction model developed by [Qwen Team, Alibaba Cloud](https://qwenlm.github.io/). It features 30B total parameters with ~3B active parameters per token. For full details, see the [original model page](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct). ## Usage ### llama.cpp ```bash llama-cli \ -m Qwen3-Coder-30B-A3B-Instruct-f16-Q4_K_M.gguf \ --chat-template qwen3 \ -p "Write a Python function that sorts a list of dictionaries by a given key." \ -n 512 ``` ### llama-server ```bash llama-server \ -m Qwen3-Coder-30B-A3B-Instruct-f16-Q4_K_M.gguf \ --chat-template qwen3 \ --port 8080 ``` ### Ollama (via Modelfile) ``` FROM ./Qwen3-Coder-30B-A3B-Instruct-f16-Q4_K_M.gguf PARAMETER num_ctx 32768 TEMPLATE "{{ ... }}" # use Qwen3 chat template ``` ## Quantization details | File | Quant | Size (approx.) | |---|---|---| | `Qwen3-Coder-30B-A3B-Instruct-f16-Q4_K_M.gguf` | Q4_K_M | ~17 GB | **Q4_K_M** uses 4-bit quantization with K-quant method on most layers, providing a good balance between size and quality. ## License This quantized model is derived from [Qwen/Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct) and is released under the same [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0). Per Qwen's terms, appropriate credit is given to the original authors: > Qwen3-Coder-30B-A3B-Instruct is developed by Qwen Team, Alibaba Cloud. > Original model: https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct ## Citation ```bibtex @misc{qwen3coder, title = {Qwen3-Coder}, author = {Qwen Team}, year = {2025}, organization = {Alibaba Cloud}, url = {https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct} } ```