# Qwen3-Coder-30B-A3B-Instruct · Q4_K_M GGUF

This is a Q4_K_M GGUF quantization of Qwen/Qwen3-Coder-30B-A3B-Instruct, produced from the f16 base.
| Property | Value |
|---|---|
| Base model | Qwen/Qwen3-Coder-30B-A3B-Instruct |
| Quantization | Q4_K_M |
| Format | GGUF |
| Parameters | 30B (MoE, ~3B active) |
## About the base model
Qwen3-Coder-30B-A3B-Instruct is a Mixture-of-Experts (MoE), code-focused instruction model developed by the Qwen Team at Alibaba Cloud. It has 30B total parameters, of which roughly 3B are active per token.

For full details, see the [original model page](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct).
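The "30B total / ~3B active" split comes from top-k expert routing: each MoE layer holds many expert FFNs but runs only a few per token. Below is a toy sketch of that routing step; the 128-expert / top-8 figures follow the published Qwen3 MoE configuration and are assumptions here, not values read from this repo.

```python
# Toy top-k MoE routing: only K of E expert FFNs run per token, so per-token
# compute tracks the "active" parameters rather than the 30B total.
# E=128, K=8 follows the published Qwen3 MoE configuration (an assumption,
# not read from this file); attention and embeddings remain shared.
import math
import random

E, K = 128, 8
router_logits = [random.gauss(0.0, 1.0) for _ in range(E)]

# Select the K highest-scoring experts for this token.
topk = sorted(range(E), key=lambda i: router_logits[i], reverse=True)[:K]

# Softmax over the selected logits gives the mixing weights.
exps = [math.exp(router_logits[i]) for i in topk]
weights = [v / sum(exps) for v in exps]

print("experts:", topk)
print("weights:", [round(w, 3) for w in weights])
print(f"expert FFNs active per token: {K}/{E} = {K/E:.1%}")
```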
## Usage

### llama.cpp
```bash
llama-cli \
  -m Qwen3-Coder-30B-A3B-Instruct-f16-Q4_K_M.gguf \
  --chat-template qwen3 \
  -p "Write a Python function that sorts a list of dictionaries by a given key." \
  -n 512
```
### llama-server
```bash
llama-server \
  -m Qwen3-Coder-30B-A3B-Instruct-f16-Q4_K_M.gguf \
  --chat-template qwen3 \
  --port 8080
```
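Once running, llama-server exposes an OpenAI-compatible HTTP API. A minimal Python client sketch against the port used above (`requests` assumed installed):

```python
# Query the llama-server started above through its OpenAI-compatible
# /v1/chat/completions endpoint (port 8080 as in the launch command).
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Write a Python function that reverses a string."}
        ],
        "max_tokens": 256,
    },
    timeout=300,
)
print(resp.json()["choices"][0]["message"]["content"])
```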
### Ollama (via Modelfile)
```
FROM ./Qwen3-Coder-30B-A3B-Instruct-f16-Q4_K_M.gguf
PARAMETER num_ctx 32768
# Use the Qwen3 chat template here.
TEMPLATE "{{ ... }}"
```
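### llama-cpp-python

For Python, llama-cpp-python can fetch the GGUF straight from the Hub; this combines the loader and chat-completion snippets shown on this page into one runnable example:

```python
# !pip install llama-cpp-python
from llama_cpp import Llama

# Download the quantized GGUF from the Hub (cached after the first call).
llm = Llama.from_pretrained(
    repo_id="Zoed/Qwen3-Coder-30B-A3B-Instruct",
    filename="Qwen3-Coder-30B-A3B-Instruct-f16-Q4_K_M.gguf",
)

out = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(out["choices"][0]["message"]["content"])
```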
## Quantization details
| File | Quant | Size (approx.) |
|---|---|---|
| Qwen3-Coder-30B-A3B-Instruct-f16-Q4_K_M.gguf | Q4_K_M | ~17 GB |
Q4_K_M applies 4-bit K-quant quantization to most layers, giving a good balance between file size and output quality.
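A back-of-the-envelope check of the ~17 GB figure, assuming the commonly quoted ~4.85 effective bits per weight for Q4_K_M (the exact rate varies with the tensor mix):

```python
# Rough size estimate: 30B weights at ~4.85 bits each.
# 4.85 bits/weight for Q4_K_M is an assumed average, not read from the file.
params = 30e9
bits_per_weight = 4.85

size_bytes = params * bits_per_weight / 8
print(f"~{size_bytes / 1e9:.1f} GB, ~{size_bytes / 2**30:.1f} GiB")  # ~18.2 GB ≈ 16.9 GiB
```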
## License
This quantized model is derived from Qwen/Qwen3-Coder-30B-A3B-Instruct and is released under the same Apache 2.0 License.
Per Qwen's terms, appropriate credit is given to the original authors:
> Qwen3-Coder-30B-A3B-Instruct is developed by the Qwen Team, Alibaba Cloud. Original model: https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct
## Citation
```bibtex
@misc{qwen3coder,
    title = {Qwen3-Coder},
    author = {Qwen Team},
    year = {2025},
    organization = {Alibaba Cloud},
    url = {https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct}
}
```