Qwen3-Coder-30B-A3B-Instruct-AWQ

This repository is a copy of cpatonn/Qwen3-Coder-30B-A3B-Instruct-AWQ.

Method

Quantised with vllm-project/llm-compressor, using nvidia/Llama-Nemotron-Post-Training-Dataset as calibration data and the following recipe:

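from llmcompressor.modifiers.awq import AWQModifier

recipe = [
    AWQModifier(
        # Keep the output head and the MoE gate layers unquantised.
        ignore=["lm_head", "re:.*mlp.gate$", "re:.*mlp.shared_expert_gate$"],
        scheme="W4A16",  # 4-bit weights, 16-bit activations
        targets=["Linear"],
    ),
]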
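To make the effect of the ignore list concrete, here is a minimal sketch of how such entries could be resolved against module names. It is not llm-compressor's own matching code: it assumes that entries prefixed with "re:" are regular expressions matched from the start of the module name and that all other entries must match exactly. The helper is_ignored and the module names are hypothetical, written in the style of a Qwen3 MoE checkpoint.

```python
import re

# The ignore list from the recipe above.
IGNORE = ["lm_head", "re:.*mlp.gate$", "re:.*mlp.shared_expert_gate$"]


def is_ignored(module_name, patterns=IGNORE):
    """Return True if module_name matches any entry in the ignore list.

    Assumption: "re:" entries are regular expressions matched from the
    start of the name; every other entry must equal the name exactly.
    """
    for pattern in patterns:
        if pattern.startswith("re:"):
            if re.match(pattern[3:], module_name):
                return True
        elif pattern == module_name:
            return True
    return False


# Hypothetical module names, for illustration only.
names = [
    "lm_head",
    "model.layers.0.mlp.gate",
    "model.layers.0.mlp.experts.0.gate_proj",
    "model.layers.0.self_attn.q_proj",
]

skipped = [n for n in names if is_ignored(n)]
quantised = [n for n in names if not is_ignored(n)]

print("skipped:  ", skipped)
print("quantised:", quantised)
```

Under these assumptions the trailing "$" matters: it stops the gate pattern from also catching the experts' gate_proj layers, so only lm_head and the mlp.gate layer are left unquantised while the expert and attention projections are quantised.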

Citation

If you find our work helpful, feel free to cite us.

@misc{qwen3technicalreport,
      title={Qwen3 Technical Report}, 
      author={Qwen Team},
      year={2025},
      eprint={2505.09388},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.09388}, 
}


Safetensors

Model size: 5B params

Tensor type: I64, I32, BF16
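The I32 tensor type is consistent with 4-bit weights being stored packed, eight to a 32-bit word, which may also partly explain why the reported parameter count is far below the 30B in the model name. The sketch below illustrates the idea only. The helpers pack_int4 and unpack_int4 are hypothetical, and the bit order inside each word is an assumption, not a description of the actual checkpoint layout.

```python
def pack_int4(values):
    """Pack unsigned 4-bit values (0..15) into 32-bit words, eight per word.

    Assumption: the first value occupies the lowest 4 bits of each word.
    """
    if len(values) % 8 != 0:
        raise ValueError("number of values must be a multiple of 8")
    words = []
    for start in range(0, len(values), 8):
        word = 0
        for offset, value in enumerate(values[start:start + 8]):
            if not 0 <= value <= 15:
                raise ValueError("each value must fit in 4 bits")
            word |= value << (4 * offset)
        words.append(word)
    return words


def unpack_int4(words):
    """Recover the 4-bit values from packed 32-bit words."""
    return [(word >> (4 * offset)) & 0xF for word in words for offset in range(8)]


weights = list(range(16))      # sixteen toy 4-bit weights
packed = pack_int4(weights)    # two 32-bit words
restored = unpack_int4(packed)

print(len(weights), "values ->", len(packed), "words")
```

Sixteen 4-bit values fit in two words, so a tool that counts stored tensor elements sees one eighth as many entries as there are quantised weights.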

Paper for tabnine/Qwen3-Coder-30B-A3B-Instruct-AWQ: Qwen3 Technical Report (arXiv 2505.09388, published May 14, 2025)