Zen5 Coder

Code-specialized member of the Zen5 family. 80B-parameter sparse MoE tuned for repo-scale code understanding, agentic refactoring, and tool-use coding loops.

Part of the canonical Zen5 ladder:

SKU	Hardware fit	This repo
`zen5-flash`	anything (4 GB VRAM)	zen-5-flash-gguf
`zen5-mini`	32 GB	zen-5-mini-gguf
`zen5` (default)	24 GB+ VRAM (Q4_K)	zen-5-gguf
`zen5-coder`	48 GB+ VRAM (Q4_K_M)	← you are here
`zen5-pro`	Mac M4 Max / DGX Spark / H100 80GB	zen-5-pro-gguf
`zen5-max`	Mac Studio M3 Ultra 512GB / 8x H100	zen-5-max-gguf

Weights

A first-party zenlm GGUF mirror is staged for this repo. Until it lands, the recommended path is to use the hosted zen5-coder endpoint (see below) or pull a community 80B-class coder GGUF Q4_K_M into a local gguf/ directory.

Run

Hosted via the Hanzo gateway (api.hanzo.ai) as zen5-coder — preferred until the first-party GGUF mirror lands.

Local with llama.cpp or a compatible runtime, once you have a GGUF in gguf/:

MAIN=$(ls gguf/*Q4_K_M*.gguf | head -1)
llama-cli -m "$MAIN" -p "Refactor this Python function to use async/await."

Acknowledgements

Built on Qwen/Qwen3-Coder-Next (Apache-2.0, 80B sparse MoE). Refusal-direction-orthogonalized weights from huihui-ai. GGUF mirror by ymsf. Mirrored here for the Zen5 canonical distribution when storage permits.

Downloads last month: -

Safetensors

Model size

80B params

Tensor type

BF16

Model tree for zenlm/zen-5-coder-gguf

Base model

Qwen/Qwen3-Coder-Next

Finetuned

(35)

this model

Collection including zenlm/zen-5-coder-gguf

Zen5 Chat Ladder

Collection

Canonical Zen5 lineup, smallest to largest. • 6 items • Updated about 11 hours ago