Zen5 Coder

Code-specialized member of the Zen5 family. 80B-parameter sparse MoE tuned for repo-scale code understanding, agentic refactoring, and tool-use coding loops.

Part of the canonical Zen5 ladder:

| SKU | Hardware fit | This repo |
|---|---|---|
| zen5-flash | anything (4 GB VRAM) | zen-5-flash-gguf |
| zen5-mini | 32 GB | zen-5-mini-gguf |
| zen5 (default) | 24 GB+ VRAM (Q4_K) | zen-5-gguf |
| zen5-coder | 48 GB+ VRAM (Q4_K_M) | ← you are here |
| zen5-pro | Mac M4 Max / DGX Spark / H100 80GB | zen-5-pro-gguf |
| zen5-max | Mac Studio M3 Ultra 512GB / 8x H100 | zen-5-max-gguf |
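
As a rough sanity check on the 48 GB+ figure for zen5-coder, weight memory can be approximated as parameters times average bits per weight, divided by 8. The sketch below assumes about 4.85 bits per weight for Q4_K_M, which is an approximation rather than a number from this repo; the real average varies by model and llama.cpp version, and the KV cache and runtime buffers add to it. `estimate_gb` is a hypothetical helper name.

```shell
# Rough weight-memory estimate in GB (decimal): params_in_billions * bits_per_weight / 8.
# ASSUMPTION: ~4.85 bits per weight for Q4_K_M; treat it as a ballpark, not a spec.
estimate_gb() {
  # $1 = parameter count in billions, $2 = average bits per weight
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

# 80B parameters at the assumed Q4_K_M average
estimate_gb 80 4.85
```

Under that assumption the weights alone come to roughly 48.5 GB, which is consistent with the 48 GB+ row; budget extra headroom for context length.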

Weights

A first-party zenlm GGUF mirror is staged for this repo but has not landed yet. Until it does, either use the hosted zen5-coder endpoint (see Run below) or download a community 80B-class coder GGUF at Q4_K_M quantization into a local gguf/ directory.

Run

Hosted: available through the Hanzo gateway (api.hanzo.ai) under the model name zen5-coder. This is the preferred option until the first-party GGUF mirror lands.

Local: with llama.cpp or a compatible runtime, once you have a GGUF in gguf/:

# Pick the first Q4_K_M GGUF found in gguf/ and run a one-off prompt.
MAIN=$(ls gguf/*Q4_K_M*.gguf | head -n 1)
llama-cli -m "$MAIN" -p "Refactor this Python function to use async/await."
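
If gguf/ is empty, the file-selection step above leaves MAIN unset and llama-cli fails with a less obvious error. A minimal sketch of a more defensive selection follows; `pick_gguf` is a hypothetical helper name, not part of llama.cpp, and the gguf/ layout is the one described above.

```shell
# Print the first Q4_K_M GGUF in a directory, or fail with a clear message.
# pick_gguf is a hypothetical helper for illustration only.
pick_gguf() {
  dir="${1:-gguf}"
  for f in "$dir"/*Q4_K_M*.gguf; do
    # When the glob matches nothing, the shell passes the pattern through
    # unchanged; the -e test filters that case out.
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  echo "no Q4_K_M GGUF found in $dir/" >&2
  return 1
}

# Usage (commented out so the sketch can be sourced without llama.cpp installed):
# MAIN=$(pick_gguf gguf) && llama-cli -m "$MAIN" -p "Refactor this Python function to use async/await."
```

Chaining with `&&` means llama-cli only runs when a model file was actually found.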

Acknowledgements

Built on Qwen/Qwen3-Coder-Next (Apache-2.0, 80B sparse MoE). Refusal-direction-orthogonalized weights from huihui-ai. GGUF mirror by ymsf. Mirrored here for the Zen5 canonical distribution when storage permits.
