Zen5 Coder
Code-specialized member of the Zen5 family. 80B-parameter sparse MoE tuned for repo-scale code understanding, agentic refactoring, and tool-use coding loops.
Part of the canonical Zen5 ladder:
| SKU | Hardware fit | This repo |
|---|---|---|
zen5-flash |
anything (4 GB VRAM) | zen-5-flash-gguf |
zen5-mini |
32 GB | zen-5-mini-gguf |
zen5 (default) |
24 GB+ VRAM (Q4_K) | zen-5-gguf |
zen5-coder |
48 GB+ VRAM (Q4_K_M) | ← you are here |
zen5-pro |
Mac M4 Max / DGX Spark / H100 80GB | zen-5-pro-gguf |
zen5-max |
Mac Studio M3 Ultra 512GB / 8x H100 | zen-5-max-gguf |
Weights
A first-party zenlm GGUF mirror is staged for this repo. Until it lands, the recommended path is to use the hosted zen5-coder endpoint (see below) or pull a community 80B-class coder GGUF Q4_K_M into a local gguf/ directory.
Run
Hosted via the Hanzo gateway (api.hanzo.ai) as zen5-coder — preferred until the first-party GGUF mirror lands.
Local with llama.cpp or a compatible runtime, once you have a GGUF in gguf/:
MAIN=$(ls gguf/*Q4_K_M*.gguf | head -1)
llama-cli -m "$MAIN" -p "Refactor this Python function to use async/await."
Acknowledgements
Built on Qwen/Qwen3-Coder-Next (Apache-2.0, 80B sparse MoE). Refusal-direction-orthogonalized weights from huihui-ai. GGUF mirror by ymsf. Mirrored here for the Zen5 canonical distribution when storage permits.
- Downloads last month
- -
Model tree for zenlm/zen-5-coder-gguf
Base model
Qwen/Qwen3-Coder-Next