Zen Designer GGUF: 235B Vision-Language Model (Abliterated)
235B MoE | Vision-Language | GGUF Quantized | Abliterated
A GGUF-quantized, abliterated version of Zen Designer, the 235B flagship vision-language model from Zen LM. It supports images, video, documents, charts, GUIs, and spatial reasoning with a 256K-token context window.
Model Specifications
| Attribute | Value |
|---|---|
| Parameters | 235B total / 22B active (MoE) |
| Architecture | Vision-language transformer (Mixture of Experts) |
| Context Window | 256K tokens |
| Modalities | Text, Images, Video, Documents |
| OCR Languages | 32 scripts |
| License | Apache 2.0 |
Available Formats
| Format | Size | Description | Recommended Use |
|---|---|---|---|
| Q2_K (split) | ~60 GB | 2-bit quantization, 15-part split | Servers with 64+ GB RAM, maximum scale |
| Q4_K_M | ~142 GB | 4-bit quantization, single or split | Best quality/size tradeoff for local inference |
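As a rough sanity check on the sizes above, the average number of bits stored per parameter can be estimated from the file size and the 235B total parameter count. This is a minimal sketch, assuming decimal gigabytes (1e9 bytes) and ignoring file metadata and the separate `mmproj` file; the helper name `bits_per_weight` is illustrative, not part of any tool.

```python
def bits_per_weight(file_size_gb: float, params_billions: float) -> float:
    """Average bits stored per parameter, assuming decimal GB (1e9 bytes)."""
    return file_size_gb * 1e9 * 8 / (params_billions * 1e9)


# Approximate sizes taken from the table above; 235 is the total parameter count in billions.
for name, size_gb in [("Q2_K", 60), ("Q4_K_M", 142)]:
    print(f"{name}: ~{bits_per_weight(size_gb, 235):.2f} bits per weight")
```

With the table's approximate figures this gives about 2.04 bits per weight for Q2_K and about 4.83 for Q4_K_M. Because the listed sizes are rounded, treat these as ballpark values only.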
Quick Start
llama.cpp
```bash
# Download a split (Q2_K example; replace with the Q4_K_M filename as appropriate)
# Then run:
llama-cli \
  --model zen-designer-235b-a22b-instruct-abliterated-Q2_K-00001-of-00015.gguf \
  --mmproj mmproj-zen-designer-235b-a22b-instruct-abliterated-f16.gguf \
  --image your_image.jpg \
  --prompt "Describe this image in detail." \
  -n 1024 \
  --ctx-size 8192 \
  --temp 0.7
```
For multi-part files, place all split parts in the same directory and point --model to part 00001.
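Before launching, it can help to confirm that every part of a split is actually in the directory. The sketch below derives the expected filenames from the first part and reports any that are missing. It assumes the `-NNNNN-of-NNNNN.gguf` naming pattern seen in the Quick Start filename; the function names `expected_parts` and `missing_parts` are hypothetical helpers, not part of llama.cpp.

```python
import re
from pathlib import Path

# Matches split names such as "model-00001-of-00015.gguf" (pattern assumed from the Quick Start).
SPLIT_RE = re.compile(r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def expected_parts(first_part: str) -> list[str]:
    """List every filename in the split that `first_part` belongs to."""
    match = SPLIT_RE.match(first_part)
    if match is None:
        raise ValueError(f"not a split GGUF filename: {first_part}")
    total = int(match["total"])
    return [f"{match['prefix']}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]


def missing_parts(model_dir: str, first_part: str) -> list[str]:
    """Return the split parts that are absent from `model_dir`."""
    directory = Path(model_dir)
    return [name for name in expected_parts(first_part) if not (directory / name).is_file()]


if __name__ == "__main__":
    first = "zen-designer-235b-a22b-instruct-abliterated-Q2_K-00001-of-00015.gguf"
    missing = missing_parts(".", first)
    print("all parts present" if not missing else f"missing: {missing}")
```

Run it from the directory that holds the downloaded files. An empty result means all parts named by the first file are present.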
Vision Tasks
Zen Designer handles a broad range of visual inputs:
- Image analysis and description
- Document and PDF parsing
- Chart and table extraction
- GUI navigation and screen understanding
- Video understanding with temporal reasoning
- Bounding box and spatial grounding
Abliteration
This model has been abliterated. Abliteration is a technique that removes refusal behaviors encoded in the model weights without fine-tuning. It works by identifying the refusal direction in the model's residual stream and projecting that direction out of the weight matrices.
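The projection step can be illustrated on a toy matrix. For a unit direction r and a weight matrix W whose rows index residual-stream dimensions, the projected matrix is W' = W - r(rᵀW), so nothing W writes has a component along r. This is a minimal pure-Python sketch, not the procedure used for this model: the card does not say how the direction was estimated or which matrices were modified, and `remove_direction` plus the example values are illustrative assumptions.

```python
import math


def remove_direction(weight, direction):
    """Return W' = W - r (r^T W), where r is `direction` normalized to unit length.

    `weight` is a list of rows; each row corresponds to one output
    (residual-stream) dimension, so `direction` must have len(weight) entries.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    r = [x / norm for x in direction]
    rows, cols = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along r.
    coeffs = [sum(r[i] * weight[i][j] for i in range(rows)) for j in range(cols)]
    # Subtract that component from every column.
    return [[weight[i][j] - r[i] * coeffs[j] for j in range(cols)] for i in range(rows)]


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # toy 3x2 weight matrix
    r = [0.0, 1.0, 0.0]                        # hypothetical "refusal direction"
    print(remove_direction(W, r))
```

In this toy case the direction is aligned with the second output dimension, so the second row of the result is zeroed while the other rows are unchanged.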
What abliteration does:
- Removes hardcoded refusal responses
- Preserves all other capabilities and knowledge
- Does not alter factual knowledge or reasoning ability
What abliteration does not do:
- Add harmful knowledge the base model lacked
- Guarantee any specific behavior
- Replace a system prompt or application-level safety policy
Users are responsible for appropriate deployment and use of abliterated models. Apply system prompts and application-layer controls to define model behavior for your use case.
Model Family
| Model | Format | Parameters | Context |
|---|---|---|---|
| zen-designer-235b-a22b-instruct | SafeTensors | 235B / 22B active | 256K |
| zen-designer-gguf | GGUF | 235B / 22B active | 256K |
Links
Zen LM | Hanzo AI | GitHub | All Models
Part of the Zen model family (zenlm.org) by Hanzo AI (Techstars '17) and Zoo Labs Foundation (zoo.ngo).