BennyDaBall committed 3 days ago

Commit 496b168 (verified) · 1 Parent(s): 9aee4d8

Add files using upload-large-folder tool

Files changed (3)

HASHES.sha256 +1 -8
README.md +17 -23
RELEASE_MANIFEST.json +6 -102

HASHES.sha256 CHANGED

@@ -6,14 +6,7 @@ D8AD1C43FD8C76F1EBDAD11C85D494A474F9CFE9F83AF5F72F590AE3852315A2  evidence/galle
 C4D12692AE5CEFA9B7E61C2A581062F6B4B06183165EB2BCEEE9E11F26B82308  model-00001-of-00003.safetensors
 93A4CAF2F35B815178DB5CE43C9FD5E06E3EF836F5CBEEE7C690961D95DA653B  model-00002-of-00003.safetensors
 4B3EF3D52BCAD649213FD2035D94DF48CBF2FD670250EC1FD8E1748072ECDBF2  model-00003-of-00003.safetensors
-BF31DA5A1F64F1D7F9AF7C692C82296ED9B2AD59076588BF297E446D6FF54C1C  README.md
+AF0D74388A53EF4A9B37CF98B05922F1D5D3A888C1C6BBB95D883D89AC423760  README.md
 BE75606093DB2094D7CD20F3C2F385C212750648BD6EA4FB2BF507A6A4C55506  tokenizer.json
 154E5FF1E7C152D964EDF30DA854EA62465C767719AC8E97E58BABF2D4FA9079  tokenizer_config.json
 34126E2486E389F28C11693C2E51641199FB5B53E3E7D6BFA75A6E967C11D3CF  V6_SYSTEM_PROMPT.md
-20DAB6305B76B28808FAD740C7107878DEEC63688E1B318F7BB3A7F707220B0D  Z-Image-Engineer-V6-F16.gguf
-A39695B6714FC4A0A86965F5B2FB8B0CBEF774165EEC8FB9B2379FBEDD86838A  Z-Image-Engineer-V6-MXFP4.gguf
-E3F493D971677BA181F67C888AD41E25FD34448BF7EEA03A84F4114EE021B9E3  Z-Image-Engineer-V6-Q3_K_M.gguf
-D666E619EDB2D6DCF2DF013540B22E2592C4FBADB9007B3FB89D4BBE0C4C7C67  Z-Image-Engineer-V6-Q4_K_M.gguf
-0FAB79F032AA34BAAC8607FF8BA720DFB95A0D9A44026DE79288F3FD25A66A05  Z-Image-Engineer-V6-Q5_K_M.gguf
-A27D6723816462EA1368093A76E9013E996BD4B731EF87327334D50D6DD9534C  Z-Image-Engineer-V6-Q6_K.gguf
-DC4F5476A0F804A7DB73EDA164C0503CDA93858F3EABDE9EA36C68EEDCBA306C  Z-Image-Engineer-V6-Q8_0.gguf
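
The `HASHES.sha256` entries above follow a `<hex digest>  <relative path>` layout, so a downloaded copy of the repo can be checked locally. The following is a minimal sketch, not the author's tooling. It assumes every line uses that layout, that paths are relative to the repo root, and that the file is UTF-8 (with or without a BOM). The names `parse_hash_line`, `sha256_of`, and `verify_tree` are illustrative and do not come from the repo.

```python
import hashlib
from pathlib import Path


def parse_hash_line(line: str) -> tuple[str, str]:
    """Split one manifest line into (uppercase hex digest, relative path)."""
    digest, rel_path = line.strip().split(maxsplit=1)
    return digest.upper(), rel_path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-GiB shards are never loaded whole."""
    hasher = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest().upper()


def verify_tree(root: Path, manifest_name: str = "HASHES.sha256") -> dict[str, bool]:
    """Return {relative path: True if the file exists and its digest matches}."""
    results: dict[str, bool] = {}
    # "utf-8-sig" is an assumption: it tolerates a BOM if the file has one.
    text = (root / manifest_name).read_text(encoding="utf-8-sig")
    for line in text.splitlines():
        if not line.strip():
            continue
        expected, rel_path = parse_hash_line(line)
        target = root / rel_path
        results[rel_path] = target.is_file() and sha256_of(target) == expected
    return results
```

Calling `verify_tree(Path("path/to/local/clone"))` returns a mapping you can scan for `False` entries. A missing file is reported as a mismatch rather than raising an error.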

README.md CHANGED

@@ -13,7 +13,6 @@ tags:
   - z-image
   - z-image-turbo
   - qwen3
-  - gguf
   - text-encoder
   - comfyui
   - lm-studio
@@ -31,7 +30,7 @@ tags:
 | **Base Model** | `Tongyi-MAI/Z-Image-Turbo` |
 | **Library** | `transformers` |
 | **Pipeline Tag** | `text-generation` |
-| **Format** | GGUF, HF Safetensors |
+| **Format** | HF Safetensors |
 ---
@@ -39,7 +38,7 @@ The **Z-Engineer** returns, fully rebuilt around the **SMART DoRA** training sys
 Yes, we jump from V4 to V6. Unlike the usual guy math, this one actually brought the extra two inches.
-**Z-Image-Engineer V6** is a fine-tuned 4B Qwen text encoder (`Tongyi-MAI/Z-Image-Turbo`) optimized for dual-role performance: a local prompt-enhancement model for LM Studio, and a direct drop-in replacement text encoder for ComfyUI.
+**Z-Image-Engineer V6** is a fine-tuned 4B Qwen text encoder (`Tongyi-MAI/Z-Image-Turbo`) optimized for dual-role performance: a local prompt-enhancement model for LM Studio, and a merged HF text encoder for Z-Image workflows.
 ![Z-Image-Engineer V6 simple A/B with rewrites](evidence/gallery_z_image_engineer_v6_simple_ab_with_rewrites_CONTACT.png)
@@ -49,12 +48,12 @@ Yes, we jump from V4 to V6. Unlike the usual guy math, this one actually brought
 V6 transforms minimal seed prompts into rich, highly structured visual narratives. It adds explicit scene composition, lighting direction, material texture, and depth separation while stripping out empty prompt sludge like *"8k, masterpiece, trending on ArtStation."*
-It can also be used directly as a Z-Image text encoder. Drop the GGUF into ComfyUI, load it with `CLIPLoaderGGUF`, set the type to `lumina2`, and compare it against the stock `qwen_3_4b.safetensors`.
+It can also be used directly as a Z-Image text encoder. This repo contains the merged HF safetensors. The GGUF quantized release lives in the companion repo: [Z-Image-Engineer-V6-GGUF](https://huggingface.co/BennyDaBall/Z-Image-Engineer-V6-GGUF).
 ### Key Use Cases
 - **Prompt Enhancement:** Upgrade simple concepts into descriptive, high-fidelity visual prompts locally.
-- **Text Encoder Swap:** Replace the stock Z-Image Qwen text encoder in ComfyUI to generate different conditioning from the same seed.
+- **Text Encoder Swap:** Replace the stock Z-Image Qwen text encoder to generate different conditioning from the same seed.
 - **Hybrid Mode:** Use V6 to rewrite your prompt, then use V6 again to encode it. It writes the scene and drives the image model.
 - **Private Local Workflow:** Built for LM Studio, ComfyUI, and `llama.cpp`. No API logs, no external telemetry.
@@ -89,7 +88,7 @@ V6 was not a simple one-and-done training run. The final architecture is a blend
 ### LM Studio: Prompt Enhancement
-Download your preferred GGUF quant, load the model, and prompt it directly. No complex system prompt is required.
+Use this merged HF release directly where supported, or download a GGUF quant from [Z-Image-Engineer-V6-GGUF](https://huggingface.co/BennyDaBall/Z-Image-Engineer-V6-GGUF) for LM Studio. No complex system prompt is required.
 ```text
 Enhance this image prompt for Z-Image Turbo: a unicorn
@@ -99,10 +98,11 @@ The comparison examples were generated from direct LM Studio user requests like
 ### ComfyUI: Direct Encoder Swap
-1. Place the GGUF file into `ComfyUI/models/text_encoders/`.
-2. Add a `CLIPLoaderGGUF` node.
-3. Set model type to `lumina2`.
-4. Use it where the stock Z-Image Qwen text encoder would normally go.
+1. Download a GGUF quant from [Z-Image-Engineer-V6-GGUF](https://huggingface.co/BennyDaBall/Z-Image-Engineer-V6-GGUF).
+2. Place the GGUF file into `ComfyUI/models/text_encoders/`.
+3. Add a `CLIPLoaderGGUF` node.
+4. Set model type to `lumina2`.
+5. Use it where the stock Z-Image Qwen text encoder would normally go.
 Optional workflow repo:
@@ -115,7 +115,7 @@ The raw GGUF works without the node.
 ```text
 UNET: z_image_turbo_bf16.safetensors
 VAE: ae.safetensors
-Text Encoder: Z-Image-Engineer-V6-Q8_0.gguf
+Text Encoder: Z-Image-Engineer-V6-Q8_0.gguf from the GGUF repo
 Resolution: 1024x1024
 Steps: 8
 CFG: 1.0
@@ -136,23 +136,17 @@ Shift: 3.0
 | **Rank / Alpha / Dropout** | 64 / 64 / 0.03 |
 | **Target Modules** | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `down_proj`, `up_proj` |
 | **Refinement Stack** | Supervised Style SFT + Binary Anti-Repeat |
-| **Final Packaging** | Merged HF safetensors + full GGUF ladder |
+| **Final Packaging** | Merged HF safetensors |
 ---
 ## GGUF Quantization Ladder
-All weights are locally hashed. Full recursive validation hashes are in `HASHES.sha256`.
-| Filename | Size | Target Use Case |
-|---|---:|---|
-| `Z-Image-Engineer-V6-F16.gguf` | 7.498 GiB | Full precision reference. |
-| `Z-Image-Engineer-V6-Q8_0.gguf` | 3.986 GiB | Near-lossless; used for local A/B testing. |
-| `Z-Image-Engineer-V6-Q6_K.gguf` | 3.079 GiB | High-fidelity balanced footprint. |
-| `Z-Image-Engineer-V6-Q5_K_M.gguf` | 2.697 GiB | Daily-driver performance-to-size ratio. |
-| `Z-Image-Engineer-V6-Q4_K_M.gguf` | 2.331 GiB | Reliable 4-bit standard. |
-| `Z-Image-Engineer-V6-Q3_K_M.gguf` | 1.933 GiB | Lightweight option for tighter setups. |
-| `Z-Image-Engineer-V6-MXFP4.gguf` | 2.101 GiB | Alternative compact quantization. |
+The quantized release is separate on purpose:
+[BennyDaBall/Z-Image-Engineer-V6-GGUF](https://huggingface.co/BennyDaBall/Z-Image-Engineer-V6-GGUF)
+That repo contains the full GGUF ladder: F16, Q8_0, Q6_K, Q5_K_M, Q4_K_M, Q3_K_M, and MXFP4.
 ---
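
The first two ComfyUI steps in the updated README (download a quant from the companion repo, then place it in `ComfyUI/models/text_encoders/`) could be scripted roughly as below. This is a sketch under stated assumptions, not something the commit ships. It assumes `huggingface_hub` is installed, that the companion repo stores `Z-Image-Engineer-V6-Q8_0.gguf` at its root under exactly that name, and that `comfy_root` points at your ComfyUI folder. `encoder_dir`, `fetch_encoder`, and the `downloader` parameter are hypothetical names.

```python
from pathlib import Path
from typing import Callable, Optional

GGUF_REPO = "BennyDaBall/Z-Image-Engineer-V6-GGUF"  # companion repo named in the README
DEFAULT_QUANT = "Z-Image-Engineer-V6-Q8_0.gguf"     # assumed filename at the repo root


def encoder_dir(comfy_root: Path) -> Path:
    """Folder the README says the GGUF file should be placed in."""
    return Path(comfy_root) / "models" / "text_encoders"


def fetch_encoder(
    comfy_root: Path,
    filename: str = DEFAULT_QUANT,
    downloader: Optional[Callable[..., str]] = None,
) -> Path:
    """Download one GGUF quant straight into ComfyUI's text_encoders folder."""
    target_dir = encoder_dir(comfy_root)
    target_dir.mkdir(parents=True, exist_ok=True)
    if downloader is None:
        # Imported lazily so the helper can be exercised without the dependency.
        from huggingface_hub import hf_hub_download
        downloader = hf_hub_download
    local_path = downloader(
        repo_id=GGUF_REPO, filename=filename, local_dir=str(target_dir)
    )
    return Path(local_path)
```

The remaining steps (adding a `CLIPLoaderGGUF` node, setting the type to `lumina2`, and wiring it in place of the stock encoder) still happen by hand in the ComfyUI graph.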

RELEASE_MANIFEST.json CHANGED

@@ -2,6 +2,7 @@
     "status":  "upload_ready_user_approved",
     "public_model_name":  "Z-Image-Engineer-V6",
     "repo_id":  "BennyDaBall/Z-Image-Engineer-V6",
+    "package_kind":  "merged_hf_safetensors",
     "base_model":  "Tongyi-MAI/Z-Image-Turbo/text_encoder",
     "tokenizer":  "Tongyi-MAI/Z-Image-Turbo/tokenizer",
     "files":  [
@@ -63,10 +64,10 @@
                   },
                   {
                       "path":  "README.md",
-                      "size_bytes":  7151,
+                      "size_bytes":  6895,
                       "size_gib":  0,
-                      "sha256":  "BF31DA5A1F64F1D7F9AF7C692C82296ED9B2AD59076588BF297E446D6FF54C1C",
-                      "last_write_time":  "2026-06-06T01:33:41"
+                      "sha256":  "AF0D74388A53EF4A9B37CF98B05922F1D5D3A888C1C6BBB95D883D89AC423760",
+                      "last_write_time":  "2026-06-06T01:54:57"
                   },
                   {
                       "path":  "tokenizer.json",
@@ -88,107 +89,10 @@
                       "size_gib":  0,
                       "sha256":  "34126E2486E389F28C11693C2E51641199FB5B53E3E7D6BFA75A6E967C11D3CF",
                       "last_write_time":  "2026-05-30T07:33:27"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-F16.gguf",
-                      "size_bytes":  8051284960,
-                      "size_gib":  7.498,
-                      "sha256":  "20DAB6305B76B28808FAD740C7107878DEEC63688E1B318F7BB3A7F707220B0D",
-                      "last_write_time":  "2026-06-05T13:36:01"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-MXFP4.gguf",
-                      "size_bytes":  2256005600,
-                      "size_gib":  2.101,
-                      "sha256":  "A39695B6714FC4A0A86965F5B2FB8B0CBEF774165EEC8FB9B2379FBEDD86838A",
-                      "last_write_time":  "2026-06-05T22:48:03"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-Q3_K_M.gguf",
-                      "size_bytes":  2075617760,
-                      "size_gib":  1.933,
-                      "sha256":  "E3F493D971677BA181F67C888AD41E25FD34448BF7EEA03A84F4114EE021B9E3",
-                      "last_write_time":  "2026-06-05T22:47:54"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-Q4_K_M.gguf",
-                      "size_bytes":  2503178720,
-                      "size_gib":  2.331,
-                      "sha256":  "D666E619EDB2D6DCF2DF013540B22E2592C4FBADB9007B3FB89D4BBE0C4C7C67",
-                      "last_write_time":  "2026-06-05T22:47:42"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-Q5_K_M.gguf",
-                      "size_bytes":  2895780320,
-                      "size_gib":  2.697,
-                      "sha256":  "0FAB79F032AA34BAAC8607FF8BA720DFB95A0D9A44026DE79288F3FD25A66A05",
-                      "last_write_time":  "2026-06-05T22:47:26"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-Q6_K.gguf",
-                      "size_bytes":  3306260960,
-                      "size_gib":  3.079,
-                      "sha256":  "A27D6723816462EA1368093A76E9013E996BD4B731EF87327334D50D6DD9534C",
-                      "last_write_time":  "2026-06-05T22:47:11"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-Q8_0.gguf",
-                      "size_bytes":  4280404960,
-                      "size_gib":  3.986,
-                      "sha256":  "DC4F5476A0F804A7DB73EDA164C0503CDA93858F3EABDE9EA36C68EEDCBA306C",
-                      "last_write_time":  "2026-06-05T22:46:50"
                   }
               ],
     "ggufs":  [
-                  {
-                      "path":  "Z-Image-Engineer-V6-F16.gguf",
-                      "size_bytes":  8051284960,
-                      "size_gib":  7.498,
-                      "sha256":  "20DAB6305B76B28808FAD740C7107878DEEC63688E1B318F7BB3A7F707220B0D",
-                      "last_write_time":  "2026-06-05T13:36:01"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-MXFP4.gguf",
-                      "size_bytes":  2256005600,
-                      "size_gib":  2.101,
-                      "sha256":  "A39695B6714FC4A0A86965F5B2FB8B0CBEF774165EEC8FB9B2379FBEDD86838A",
-                      "last_write_time":  "2026-06-05T22:48:03"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-Q3_K_M.gguf",
-                      "size_bytes":  2075617760,
-                      "size_gib":  1.933,
-                      "sha256":  "E3F493D971677BA181F67C888AD41E25FD34448BF7EEA03A84F4114EE021B9E3",
-                      "last_write_time":  "2026-06-05T22:47:54"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-Q4_K_M.gguf",
-                      "size_bytes":  2503178720,
-                      "size_gib":  2.331,
-                      "sha256":  "D666E619EDB2D6DCF2DF013540B22E2592C4FBADB9007B3FB89D4BBE0C4C7C67",
-                      "last_write_time":  "2026-06-05T22:47:42"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-Q5_K_M.gguf",
-                      "size_bytes":  2895780320,
-                      "size_gib":  2.697,
-                      "sha256":  "0FAB79F032AA34BAAC8607FF8BA720DFB95A0D9A44026DE79288F3FD25A66A05",
-                      "last_write_time":  "2026-06-05T22:47:26"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-Q6_K.gguf",
-                      "size_bytes":  3306260960,
-                      "size_gib":  3.079,
-                      "sha256":  "A27D6723816462EA1368093A76E9013E996BD4B731EF87327334D50D6DD9534C",
-                      "last_write_time":  "2026-06-05T22:47:11"
-                  },
-                  {
-                      "path":  "Z-Image-Engineer-V6-Q8_0.gguf",
-                      "size_bytes":  4280404960,
-                      "size_gib":  3.986,
-                      "sha256":  "DC4F5476A0F804A7DB73EDA164C0503CDA93858F3EABDE9EA36C68EEDCBA306C",
-                      "last_write_time":  "2026-06-05T22:46:50"
-                  }
               ],
     "evidence":  [
                      {
@@ -199,6 +103,6 @@
                          "last_write_time":  "2026-06-05T23:31:26"
                      }
                  ],
-    "generated_at_local":  "2026-06-06T01:35:03",
+    "generated_at_local":  "2026-06-06T01:55:59",
     "upload_approved_by_user":  true
 }
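
The GGUF entries removed from the manifest pair `size_bytes` with `size_gib`, and the numbers are consistent with dividing by 2^30 and rounding to three decimals. A small sketch of that check follows. The diff does not say how the manifest was generated, so the formula is an assumption, and `bytes_to_gib` is an illustrative helper rather than part of any release tooling.

```python
GIB = 2 ** 30  # bytes per gibibyte


def bytes_to_gib(size_bytes: int, ndigits: int = 3) -> float:
    """Convert a byte count to GiB, rounded the way the manifest appears to be."""
    return round(size_bytes / GIB, ndigits)


# (size_bytes, size_gib) pairs copied from the removed manifest entries.
removed_ggufs = {
    "Z-Image-Engineer-V6-F16.gguf": (8051284960, 7.498),
    "Z-Image-Engineer-V6-MXFP4.gguf": (2256005600, 2.101),
    "Z-Image-Engineer-V6-Q3_K_M.gguf": (2075617760, 1.933),
    "Z-Image-Engineer-V6-Q4_K_M.gguf": (2503178720, 2.331),
    "Z-Image-Engineer-V6-Q5_K_M.gguf": (2895780320, 2.697),
    "Z-Image-Engineer-V6-Q6_K.gguf": (3306260960, 3.079),
    "Z-Image-Engineer-V6-Q8_0.gguf": (4280404960, 3.986),
}

# Collect any entry whose recorded size_gib disagrees with the recomputed value.
mismatches = {
    name: (bytes_to_gib(size_bytes), size_gib)
    for name, (size_bytes, size_gib) in removed_ggufs.items()
    if bytes_to_gib(size_bytes) != size_gib
}
print(mismatches)
```

With the seven pairs above the dictionary comes back empty, which also matches the sizes listed in the old README's quantization table.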