Overview

Z-Image-Turbo-MNN-int4 is an MNN diffusion resource package for Z-Image-Turbo, intended for text-to-image inference with the MNN C++ diffusion_demo binary (model_type=2, i.e. STABLE_DIFFUSION_ZIMAGE).

What’s included

  • text_encoder.mnn / text_encoder.mnn.weight
  • unet.mnn / unet.mnn.weight (denoiser/transformer; diffusion_demo uses the unet name)
  • vae_encoder.mnn / vae_decoder.mnn
  • tokenizer.txt
  • config.json: resource description (filenames, precision labels, default inference parameters, etc.)
  • configuration.json: generic task description (used by some loaders)
  • scheduler_config.json: FlowMatchEulerDiscreteScheduler config used by ZImage

The directory name typically encodes the weight-only quantization bit-widths as (text_encoder_bits, unet_bits, vae_bits). Bit-widths are set at conversion time via MNNConvert --weightQuantBits; when in doubt, treat config.json as the source of truth.

How to export / generate

  1. Export ONNX
    • text_encoder.onnx, unet.onnx, vae_encoder.onnx, vae_decoder.onnx
    • IO names expected by the MNN ZImage pipeline:
      • text_encoder: inputs input_ids, attention_mask; output last_hidden_state
      • unet: inputs sample, timestep (float), encoder_hidden_states; output out_sample
      • vae_decoder: input latent_sample; output sample
  2. Convert ONNX → MNN (example 4/4/8):
    MNNConvert -f ONNX --modelFile text_encoder.onnx --MNNModel text_encoder.mnn --weightQuantBits 4
    MNNConvert -f ONNX --modelFile unet.onnx        --MNNModel unet.mnn        --weightQuantBits 4
    MNNConvert -f ONNX --modelFile vae_encoder.onnx --MNNModel vae_encoder.mnn --weightQuantBits 8
    MNNConvert -f ONNX --modelFile vae_decoder.onnx --MNNModel vae_decoder.mnn --weightQuantBits 8
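The four conversions above can be batched. A minimal dry-run sketch (assuming the ONNX files sit in the current directory; each command is echoed rather than executed so the bit-widths can be reviewed first):

```shell
# Dry-run: print one MNNConvert command per model with its weight-quant bit-width
# for the 4/4/8 example above. Remove the leading "echo" to actually convert
# (requires MNNConvert on PATH).
for spec in "text_encoder 4" "unet 4" "vae_encoder 8" "vae_decoder 8"; do
  set -- $spec   # $1 = model name, $2 = bit-width
  echo MNNConvert -f ONNX --modelFile "$1.onnx" --MNNModel "$1.mnn" --weightQuantBits "$2"
done
```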
    

How to run

Windows (PowerShell / cmd) — z-image.bat

.\z-image.bat "a cute cat" 1

Arguments passed to diffusion_demo.exe (order):
resource_path model_type memory_mode backend_type steps seed output_path size cfg gpu_mem_mode precision_mode te_on_cpu prompt_text
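For reference, the 13-position argument order above can be laid out explicitly. The values below are illustrative defaults only (the output path and seed are assumptions, not the shipped script's values), and the command is echoed rather than executed:

```shell
# Assemble the 13 positional arguments for diffusion_demo in the documented order.
# All concrete values here are illustrative; adjust to your setup.
MODEL_DIR=.            # resource_path
MODEL_TYPE=2           # model_type: STABLE_DIFFUSION_ZIMAGE
MEMORY_TYPE=0          # memory_mode
BACKEND=0              # backend_type: 0=cpu, 3=opencl, 7=vulkan
STEPS=8                # diffusion steps
SEED=42                # 0 = auto random
OUTPUT=out.png         # output_path (assumed name)
SIZE=1024
CFG=1.25
GPU_MEM_MODE=0
PRECISION=0
TE_ON_CPU=0
PROMPT="a cute cat"

# Dry-run: print the full command line instead of invoking diffusion_demo.
echo diffusion_demo "$MODEL_DIR" "$MODEL_TYPE" "$MEMORY_TYPE" "$BACKEND" \
  "$STEPS" "$SEED" "$OUTPUT" "$SIZE" "$CFG" \
  "$GPU_MEM_MODE" "$PRECISION" "$TE_ON_CPU" "$PROMPT"
```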

Script parameters (modifiable at top of z-image.bat):

  • MODEL_DIR (default .): model directory (relative or absolute)
  • MEMORY_TYPE (0/1/2): Diffusion memory mode
  • BACKEND (0=cpu, 3=opencl, 7=vulkan)
  • STEPS: diffusion steps (3–12 recommended)
  • SEED: 0=auto random; non-0=fixed seed
  • SIZE: 512/640/768/896/1024
  • CFG: classifier-free guidance scale (typ. 0.5–2.0; default 1.25)
  • GPU_MEM_MODE (OpenCL only): 0=auto, 1=buffer, 2=image
  • PRECISION (0=auto, 1=FP16, 2=FP32 normal, 3=FP32 high)
  • TE_ON_CPU (0 same as UNet, 1 forces text_encoder on CPU)

Linux/macOS — z-image.sh

./z-image.sh "a cute cat" 1

Same argument order to diffusion_demo as above. Script parameters mirror the Windows script:

  • MODEL_DIR
  • MEMORY_TYPE
  • BACKEND
  • STEPS
  • SEED
  • SIZE
  • CFG
  • GPU_MEM_MODE
  • PRECISION
  • TE_ON_CPU
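A minimal sketch of the wrapper's argument handling, assuming (as the usage examples suggest) that the first positional argument is the prompt and the second is the seed; the defaults and the echoed dry-run output are illustrative, not the shipped script's behavior:

```shell
# Sketch of z-image.sh-style argument handling. The mapping of the second
# argument to the seed is an assumption based on the usage examples above.
run_zimage() {
  PROMPT="${1:-a cute cat}"   # first CLI arg: prompt text
  SEED="${2:-0}"              # second CLI arg: seed (0 = auto random)
  # Dry-run: print what would be forwarded to diffusion_demo.
  echo "prompt='$PROMPT' seed=$SEED"
}

run_zimage "a cute cat" 1
```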

Notes

  • Weight files (.mnn.weight) are large; use Git LFS/external storage when sharing.
  • For Z-Image support in MNN, refer to the fork: https://github.com/er6y/MNN
  • Ensure MNN is built with diffusion enabled (-DMNN_BUILD_DIFFUSION=ON); for ZImage tokenizer, MNN_BUILD_LLM is also required.