Overview

Z-Image-Turbo-MNN-int4 is an MNN diffusion resource package for Z-Image-Turbo, intended for text-to-image inference with the MNN C++ diffusion_demo binary (model_type=2, i.e. STABLE_DIFFUSION_ZIMAGE).

What’s included

  • text_encoder.mnn / text_encoder.mnn.weight
  • unet.mnn / unet.mnn.weight (denoiser/transformer; diffusion_demo uses the unet name)
  • vae_encoder.mnn / vae_decoder.mnn
  • tokenizer.txt
  • config.json: resource description (filenames, precision labels, default inference parameters, etc.)
  • configuration.json: generic task description (used by some loaders)
  • scheduler_config.json: FlowMatchEulerDiscreteScheduler config used by ZImage

The directory name typically encodes the weight-only quantization bit-widths as (text_encoder_bits, unet_bits, vae_bits). Bit-widths are set at conversion time via MNNConvert --weightQuantBits; when in doubt, treat config.json as the source of truth.

How to export / generate

  1. Export ONNX
    • text_encoder.onnx, unet.onnx, vae_encoder.onnx, vae_decoder.onnx
    • IO names expected by the MNN ZImage pipeline:
      • text_encoder: inputs input_ids, attention_mask; output last_hidden_state
      • unet: inputs sample, timestep (float), encoder_hidden_states; output out_sample
      • vae_decoder: input latent_sample; output sample
  2. Convert ONNX → MNN (example 4/4/8):
    MNNConvert -f ONNX --modelFile text_encoder.onnx --MNNModel text_encoder.mnn --weightQuantBits 4
    MNNConvert -f ONNX --modelFile unet.onnx        --MNNModel unet.mnn        --weightQuantBits 4
    MNNConvert -f ONNX --modelFile vae_encoder.onnx --MNNModel vae_encoder.mnn --weightQuantBits 8
    MNNConvert -f ONNX --modelFile vae_decoder.onnx --MNNModel vae_decoder.mnn --weightQuantBits 8
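The four conversions above can be batched. A minimal dry-run sketch (assuming the ONNX files sit in the current directory; each command is echoed rather than executed so the bit-widths can be reviewed first):

```shell
# Dry-run: print one MNNConvert command per model with its weight-quant bit-width
# for the 4/4/8 example above. Remove the leading "echo" to actually convert
# (requires MNNConvert on PATH).
for spec in "text_encoder 4" "unet 4" "vae_encoder 8" "vae_decoder 8"; do
  set -- $spec   # $1 = model name, $2 = bit-width
  echo MNNConvert -f ONNX --modelFile "$1.onnx" --MNNModel "$1.mnn" --weightQuantBits "$2"
done
```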
    

How to run

Windows (PowerShell / cmd) — z-image.bat

.\z-image.bat "a cute cat" 1

Arguments passed to diffusion_demo.exe (order):
resource_path model_type memory_mode backend_type steps seed output_path size cfg gpu_mem_mode precision_mode te_on_cpu prompt_text
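For reference, the 13-position argument order above can be laid out explicitly. The values below are illustrative defaults only (the output path and seed are assumptions, not the shipped script's values), and the command is echoed rather than executed:

```shell
# Assemble the 13 positional arguments for diffusion_demo in the documented order.
# All concrete values here are illustrative; adjust to your setup.
MODEL_DIR=.            # resource_path
MODEL_TYPE=2           # model_type: STABLE_DIFFUSION_ZIMAGE
MEMORY_TYPE=0          # memory_mode
BACKEND=0              # backend_type: 0=cpu, 3=opencl, 7=vulkan
STEPS=8                # diffusion steps
SEED=42                # 0 = auto random
OUTPUT=out.png         # output_path (assumed name)
SIZE=1024
CFG=1.25
GPU_MEM_MODE=0
PRECISION=0
TE_ON_CPU=0
PROMPT="a cute cat"

# Dry-run: print the full command line instead of invoking diffusion_demo.
echo diffusion_demo "$MODEL_DIR" "$MODEL_TYPE" "$MEMORY_TYPE" "$BACKEND" \
  "$STEPS" "$SEED" "$OUTPUT" "$SIZE" "$CFG" \
  "$GPU_MEM_MODE" "$PRECISION" "$TE_ON_CPU" "$PROMPT"
```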

Script parameters (modifiable at top of z-image.bat):

  • MODEL_DIR (default .): model directory (relative or absolute)
  • MEMORY_TYPE (0/1/2): Diffusion memory mode
  • BACKEND (0=cpu, 3=opencl, 7=vulkan)
  • STEPS: diffusion steps (3–12 recommended)
  • SEED: 0=auto random; non-0=fixed seed
  • SIZE: 512/640/768/896/1024
  • CFG: classifier-free guidance scale (typ. 0.5–2.0; default 1.25)
  • GPU_MEM_MODE (OpenCL only): 0=auto, 1=buffer, 2=image
  • PRECISION (0=auto, 1=FP16, 2=FP32 normal, 3=FP32 high)
  • TE_ON_CPU (0 same as UNet, 1 forces text_encoder on CPU)

Linux/macOS — z-image.sh

./z-image.sh "a cute cat" 1

Same argument order to diffusion_demo as above. Script parameters mirror the Windows script:

  • MODEL_DIR
  • MEMORY_TYPE
  • BACKEND
  • STEPS
  • SEED
  • SIZE
  • CFG
  • GPU_MEM_MODE
  • PRECISION
  • TE_ON_CPU
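A minimal sketch of the wrapper's argument handling, assuming (as the usage examples suggest) that the first positional argument is the prompt and the second is the seed; the defaults and the echoed dry-run output are illustrative, not the shipped script's behavior:

```shell
# Sketch of z-image.sh-style argument handling. The mapping of the second
# argument to the seed is an assumption based on the usage examples above.
run_zimage() {
  PROMPT="${1:-a cute cat}"   # first CLI arg: prompt text
  SEED="${2:-0}"              # second CLI arg: seed (0 = auto random)
  # Dry-run: print what would be forwarded to diffusion_demo.
  echo "prompt='$PROMPT' seed=$SEED"
}

run_zimage "a cute cat" 1
```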

Notes

  • Weight files (.mnn.weight) are large; use Git LFS/external storage when sharing.
  • For Z-Image support in MNN, refer to the fork: https://github.com/er6y/MNN
  • Ensure MNN is built with diffusion enabled (-DMNN_BUILD_DIFFUSION=ON); for ZImage tokenizer, MNN_BUILD_LLM is also required.