How do I use this MLX model, Z-Image-Turbo-MLX-8bit? Is there any sample code available?

#1
by kavinbj - opened

Is there any sample code available? And if I want to add a LoRA, how do I use it with this MLX model?

Hi! This model is natively supported by mflux. Install with pip install mflux.

CLI

mflux-generate-z-image-turbo \
  --model andrevp/Z-Image-Turbo-MLX-8bit \
  --prompt "A puffin standing on a cliff" \
  --width 1024 --height 1024 \
  --steps 9 --seed 42

Python API

from mflux.models.z_image import ZImage
from mflux.models.common.config import ModelConfig

model = ZImage(
    model_config=ModelConfig.z_image_turbo(),
    quantize=8,
    model_path="andrevp/Z-Image-Turbo-MLX-8bit",
)

image = model.generate_image(
    seed=42,
    prompt="A puffin standing on a cliff",
    num_inference_steps=9,
    height=1024,
    width=1024,
)
image.save("output.png")

With LoRA

mflux-generate-z-image-turbo \
  --model andrevp/Z-Image-Turbo-MLX-8bit \
  --prompt "A woman in illustration style" \
  --width 1024 --height 1024 \
  --steps 9 --seed 42 \
  --lora-paths your-org/your-lora \
  --lora-scales 0.8

Or in Python:

model = ZImage(
    model_config=ModelConfig.z_image_turbo(),
    quantize=8,
    model_path="andrevp/Z-Image-Turbo-MLX-8bit",
    lora_paths=["your-org/your-lora"],
    lora_scales=[0.8],
)

LoRA paths can be local files, HuggingFace repos (e.g. renderartist/Technically-Color-Z-Image-Turbo), or collection format (repo:filename.safetensors). You can stack multiple LoRAs by passing multiple paths and scales.

For training your own LoRA, see this guide.

thanks 👌

I tried the CLI code:

mflux-generate-z-image-turbo \
  --model andrevp/Z-Image-Turbo-MLX \
  --prompt "A puffin standing on a cliff" \
  --width 1024 --height 1024 \
  --steps 9 --seed 42

This works correctly.

But the 8-bit CLI code:

mflux-generate-z-image-turbo \
  --model andrevp/Z-Image-Turbo-MLX-8bit \
  --prompt "A woman in illustration style" \
  --width 1024 --height 1024 \
  --steps 9 --seed 42 --quantize 8

fails with this error:
Traceback (most recent call last):
  File "/Users/mac/kuang/slides-video/.venv/bin/mflux-generate-z-image-turbo", line 6, in <module>
    sys.exit(main())
             ~~~~^^
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/cli/z_image_turbo_generate.py", line 48, in main
    image = model.generate_image(
        seed=seed,
        ...<8 lines>...
        negative_prompt=args.negative_prompt,
    )
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/variants/z_image.py", line 92, in generate_image
    text_encodings, negative_encodings = self._encode_prompts(
                                         ~~~~~~~~~~~~~~~~~~~~^
        prompt=prompt,
        ^^^^^^^^^^^^^^
        negative_prompt=negative_prompt,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        guidance=config.guidance,
        ^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/variants/z_image.py", line 158, in _encode_prompts
    text_encodings = PromptEncoder.encode_prompt(
        prompt=prompt,
        tokenizer=self.tokenizers["z_image"],
        text_encoder=self.text_encoder,
    )
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/model/z_image_text_encoder/prompt_encoder.py", line 15, in encode_prompt
    cap_feats = text_encoder(output.input_ids, output.attention_mask)
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/model/z_image_text_encoder/text_encoder.py", line 51, in __call__
    hidden_states = layer(
        hidden_states=hidden_states,
        attention_mask=causal_mask,
        position_embeddings=position_embeddings,
    )
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/model/z_image_text_encoder/encoder_layer.py", line 31, in __call__
    hidden_states = self.self_attn(self.input_layernorm(hidden_states), attention_mask, position_embeddings)
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/model/z_image_text_encoder/attention.py", line 36, in __call__
    q = self.q_proj(hidden_states).reshape(batch_size, seq_len, self.num_heads, self.head_dim)
        ~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mlx/nn/layers/quantized.py", line 266, in __call__
    x = mx.quantized_matmul(
        x,
        ...<6 lines>...
        mode=self.mode,
    )
ValueError: [quantized_matmul] Last dimension of first input with shape (..., 2560) does not match the expanded quantized matrix (640, 4096) computed from shape (4096,160) with group_size=64, bits=8 and transpose=true
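One plausible reading of the error (my own guess, not a confirmed diagnosis): the 8-bit repo's weights are already quantized, and passing --quantize 8 on top quantizes them a second time. Assuming MLX packs 8-bit values four per uint32 word, the numbers in the message line up with exactly that, which would suggest dropping the --quantize 8 flag when using the pre-quantized model:

```python
# Guesswork arithmetic only, assuming 8-bit values pack 4 per uint32 (32 // 8 = 4).
bits = 8
values_per_word = 32 // bits  # 4 packed values per stored uint32

true_in_dim = 2560                              # last dim of the activations in the error
packed_once = true_in_dim // values_per_word    # width after one 8-bit quantization pass
packed_twice = packed_once // values_per_word   # width after an accidental second pass

# The error reports a stored weight of shape (4096, 160) expanding to width 640,
# which matches already-8-bit weights being packed again: 2560 -> 640 -> 160.
print(packed_once, packed_twice)
```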
