How do I use this MLX model, Z-Image-Turbo-MLX-8bit? Is there any sample code available? And if I want to add a LoRA model, how do I use it?
Hi! This model is natively supported by mflux. Install with pip install mflux.
CLI
mflux-generate-z-image-turbo \
  --model andrevp/Z-Image-Turbo-MLX-8bit \
  --prompt "A puffin standing on a cliff" \
  --width 1024 --height 1024 \
  --steps 9 --seed 42
Python API
from mflux.models.z_image import ZImage
from mflux.models.common.config import ModelConfig
model = ZImage(
    model_config=ModelConfig.z_image_turbo(),
    quantize=8,
    model_path="andrevp/Z-Image-Turbo-MLX-8bit",
)

image = model.generate_image(
    seed=42,
    prompt="A puffin standing on a cliff",
    num_inference_steps=9,
    height=1024,
    width=1024,
)

image.save("output.png")
With LoRA
mflux-generate-z-image-turbo \
  --model andrevp/Z-Image-Turbo-MLX-8bit \
  --prompt "A woman in illustration style" \
  --width 1024 --height 1024 \
  --steps 9 --seed 42 \
  --lora-paths your-org/your-lora \
  --lora-scales 0.8
Or in Python:
model = ZImage(
    model_config=ModelConfig.z_image_turbo(),
    quantize=8,
    model_path="andrevp/Z-Image-Turbo-MLX-8bit",
    lora_paths=["your-org/your-lora"],
    lora_scales=[0.8],
)
LoRA paths can be local files, HuggingFace repos (e.g. renderartist/Technically-Color-Z-Image-Turbo), or collection format (repo:filename.safetensors). You can stack multiple LoRAs by passing multiple paths and scales.
For training your own LoRA, see this guide.
thanks 🙏
I tried the CLI command:

mflux-generate-z-image-turbo \
  --model andrevp/Z-Image-Turbo-MLX \
  --prompt "A puffin standing on a cliff" \
  --width 1024 --height 1024 \
  --steps 9 --seed 42

This works correctly.
But the 8-bit CLI command:

mflux-generate-z-image-turbo \
  --model andrevp/Z-Image-Turbo-MLX-8bit \
  --prompt "A woman in illustration style" \
  --width 1024 --height 1024 \
  --steps 9 --seed 42 --quantize 8

fails with this error:
Traceback (most recent call last):
  File "/Users/mac/kuang/slides-video/.venv/bin/mflux-generate-z-image-turbo", line 6, in <module>
    sys.exit(main())
             ~~~~^^
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/cli/z_image_turbo_generate.py", line 48, in main
    image = model.generate_image(
        seed=seed,
        ...<8 lines>...
        negative_prompt=args.negative_prompt,
    )
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/variants/z_image.py", line 92, in generate_image
    text_encodings, negative_encodings = self._encode_prompts(
                                         ~~~~~~~~~~~~~~~~~~~~^
        prompt=prompt,
        ^^^^^^^^^^^^^^
        negative_prompt=negative_prompt,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        guidance=config.guidance,
        ^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/variants/z_image.py", line 158, in _encode_prompts
    text_encodings = PromptEncoder.encode_prompt(
        prompt=prompt,
        tokenizer=self.tokenizers["z_image"],
        text_encoder=self.text_encoder,
    )
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/model/z_image_text_encoder/prompt_encoder.py", line 15, in encode_prompt
    cap_feats = text_encoder(output.input_ids, output.attention_mask)
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/model/z_image_text_encoder/text_encoder.py", line 51, in __call__
    hidden_states = layer(
        hidden_states=hidden_states,
        attention_mask=causal_mask,
        position_embeddings=position_embeddings,
    )
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/model/z_image_text_encoder/encoder_layer.py", line 31, in __call__
    hidden_states = self.self_attn(self.input_layernorm(hidden_states), attention_mask, position_embeddings)
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mflux/models/z_image/model/z_image_text_encoder/attention.py", line 36, in __call__
    q = self.q_proj(hidden_states).reshape(batch_size, seq_len, self.num_heads, self.head_dim)
        ~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/Users/mac/kuang/slides-video/.venv/lib/python3.14/site-packages/mlx/nn/layers/quantized.py", line 266, in __call__
    x = mx.quantized_matmul(
        x,
        ...<6 lines>...
        mode=self.mode,
    )
ValueError: [quantized_matmul] Last dimension of first input with shape (..., 2560) does not match the expanded quantized matrix (640, 4096) computed from shape (4096,160) with group_size=64, bits=8 and transpose=true
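The shape arithmetic in that error is worth checking by hand. At bits=8, MLX packs four 8-bit values into each uint32, so a 2560-wide weight packs to 640 columns; packing the already-packed 640 again gives 160, which expands back to only 640 rather than 2560 — exactly the 160/640/2560 mismatch the traceback reports. This pattern is consistent with the checkpoint being quantized a second time (e.g. passing --quantize 8 on an already-8-bit model), though that is an inference from the numbers, not something confirmed in this thread:

```python
# At bits=8, MLX packs four 8-bit values into each uint32,
# so the packed last dimension is the original width divided by 4.
PACK = 32 // 8  # values per uint32 at bits=8

width = 2560                        # q_proj input width in the text encoder
packed_once = width // PACK         # 640: a correctly quantized weight
packed_twice = packed_once // PACK  # 160: packing the packed weight again

# Expanding the doubly-packed matrix recovers only 640 columns,
# not the 2560 the activations actually have -- the mismatch
# reported by quantized_matmul in the traceback.
expanded = packed_twice * PACK
assert packed_twice == 160
assert expanded == 640
assert expanded != width
```

If that reading is right, dropping the extra quantization step when loading an already-quantized checkpoint would avoid the double packing.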