---
license: apache-2.0
language:
- en
- zh
base_model:
- Tongyi-MAI/Z-Image-Turbo
base_model_relation: quantized
pipeline_tag: text-to-image
library_name: diffusers
tags:
- diffusion-single-file
---
For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11

Feel free to request compression of other models as well (for the `diffusers` library, ComfyUI, or anything else), although models built on architectures I am unfamiliar with may be more difficult.

### How to Use

#### `diffusers`

```python
import torch
from diffusers import ZImagePipeline, ZImageTransformer2DModel
from dfloat11 import DFloat11Model
from transformers.modeling_utils import no_init_weights

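# Load the DF11-compressed Qwen3-4B text encoder on the CPU
text_encoder = DFloat11Model.from_pretrained("DFloat11/Qwen3-4B-DF11", device="cpu")

# Build the transformer from its config only, skipping weight initialization
with no_init_weights():
    transformer = ZImageTransformer2DModel.from_config(
        ZImageTransformer2DModel.load_config(
            "Tongyi-MAI/Z-Image-Turbo", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16
    ).to(torch.bfloat16)

# Load the DF11-compressed weights into the transformer built above
DFloat11Model.from_pretrained("mingyi456/Z-Image-Turbo-DF11", device="cpu", bfloat16_model=transformer)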

pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    text_encoder=text_encoder,
    transformer=transformer,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)
pipe.to("cuda")

prompt = "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."

# Generate the image
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=9,  # This actually results in 8 DiT forwards
    guidance_scale=0.0,     # Guidance should be 0 for the Turbo models
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

image.save("example.png")

```

#### ComfyUI
For ComfyUI, refer to this [model](https://huggingface.co/mingyi456/Z-Image-Turbo-DF11-ComfyUI) instead.

### Compression details

This is the `pattern_dict` for compression:

```python
pattern_dict = {
    r"noise_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0"
    ),
    r"context_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
    ),
    r"layers\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0"
    ),
    r"cap_embedder": (
        "1",
    )
}
```
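
To make the structure of `pattern_dict` easier to read, here is a minimal sketch of how such a dictionary can be expanded into a list of layer names. It assumes that each key is a regular expression matched against a block's full module name and that each value lists the sub-layers inside that block to compress. The helper `expand_patterns`, the shortened `example_patterns` and the module names in `block_names` are hypothetical and for illustration only; this is not the DFloat11 implementation.

```python
import re

# Shortened, illustrative version of the pattern_dict above (assumption:
# keys are regexes for block names, values are sub-layer names to compress).
example_patterns = {
    r"noise_refiner\.\d+": ("attention.to_q", "adaLN_modulation.0"),
    r"context_refiner\.\d+": ("attention.to_q",),
    r"layers\.\d+": ("feed_forward.w1",),
    r"cap_embedder": ("1",),
}


def expand_patterns(block_names, patterns):
    """Return the full names of the sub-layers selected by `patterns`.

    Hypothetical helper: a block is selected when a regex matches its
    entire name, and each listed sub-layer is appended to that name.
    """
    selected = []
    for name in block_names:
        for pattern, sublayers in patterns.items():
            if re.fullmatch(pattern, name):
                selected.extend(f"{name}.{sub}" for sub in sublayers)
    return selected


# Hypothetical block names; "x_embedder" matches no pattern and is skipped.
block_names = [
    "noise_refiner.0",
    "context_refiner.1",
    "layers.12",
    "cap_embedder",
    "x_embedder",
]

result = expand_patterns(block_names, example_patterns)
for layer_name in result:
    print(layer_name)
```

Under these assumptions, any module whose name matches none of the patterns (such as the hypothetical `x_embedder` above) is left out of the selection.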