---
license: apache-2.0
language:
- en
- zh
base_model:
- Tongyi-MAI/Z-Image-Turbo
base_model_relation: quantized
pipeline_tag: text-to-image
library_name: diffusers
tags:
- diffusion-single-file
---

For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11.

Feel free to request compression of other models as well (for the `diffusers` library, ComfyUI, or anything else), although models with architectures that are unfamiliar to me might be more difficult.

### How to Use

#### `diffusers`

```python
import torch
from diffusers import ZImagePipeline, ZImageTransformer2DModel
from dfloat11 import DFloat11Model
from transformers.modeling_utils import no_init_weights

# Load the DF11-compressed text encoder
text_encoder = DFloat11Model.from_pretrained("DFloat11/Qwen3-4B-DF11", device="cpu")

# Build the transformer from its config only, skipping weight initialization
with no_init_weights():
    transformer = ZImageTransformer2DModel.from_config(
        ZImageTransformer2DModel.load_config(
            "Tongyi-MAI/Z-Image-Turbo", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16
    ).to(torch.bfloat16)

# Load the DF11-compressed weights into the empty transformer
DFloat11Model.from_pretrained("mingyi456/Z-Image-Turbo-DF11", device="cpu", bfloat16_model=transformer)

# Assemble the pipeline around the compressed components
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    text_encoder=text_encoder,
    transformer=transformer,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)
pipe.to("cuda")

prompt = "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."

# Generate the image
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=9,  # This actually results in 8 DiT forwards
    guidance_scale=0.0,  # Guidance should be 0 for the Turbo models
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

image.save("example.png")
```

#### ComfyUI

Refer to this [model](https://huggingface.co/mingyi456/Z-Image-Turbo-DF11-ComfyUI) instead.

### Compression details

This is the `pattern_dict` for compression:

```python
pattern_dict = {
    r"noise_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0"
    ),
    r"context_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
    ),
    r"layers\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0"
    ),
    r"cap_embedder": (
        "1",
    )
}
```
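
As a rough illustration of how a `pattern_dict` like the one above can be read, the sketch below expands a trimmed copy of it into full sub-layer names. It assumes each key is a regular expression that must match a block's whole name and that each tuple entry is a sub-layer path relative to that block. The matching rule, the helper `expand_patterns`, and the sample module names are all hypothetical and are not taken from the DFloat11 code.

```python
import re

# Trimmed copy of the pattern_dict above (two entries only, for brevity).
pattern_dict = {
    r"layers\.\d+": ("attention.to_q", "feed_forward.w1", "adaLN_modulation.0"),
    r"cap_embedder": ("1",),
}

# Hypothetical block names, similar to what `model.named_modules()` might yield.
module_names = ["layers.0", "layers.12", "layers.0.attention", "cap_embedder", "x_embedder"]


def expand_patterns(pattern_dict, module_names):
    """Return full names of the sub-layers selected by pattern_dict.

    Assumption: a block is selected only when its whole name matches the regex.
    """
    selected = []
    for name in module_names:
        for pattern, sub_layers in pattern_dict.items():
            if re.fullmatch(pattern, name):
                selected.extend(f"{name}.{sub}" for sub in sub_layers)
    return selected


targets = expand_patterns(pattern_dict, module_names)
for target in targets:
    print(target)
```

Under these assumptions, `layers.0.attention` and `x_embedder` are skipped because neither name matches a key in full, so only the listed sub-layers of `layers.0`, `layers.12`, and `cap_embedder` are selected.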