mingyi456 committed fe6849f (verified; parent: b11f28b): Update README.md

Files changed (1): README.md (+95 −1)
README.md CHANGED (front matter unchanged):

pipeline_tag: text-to-image
library_name: diffusers
tags:
- diffusion-single-file
---
For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11

Feel free to request compression of other models as well (whether for the `diffusers` library, ComfyUI, or anything else), although models with architectures that are unfamiliar to me may be more difficult.
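As background on why a lossless format can still shrink BF16 checkpoints: DFloat11 entropy-codes the 8-bit exponent field, whose distribution in trained weights carries far fewer than 8 bits of information. A stdlib-only sketch of that measurement on synthetic Gaussian weights (the numbers are illustrative, not measured on Z-Image):

```python
import math
import random
import struct
from collections import Counter

# Toy stand-in for a trained weight tensor (synthetic, for illustration only)
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(100_000)]

# BF16 keeps float32's sign bit, all 8 exponent bits, and the top 7 mantissa
# bits; extract the exponent field of each value
def exponent(x):
    return (struct.unpack("<I", struct.pack("<f", x))[0] >> 23) & 0xFF

counts = Counter(exponent(w) for w in weights)
n = len(weights)
entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
print(f"exponent entropy: {entropy:.2f} bits (vs. 8 bits stored)")
```

In real checkpoints this entropy is low enough that sign + 7 mantissa bits + an entropy-coded exponent averages roughly 11 bits per weight, hence the ~70% size and the name DFloat11.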
### How to Use

#### `diffusers`

```python
import torch
from diffusers import ZImagePipeline, ZImageTransformer2DModel
from dfloat11 import DFloat11Model
from transformers.modeling_utils import no_init_weights

# Load the DF11-compressed text encoder
text_encoder = DFloat11Model.from_pretrained("DFloat11/Qwen3-4B-DF11", device="cpu")

# Build the transformer skeleton without initializing weights, then fill it
# with the DF11-compressed weights
with no_init_weights():
    transformer = ZImageTransformer2DModel.from_config(
        ZImageTransformer2DModel.load_config(
            "Tongyi-MAI/Z-Image-Turbo", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16,
    ).to(torch.bfloat16)
DFloat11Model.from_pretrained("mingyi456/Z-Image-Turbo-DF11", device="cpu", bfloat16_model=transformer)

pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    text_encoder=text_encoder,
    transformer=transformer,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)
pipe.to("cuda")

prompt = "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."

# Generate the image
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=9,  # This actually results in 8 DiT forwards
    guidance_scale=0.0,  # Guidance should be 0 for the Turbo models
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

image.save("example.png")
```
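The losslessness comes from the exponent field being prefix-codeable: a Huffman code over the skewed exponent distribution round-trips exactly while averaging far fewer than 8 bits per symbol. A minimal stdlib sketch of the idea on synthetic weights (the actual dfloat11 on-disk format and GPU decode kernels are more involved than this):

```python
import heapq
import random
import struct
from collections import Counter

# Synthetic stand-in for a BF16 weight tensor
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(5000)]
exponents = [(struct.unpack("<I", struct.pack("<f", w))[0] >> 23) & 0xFF
             for w in weights]

def huffman_code(symbols):
    """Build a prefix-free code table {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tick = len(heap)  # unique tiebreaker so tuples never compare tree nodes
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tick, (a, b)))
        tick += 1
    code = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            code[node] = prefix or "0"
    walk(heap[0][2], "")
    return code

code = huffman_code(exponents)
bits = "".join(code[e] for e in exponents)

# Decode: scan the bitstream, emitting a symbol at every complete codeword
inv = {v: k for k, v in code.items()}
decoded, buf = [], ""
for bit in bits:
    buf += bit
    if buf in inv:
        decoded.append(inv[buf])
        buf = ""

assert decoded == exponents  # lossless round-trip
print(f"{len(bits) / len(exponents):.2f} bits per exponent (vs. 8 uncompressed)")
```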

#### ComfyUI

Refer to [mingyi456/Z-Image-Turbo-DF11-ComfyUI](https://huggingface.co/mingyi456/Z-Image-Turbo-DF11-ComfyUI) instead.

### Compression details

This is the `pattern_dict` used for compression:

```python
pattern_dict = {
    r"noise_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0",
    ),
    r"context_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
    ),
    r"layers\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0",
    ),
    r"cap_embedder": (
        "1",
    ),
}
```
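The keys read as regular expressions over submodule paths, with each value tuple naming the linear layers inside a matched block whose weights get compressed. A sketch of that matching semantics on a trimmed-down dict (this is my reading of the structure, not code taken from the dfloat11 source; the module names are hypothetical):

```python
import re

# Trimmed-down version of the pattern_dict above, for illustration only
pattern_dict = {
    r"noise_refiner\.\d+": ("attention.to_q", "feed_forward.w1"),
    r"layers\.\d+": ("attention.to_q", "adaLN_modulation.0"),
}

def compressed_weight_paths(module_names, patterns):
    """For each module path fully matching a pattern key, yield the full
    paths of the sub-layers named in that pattern's value tuple."""
    out = []
    for name in module_names:
        for pat, leaves in patterns.items():
            if re.fullmatch(pat, name):
                out.extend(f"{name}.{leaf}" for leaf in leaves)
    return out

# Hypothetical module paths, shaped like the blocks the patterns target
names = ["noise_refiner.0", "layers.7", "cap_embedder"]
paths = compressed_weight_paths(names, pattern_dict)
print(paths)
```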