mingyi456 committed on
Commit 36d5825 · verified · 1 Parent(s): a65270a

Update README.md

Files changed (1)
  1. README.md +91 -1
README.md CHANGED
@@ -10,4 +10,94 @@ pipeline_tag: text-to-image
  library_name: diffusers
  tags:
  - diffusion-single-file
- ---
+ ---
+ For more information (including how to compress models yourself), see https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11.
+
+ Feel free to request compression of other models as well (whether for the `diffusers` library, ComfyUI, or anything else), although models that use architectures unfamiliar to me may be more difficult.
+
+ ### How to Use
+
+ #### `diffusers`
+
+ ```python
+ import torch
+ from diffusers import ZImagePipeline, ZImageTransformer2DModel
+ from dfloat11 import DFloat11Model
+ from transformers.modeling_utils import no_init_weights
+
+ # Load the DF11-compressed text encoder.
+ text_encoder = DFloat11Model.from_pretrained("DFloat11/Qwen3-4B-DF11", device="cpu")
+
+ # Build the transformer from its config only, skipping weight initialization.
+ with no_init_weights():
+     transformer = ZImageTransformer2DModel.from_config(
+         ZImageTransformer2DModel.load_config(
+             "Tongyi-MAI/Z-Image", subfolder="transformer"
+         ),
+         torch_dtype=torch.bfloat16
+     ).to(torch.bfloat16)
+
+ # Load the DF11-compressed weights into the transformer built above.
+ DFloat11Model.from_pretrained("mingyi456/Z-Image-DF11", device="cpu", bfloat16_model=transformer)
+
+ pipe = ZImagePipeline.from_pretrained(
+     "Tongyi-MAI/Z-Image",
+     text_encoder=text_encoder,
+     transformer=transformer,
+     torch_dtype=torch.bfloat16,
+     low_cpu_mem_usage=False,
+ )
+ pipe.to("cuda")
+
+ # Chinese-language prompt, kept verbatim: two young Asian women standing close
+ # together in front of a plain gray textured wall, casual fashion photography.
+ prompt = "两名年轻亚裔女性紧密站在一起,背景为朴素的灰色纹理墙面,可能是室内地毯地面。左侧女性留着长卷发,身穿藏青色毛衣,左袖有奶油色褶皱装饰,内搭白色立领衬衫,下身白色裤子;佩戴小巧金色耳钉,双臂交叉于背后。右侧女性留直肩长发,身穿奶油色卫衣,胸前印有“Tun the tables”字样,下方为“New ideas”,搭配白色裤子;佩戴银色小环耳环,双臂交叉于胸前。两人均面带微笑直视镜头。照片,自然光照明,柔和阴影,以藏青、奶油白为主的中性色调,休闲时尚摄影,中等景深,面部和上半身对焦清晰,姿态放松,表情友好,室内环境,地毯地面,纯色背景。"
+ negative_prompt = ""  # Optional; useful for steering the output away from unwanted content
+ image = pipe(
+     prompt=prompt,
+     negative_prompt=negative_prompt,
+     height=1280,
+     width=720,
+     cfg_normalization=False,
+     num_inference_steps=50,
+     guidance_scale=4,
+     generator=torch.Generator("cuda").manual_seed(42),
+ ).images[0]
+
+ image.save("example.png")
+ ```
+
+ #### ComfyUI
+ For ComfyUI, use [this model](https://huggingface.co/mingyi456/Z-Image-DF11-ComfyUI) instead.
+
+ ### Compression details
+
+ This is the `pattern_dict` used for compression:
+
+ ```python
+ pattern_dict = {
+     r"noise_refiner\.\d+": (
+         "attention.to_q",
+         "attention.to_k",
+         "attention.to_v",
+         "attention.to_out.0",
+         "feed_forward.w1",
+         "feed_forward.w2",
+         "feed_forward.w3",
+         "adaLN_modulation.0"
+     ),
+     r"context_refiner\.\d+": (
+         "attention.to_q",
+         "attention.to_k",
+         "attention.to_v",
+         "attention.to_out.0",
+         "feed_forward.w1",
+         "feed_forward.w2",
+         "feed_forward.w3",
+     ),
+     r"layers\.\d+": (
+         "attention.to_q",
+         "attention.to_k",
+         "attention.to_v",
+         "attention.to_out.0",
+         "feed_forward.w1",
+         "feed_forward.w2",
+         "feed_forward.w3",
+         "adaLN_modulation.0"
+     ),
+     r"cap_embedder": (
+         "1",
+     )
+ }
+ ```
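To make the `pattern_dict` above easier to read, here is a minimal sketch of which submodule paths it would select. It is not part of the commit and not the DFloat11 implementation: it assumes that each regex key is matched against a block's full dotted name with `re.fullmatch`, and that each listed suffix is then appended to that name. The helper `compressed_targets` and the names in `example_names` are hypothetical, chosen only to exercise each pattern.

```python
import re

# Suffix groups copied from the pattern_dict in the README.
ATTN_FF = (
    "attention.to_q", "attention.to_k", "attention.to_v", "attention.to_out.0",
    "feed_forward.w1", "feed_forward.w2", "feed_forward.w3",
)
pattern_dict = {
    r"noise_refiner\.\d+": ATTN_FF + ("adaLN_modulation.0",),
    r"context_refiner\.\d+": ATTN_FF,
    r"layers\.\d+": ATTN_FF + ("adaLN_modulation.0",),
    r"cap_embedder": ("1",),
}


def compressed_targets(module_names, patterns):
    """Return the dotted submodule paths selected by `patterns`.

    Assumption: a block is selected only when a regex key matches its
    entire name (re.fullmatch), so "layers.3.attention" is not a block.
    """
    targets = []
    for name in module_names:
        for pattern, suffixes in patterns.items():
            if re.fullmatch(pattern, name):
                targets.extend(f"{name}.{suffix}" for suffix in suffixes)
    return targets


# Hypothetical module names; a real list would come from model.named_modules().
example_names = [
    "noise_refiner.0", "context_refiner.1", "layers.29",
    "cap_embedder", "t_embedder", "layers.3.attention",
]
targets = compressed_targets(example_names, pattern_dict)
for path in targets:
    print(path)
```

Under these assumptions, the `context_refiner` blocks contribute no `adaLN_modulation.0` entry, and only the submodule at index `1` of `cap_embedder` is selected.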