mingyi456 committed
Commit e7306f6 · verified · 1 parent: 6694446

Update README.md

Files changed (1): README.md (+88 −1)
README.md CHANGED

pipeline_tag: image-text-to-image
library_name: diffusers
tags:
- diffusion-single-file
---
For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11

Feel free to request other models for compression (for the `diffusers` library, ComfyUI, or any other framework), although models with architectures that are unfamiliar to me may be more difficult.
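For context, here is my own summary of the idea behind DFloat11 (see the links above for the authoritative description): it losslessly re-encodes bfloat16 weights at roughly 11 bits per weight by entropy-coding the exponent bits. A minimal stdlib-only sketch of why there is slack to reclaim:

```python
import math
import random
from collections import Counter

# Toy illustration (my own sketch, not DFloat11 code): bfloat16 stores an
# 8-bit exponent per weight, but for typical near-Gaussian model weights
# most exponent values are rare, so the exponents' Shannon entropy is well
# below 8 bits. DFloat11 reclaims this slack by entropy-coding the
# exponents losslessly.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(100_000)]
exponents = [math.frexp(w)[1] for w in weights if w != 0.0]

n = len(exponents)
counts = Counter(exponents)
entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
print(f"exponent entropy: {entropy:.2f} bits (vs. 8 bits stored)")
```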

### How to Use

#### `diffusers`

```python
import torch
from diffusers import LongCatImageEditPipeline, LongCatImageTransformer2DModel
from dfloat11 import DFloat11Model
from PIL import Image

# For newer versions of `transformers`, you may need
# `from transformers.initialization import no_init_weights` instead.
from transformers.modeling_utils import no_init_weights

with no_init_weights():
    transformer = LongCatImageTransformer2DModel.from_config(
        LongCatImageTransformer2DModel.load_config(
            "meituan-longcat/LongCat-Image-Edit-Turbo", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16
    ).to(torch.bfloat16)

# Load the DFloat11-compressed transformer weights into the empty model
DFloat11Model.from_pretrained(
    "mingyi456/LongCat-Image-Edit-Turbo-DF11",
    device="cpu",
    bfloat16_model=transformer,
)

pipe = LongCatImageEditPipeline.from_pretrained(
    "meituan-longcat/LongCat-Image-Edit-Turbo",
    transformer=transformer,
    torch_dtype=torch.bfloat16
)

# Load the DFloat11-compressed text encoder weights
DFloat11Model.from_pretrained(
    "mingyi456/Qwen2.5-VL-7B-Instruct-DF11",
    device="cpu",
    bfloat16_model=pipe.text_encoder,
)

pipe.enable_model_cpu_offload()

img = Image.open('assets/test.png').convert('RGB')
prompt = '将猫变成狗'  # "Turn the cat into a dog"
image = pipe(
    img,
    prompt,
    negative_prompt='',
    guidance_scale=1.0,
    num_inference_steps=8,
    num_images_per_prompt=1,
    generator=torch.Generator("cpu").manual_seed(43)
).images[0]
image.save('image longcat-image-edit.png')
```

#### ComfyUI
Currently, this model is not natively supported in ComfyUI. Let me know if it gains native support, and I will look into supporting it.

### Compression details

This is the `pattern_dict` used for compression:

```python
pattern_dict = {
    r"transformer_blocks\.\d+": (
        "norm1.linear",
        "norm1_context.linear",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.to_out.0",
        "attn.add_q_proj",
        "attn.add_k_proj",
        "attn.add_v_proj",
        "attn.to_add_out",
        "ff.net.0.proj",
        "ff.net.2",
        "ff_context.net.0.proj",
        "ff_context.net.2",
    ),
    r"single_transformer_blocks\.\d+": (
        "norm.linear",
        "proj_mlp",
        "proj_out",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
    ),
}
```
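As a rough illustration (my own sketch, not DFloat11's actual matching code), the keys of `pattern_dict` are regular expressions over module paths and the values name the linear submodules whose weights are selected for compression:

```python
import re

# Abridged pattern_dict (same shape as above, fewer entries for brevity).
pattern_dict = {
    r"transformer_blocks\.\d+": ("attn.to_q", "attn.to_k", "ff.net.2"),
    r"single_transformer_blocks\.\d+": ("attn.to_q", "proj_out"),
}

def is_compressed(module_path: str) -> bool:
    """Sketch: does module_path name a weight selected by pattern_dict?"""
    for block_re, suffixes in pattern_dict.items():
        m = re.match(block_re, module_path)  # anchored at the start of the path
        if m and any(module_path == f"{m.group(0)}.{s}" for s in suffixes):
            return True
    return False

print(is_compressed("transformer_blocks.7.attn.to_q"))         # True
print(is_compressed("transformer_blocks.7.norm2"))             # False: not listed
print(is_compressed("single_transformer_blocks.12.proj_out"))  # True
```

Note that norm and projection layers not listed above are left in plain bfloat16, since the compression targets the large linear weight matrices.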