kelseye committed on
Commit
0f3b583
·
verified ·
1 Parent(s): 460b4d3
.gitattributes CHANGED
@@ -33,3 +33,24 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ assets/style/1/image_2.jpg filter=lfs diff=lfs merge=lfs -text
37
+ assets/style/2/0.jpg filter=lfs diff=lfs merge=lfs -text
38
+ assets/style/2/1.jpg filter=lfs diff=lfs merge=lfs -text
39
+ assets/style/2/2.jpg filter=lfs diff=lfs merge=lfs -text
40
+ assets/style/2/3.jpg filter=lfs diff=lfs merge=lfs -text
41
+ assets/style/2/4.jpg filter=lfs diff=lfs merge=lfs -text
42
+ assets/style/2/5.jpg filter=lfs diff=lfs merge=lfs -text
43
+ assets/style/3/image_0.jpg filter=lfs diff=lfs merge=lfs -text
44
+ assets/style/3/image_1.jpg filter=lfs diff=lfs merge=lfs -text
45
+ assets/style/4/0.jpg filter=lfs diff=lfs merge=lfs -text
46
+ assets/style/4/1.jpg filter=lfs diff=lfs merge=lfs -text
47
+ assets/style/4/2.jpg filter=lfs diff=lfs merge=lfs -text
48
+ assets/style/4/3.jpg filter=lfs diff=lfs merge=lfs -text
49
+ assets/style/4/4.jpg filter=lfs diff=lfs merge=lfs -text
50
+ assets/style/4/5.jpg filter=lfs diff=lfs merge=lfs -text
51
+ assets/style/4/image_0.jpg filter=lfs diff=lfs merge=lfs -text
52
+ assets/style/4/image_1.jpg filter=lfs diff=lfs merge=lfs -text
53
+ assets/style/4/image_2.jpg filter=lfs diff=lfs merge=lfs -text
54
+ assets/style/5/2.jpg filter=lfs diff=lfs merge=lfs -text
55
+ assets/style/5/image_0.jpg filter=lfs diff=lfs merge=lfs -text
56
+ assets/style/5/image_2.jpg filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,185 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+
3
+ frameworks:
4
+ - Pytorch
5
+ license: apache-2.0
6
+ tags: []
7
+ tasks:
8
+ - text-to-image-synthesis
9
+ base_model:
10
+ - Tongyi-MAI/Z-Image
11
+ base_model_relation: adapter
12
+ ---
13
+ ## Model Introduction
14
+
15
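+ The i2L (Image to LoRA) model is an architecture we designed around a wild idea: the input is an image, and the output is a LoRA model trained on that image. This model builds on our earlier Qwen-Image-i2L ([model](https://modelscope.cn/models/DiffSynth-Studio/Qwen-Image-i2L), [technical blog](https://modelscope.cn/learn/3343)), which we refined and ported to [Z-Image](https://modelscope.cn/models/Tongyi-MAI/Z-Image), with a focus on stronger style preservation.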
16
+
17
+ To ensure the quality of generated images, we recommend using the LoRA models produced by this model with the following settings:
18
+
19
+ * Use a negative prompt
20
+     * Chinese: `"泛黄,发绿,模糊,低分辨率,低质量图像,扭曲的肢体,诡异的外观,丑陋,AI感,噪点,网格感,JPEG压缩条纹,异常的肢体,水印,乱码,意义不明的字符"`
21
+     * English: `"Yellowed, green-tinted, blurry, low-resolution, low-quality image, distorted limbs, eerie appearance, ugly, AI-looking, noise, grid-like artifacts, JPEG compression artifacts, abnormal limbs, watermark, garbled text, meaningless characters"`
22
+ * `cfg_scale = 4`
23
+ * `sigma_shift = 8`
24
+ * Enable the LoRA only on the positive-prompt side and disable it on the negative-prompt side; this improves image quality
25
+
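The settings above can be illustrated with a minimal sketch. It assumes that `cfg_scale` is applied as standard classifier-free guidance and that `sigma_shift` is the common flow-matching timestep shift; both are assumptions about the pipeline's behavior, not a copy of DiffSynth-Studio's code. The function names `shift_sigma` and `cfg_combine` are hypothetical, and toy scalars stand in for the model's predictions.

```python
def shift_sigma(sigma: float, shift: float = 8.0) -> float:
    """Assumed flow-matching timestep shift: a larger `shift` pushes
    intermediate noise levels toward 1 (more steps at high noise)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def cfg_combine(pred_positive: float, pred_negative: float,
                cfg_scale: float = 4.0) -> float:
    """Standard classifier-free guidance: start from the negative-prompt
    prediction and extrapolate toward the positive-prompt prediction."""
    return pred_negative + cfg_scale * (pred_positive - pred_negative)


# Toy scalars standing in for model outputs (hypothetical values).
# pred_positive: positive prompt, computed WITH the LoRA enabled.
# pred_negative: negative prompt, computed WITHOUT the LoRA.
pred_positive, pred_negative = 1.0, 0.5

guided = cfg_combine(pred_positive, pred_negative)
mid_sigma = shift_sigma(0.5)
```

Under these assumptions, keeping the LoRA out of the negative branch means the guidance direction `pred_positive - pred_negative` also carries the style difference, so the style is amplified by `cfg_scale` rather than cancelled out.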
26
+ Online demo: https://modelscope.cn/studios/DiffSynth-Studio/Z-Image-i2L
27
+
28
+ ## Showcase
29
+
30
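+ The Z-Image-i2L model can quickly generate a style LoRA from just a few images that share a consistent style. Below are results we generated, all with random seed 0.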
31
+
32
+ ### Style 1: Watercolor Painting
33
+
34
+ Input images:
35
+
36
+ |![](./assets/style/1/0.jpg)|![](./assets/style/1/1.jpg)|![](./assets/style/1/2.jpg)|![](./assets/style/1/3.jpg)|
37
+ |-|-|-|-|
38
+
39
+ Generated images:
40
+
41
+ |a cat|a dog|a girl|
42
+ |-|-|-|
43
+ |![](./assets/style/1/image_0.jpg)|![](./assets/style/1/image_1.jpg)|![](./assets/style/1/image_2.jpg)|
44
+
45
+ ### Style 2: Realistic Detail
46
+
47
+ Input images:
48
+
49
+ |![](./assets/style/5/0.jpg)|![](./assets/style/5/1.jpg)|![](./assets/style/5/2.jpg)|![](./assets/style/5/3.jpg)|![](./assets/style/5/4.jpg)|
50
+ |-|-|-|-|-|
51
+
52
+ Generated images:
53
+
54
+ |a cat|a dog|a girl|
55
+ |-|-|-|
56
+ |![](./assets/style/5/image_0.jpg)|![](./assets/style/5/image_1.jpg)|![](./assets/style/5/image_2.jpg)|
57
+
58
+ ### Style 3: Colorful Blocks
59
+
60
+ Input images:
61
+
62
+ |![](./assets/style/2/0.jpg)|![](./assets/style/2/1.jpg)|![](./assets/style/2/2.jpg)|![](./assets/style/2/3.jpg)|![](./assets/style/2/4.jpg)|![](./assets/style/2/5.jpg)|
63
+ |-|-|-|-|-|-|
64
+
65
+ Generated images:
66
+
67
+ |a cat|a dog|a girl|
68
+ |-|-|-|
69
+ |![](./assets/style/2/image_0.jpg)|![](./assets/style/2/image_1.jpg)|![](./assets/style/2/image_2.jpg)|
70
+
71
+ ### Style 4: Flower Girl
72
+
73
+ Input images:
74
+
75
+ |![](./assets/style/3/0.jpg)|![](./assets/style/3/1.jpg)|![](./assets/style/3/2.jpg)|![](./assets/style/3/3.jpg)|
76
+ |-|-|-|-|
77
+
78
+ Generated images:
79
+
80
+ |a cat|a dog|a girl|
81
+ |-|-|-|
82
+ |![](./assets/style/3/image_0.jpg)|![](./assets/style/3/image_1.jpg)|![](./assets/style/3/image_2.jpg)|
83
+
84
+ ### Style 5: Black-and-White Minimalism
85
+
86
+ Input images:
87
+
88
+ |![](./assets/style/6/0.jpg)|![](./assets/style/6/1.jpg)|![](./assets/style/6/2.jpg)|![](./assets/style/6/3.jpg)|
89
+ |-|-|-|-|
90
+
91
+ Generated images:
92
+
93
+ |a cat|a dog|a girl|
94
+ |-|-|-|
95
+ |![](./assets/style/6/image_0.jpg)|![](./assets/style/6/image_1.jpg)|![](./assets/style/6/image_2.jpg)|
96
+
97
+ ### Style 6: Fantasy World
98
+
99
+ Input images:
100
+
101
+ |![](./assets/style/4/0.jpg)|![](./assets/style/4/1.jpg)|![](./assets/style/4/2.jpg)|![](./assets/style/4/3.jpg)|![](./assets/style/4/4.jpg)|![](./assets/style/4/5.jpg)|
102
+ |-|-|-|-|-|-|
103
+
104
+ Generated images:
105
+
106
+ |a cat|a dog|a girl|
107
+ |-|-|-|
108
+ |![](./assets/style/4/image_0.jpg)|![](./assets/style/4/image_1.jpg)|![](./assets/style/4/image_2.jpg)|
109
+
110
+ ## Inference Code
111
+
112
+ Install [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio):
113
+
114
+ ```shell
115
+ git clone https://github.com/modelscope/DiffSynth-Studio.git
116
+ cd DiffSynth-Studio
117
+ pip install -e .
118
+ ```
119
+
120
+ Model inference:
121
+
122
+ ```python
123
+ from diffsynth.pipelines.z_image import (
124
+     ZImagePipeline, ModelConfig,
124
+     ZImageUnit_Image2LoRAEncode, ZImageUnit_Image2LoRADecode
126
+ )
127
+ from modelscope import snapshot_download
128
+ from safetensors.torch import save_file
129
+ import torch
130
+ from PIL import Image
131
+
132
+ # Use `vram_config` to enable LoRA hot-loading
133
+ vram_config = {
134
+     "offload_dtype": torch.bfloat16,
134
+     "offload_device": "cuda",
135
+     "onload_dtype": torch.bfloat16,
136
+     "onload_device": "cuda",
137
+     "preparing_dtype": torch.bfloat16,
138
+     "preparing_device": "cuda",
139
+     "computation_dtype": torch.bfloat16,
140
+     "computation_device": "cuda",
142
+ }
143
+
144
+ # Load models
145
+ pipe = ZImagePipeline.from_pretrained(
146
+     torch_dtype=torch.bfloat16,
146
+     device="cuda",
147
+     model_configs=[
148
+         ModelConfig(model_id="Tongyi-MAI/Z-Image", origin_file_pattern="transformer/*.safetensors", **vram_config),
149
+         ModelConfig(model_id="Tongyi-MAI/Z-Image-Turbo", origin_file_pattern="text_encoder/*.safetensors"),
150
+         ModelConfig(model_id="Tongyi-MAI/Z-Image-Turbo", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
151
+         ModelConfig(model_id="DiffSynth-Studio/General-Image-Encoders", origin_file_pattern="SigLIP2-G384/model.safetensors"),
152
+         ModelConfig(model_id="DiffSynth-Studio/General-Image-Encoders", origin_file_pattern="DINOv3-7B/model.safetensors"),
153
+         ModelConfig(model_id="DiffSynth-Studio/Z-Image-i2L", origin_file_pattern="model.safetensors"),
154
+     ],
155
+     tokenizer_config=ModelConfig(model_id="Tongyi-MAI/Z-Image-Turbo", origin_file_pattern="tokenizer/"),
157
+ )
158
+
159
+ # Load images
160
+ snapshot_download(
161
+     model_id="DiffSynth-Studio/Z-Image-i2L",
162
+     allow_file_pattern="assets/style/*",
163
+     local_dir="data/Z-Image-i2L_style_input"
164
+ )
165
+ images = [Image.open(f"data/Z-Image-i2L_style_input/assets/style/1/{i}.jpg") for i in range(4)]
166
+
167
+ # Image to LoRA
168
+ with torch.no_grad():
169
+     embs = ZImageUnit_Image2LoRAEncode().process(pipe, image2lora_images=images)
170
+     lora = ZImageUnit_Image2LoRADecode().process(pipe, **embs)["lora"]
171
+ save_file(lora, "lora.safetensors")
172
+
173
+ # Generate images
174
+ prompt = "a cat"
175
+ negative_prompt = "泛黄,发绿,模糊,低分辨率,低质量图像,扭曲的肢体,诡异的外观,丑陋,AI感,噪点,网格感,JPEG压缩条纹,异常的肢体,水印,乱码,意义不明的字符"
176
+ image = pipe(
177
+     prompt=prompt,
178
+     negative_prompt=negative_prompt,
179
+     seed=0, cfg_scale=4, num_inference_steps=50,
180
+     positive_only_lora=lora,
181
+     sigma_shift=8
182
+ )
183
+ image.save("image.jpg")
184
+ ```
185
+
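The inference code above hard-codes four input images for style 1, while other style folders in this repository contain five or six. A minimal sketch of a helper that collects the inputs for any style folder follows. The function name `find_style_inputs` is hypothetical, and it assumes the naming used in `assets/style/`: inputs are `<number>.jpg` and generated results are `image_<number>.jpg`.

```python
from pathlib import Path


def find_style_inputs(style_dir):
    """Return the input images of one style folder, sorted numerically.

    Only files named `<number>.jpg` are treated as inputs; files such as
    `image_0.jpg` (generated results) are skipped.
    """
    paths = [p for p in Path(style_dir).glob("*.jpg") if p.stem.isdecimal()]
    return sorted(paths, key=lambda p: int(p.stem))
```

With this helper, the image-loading line could be written as `images = [Image.open(p) for p in find_style_inputs("data/Z-Image-i2L_style_input/assets/style/2")]`, leaving the rest of the script unchanged.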
assets/style/1/0.jpg ADDED
assets/style/1/1.jpg ADDED
assets/style/1/2.jpg ADDED
assets/style/1/3.jpg ADDED
assets/style/1/image_0.jpg ADDED
assets/style/1/image_1.jpg ADDED
assets/style/1/image_2.jpg ADDED

Git LFS Details

  • SHA256: 18f33692a014590a74c683696ef4dbeab96c9133e77c2801fe09a59b1ed8bba6
  • Pointer size: 131 Bytes
  • Size of remote file: 102 kB
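The pointer size reported for every LFS-tracked file here is 131 bytes, which follows from the Git LFS pointer format: a version line, an `oid sha256:` line, and a `size` line. The sketch below rebuilds such a pointer; the helper name `lfs_pointer` is hypothetical, and the exact byte size `102_000` is an assumed stand-in, since the page only shows the rounded value "102 kB".

```python
def lfs_pointer(sha256_hex: str, size_bytes: int) -> str:
    """Build the text of a Git LFS pointer file (spec v1)."""
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{sha256_hex}\n"
        f"size {size_bytes}\n"
    )


# SHA256 taken from the listing above; the exact size is an assumption.
sha = "18f33692a014590a74c683696ef4dbeab96c9133e77c2801fe09a59b1ed8bba6"
pointer = lfs_pointer(sha, 102_000)
pointer_len = len(pointer.encode("utf-8"))
```

The version line is 43 bytes and the oid line 76 bytes including newlines, so any file whose size has six decimal digits yields a 131-byte pointer, which matches all of the images listed in this commit.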
assets/style/2/0.jpg ADDED

Git LFS Details

  • SHA256: 5ecf797bb4ecd17a7b11b925be9db3f09db223dc6290ee79636929a7384cb1e9
  • Pointer size: 131 Bytes
  • Size of remote file: 156 kB
assets/style/2/1.jpg ADDED

Git LFS Details

  • SHA256: 72c8cbb1b21f6ab548c255fcf7a2fb09bffa6047a2fa3bab66abe3f801f0502e
  • Pointer size: 131 Bytes
  • Size of remote file: 148 kB
assets/style/2/2.jpg ADDED

Git LFS Details

  • SHA256: 9bca0e084d86c784a35d003492bbe036f57b06a2a18f1dd2f9065d1bc66095df
  • Pointer size: 131 Bytes
  • Size of remote file: 157 kB
assets/style/2/3.jpg ADDED

Git LFS Details

  • SHA256: 5103ade5444e8ac9fcf70807634de2d726854a2d052d6f7e71ca418d6d802c7d
  • Pointer size: 131 Bytes
  • Size of remote file: 190 kB
assets/style/2/4.jpg ADDED

Git LFS Details

  • SHA256: b02e1e55189e5116892e81b0f52a5d62ca4f890ff513d03fba9f006cf41cead2
  • Pointer size: 131 Bytes
  • Size of remote file: 123 kB
assets/style/2/5.jpg ADDED

Git LFS Details

  • SHA256: 46cf6803e623b034d91b5ba97cd697315d250588b15bb89b930b3fce0feb8acb
  • Pointer size: 131 Bytes
  • Size of remote file: 114 kB
assets/style/2/image_0.jpg ADDED
assets/style/2/image_1.jpg ADDED
assets/style/2/image_2.jpg ADDED
assets/style/3/0.jpg ADDED
assets/style/3/1.jpg ADDED
assets/style/3/2.jpg ADDED
assets/style/3/3.jpg ADDED
assets/style/3/image_0.jpg ADDED

Git LFS Details

  • SHA256: eb53287ae21109b7488d9a50f3de36a041ae9e4eaebc909e3c41a9ff38fceafa
  • Pointer size: 131 Bytes
  • Size of remote file: 116 kB
assets/style/3/image_1.jpg ADDED

Git LFS Details

  • SHA256: 442c13b67dddaf14c3b57c4ea1e7c1b1202a03d54eab94d2ba60f2c7e93c75c1
  • Pointer size: 131 Bytes
  • Size of remote file: 104 kB
assets/style/3/image_2.jpg ADDED
assets/style/4/0.jpg ADDED

Git LFS Details

  • SHA256: afd40d238a4f67bc1f6ce9a799b7fd60b2e78b90e7d1fed1dde1c3d96dd8b058
  • Pointer size: 131 Bytes
  • Size of remote file: 180 kB
assets/style/4/1.jpg ADDED

Git LFS Details

  • SHA256: bee3d192a461af8172a37c1c53e67cc08966f58bc1107884d1ed752a1bc968d0
  • Pointer size: 131 Bytes
  • Size of remote file: 195 kB
assets/style/4/2.jpg ADDED

Git LFS Details

  • SHA256: cec963bb4141e41984e09b793c81e009e67e1e0b466e4d28c57ea014a8f64f6c
  • Pointer size: 131 Bytes
  • Size of remote file: 164 kB
assets/style/4/3.jpg ADDED

Git LFS Details

  • SHA256: 72d18df9b4c868a03e690b3ceba453910e4da5843e3d12948b732edd6666451a
  • Pointer size: 131 Bytes
  • Size of remote file: 155 kB
assets/style/4/4.jpg ADDED

Git LFS Details

  • SHA256: 1bb9a0c4db88eddccb583897c7b2c15fa6f0e389aace6b9a1d56fdbcd7f2cf8a
  • Pointer size: 131 Bytes
  • Size of remote file: 226 kB
assets/style/4/5.jpg ADDED

Git LFS Details

  • SHA256: 782df31862cfbbdb9a6cb68c0a7ca2781a7a5eaa559096ae282ce1480dc91e90
  • Pointer size: 131 Bytes
  • Size of remote file: 200 kB
assets/style/4/image_0.jpg ADDED

Git LFS Details

  • SHA256: 870822f52bc42b4c15a4029e3e8407332c666b7a73dc3c79d2eee15c74a97f36
  • Pointer size: 131 Bytes
  • Size of remote file: 147 kB
assets/style/4/image_1.jpg ADDED

Git LFS Details

  • SHA256: 0f0af1cf0b673c42a415030dd8bfd73bfb04feda67a1c01ecccc33f24d59df9d
  • Pointer size: 131 Bytes
  • Size of remote file: 137 kB
assets/style/4/image_2.jpg ADDED

Git LFS Details

  • SHA256: ae9e3f17012dbd7eb92a4e8e3eb0800f206f6ddfdccca7939ed75d1ed228d798
  • Pointer size: 131 Bytes
  • Size of remote file: 152 kB
assets/style/5/0.jpg ADDED
assets/style/5/1.jpg ADDED
assets/style/5/2.jpg ADDED

Git LFS Details

  • SHA256: e18aee2d25063af1244c080849388371aa946ebb9ae012348df5447db94be6bb
  • Pointer size: 131 Bytes
  • Size of remote file: 141 kB
assets/style/5/3.jpg ADDED
assets/style/5/4.jpg ADDED
assets/style/5/image_0.jpg ADDED

Git LFS Details

  • SHA256: 28b5a4192550db3c71c0cda804e82e967772ae17852a6f16f7e665cad76bbc81
  • Pointer size: 131 Bytes
  • Size of remote file: 101 kB
assets/style/5/image_1.jpg ADDED
assets/style/5/image_2.jpg ADDED

Git LFS Details

  • SHA256: 940e1dc50224367cc9e1f1c36beadf679e2b0a1026264b9246c26f36c31ee1fb
  • Pointer size: 131 Bytes
  • Size of remote file: 118 kB
assets/style/6/0.jpg ADDED
assets/style/6/1.jpg ADDED
assets/style/6/2.jpg ADDED
assets/style/6/3.jpg ADDED
assets/style/6/image_0.jpg ADDED
assets/style/6/image_1.jpg ADDED
assets/style/6/image_2.jpg ADDED
configuration.json ADDED
@@ -0,0 +1 @@
1
+ {"framework":"Pytorch","task":"text-to-image-synthesis"}