Husr committed
Commit 93772d0 · 1 Parent(s): 5866927

Basic code
Files changed (5)
  1. .gitattributes +0 -34
  2. README.md +47 -12
  3. app.py +449 -4
  4. lora/.gitkeep +1 -0
  5. requirements.txt +8 -0
.gitattributes CHANGED
@@ -1,35 +1 @@
- *.7z filter=lfs diff=lfs merge=lfs -text
- *.arrow filter=lfs diff=lfs merge=lfs -text
- *.bin filter=lfs diff=lfs merge=lfs -text
- *.bz2 filter=lfs diff=lfs merge=lfs -text
- *.ckpt filter=lfs diff=lfs merge=lfs -text
- *.ftz filter=lfs diff=lfs merge=lfs -text
- *.gz filter=lfs diff=lfs merge=lfs -text
- *.h5 filter=lfs diff=lfs merge=lfs -text
- *.joblib filter=lfs diff=lfs merge=lfs -text
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
- *.model filter=lfs diff=lfs merge=lfs -text
- *.msgpack filter=lfs diff=lfs merge=lfs -text
- *.npy filter=lfs diff=lfs merge=lfs -text
- *.npz filter=lfs diff=lfs merge=lfs -text
- *.onnx filter=lfs diff=lfs merge=lfs -text
- *.ot filter=lfs diff=lfs merge=lfs -text
- *.parquet filter=lfs diff=lfs merge=lfs -text
- *.pb filter=lfs diff=lfs merge=lfs -text
- *.pickle filter=lfs diff=lfs merge=lfs -text
- *.pkl filter=lfs diff=lfs merge=lfs -text
- *.pt filter=lfs diff=lfs merge=lfs -text
- *.pth filter=lfs diff=lfs merge=lfs -text
- *.rar filter=lfs diff=lfs merge=lfs -text
  *.safetensors filter=lfs diff=lfs merge=lfs -text
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
- *.tar.* filter=lfs diff=lfs merge=lfs -text
- *.tar filter=lfs diff=lfs merge=lfs -text
- *.tflite filter=lfs diff=lfs merge=lfs -text
- *.tgz filter=lfs diff=lfs merge=lfs -text
- *.wasm filter=lfs diff=lfs merge=lfs -text
- *.xz filter=lfs diff=lfs merge=lfs -text
- *.zip filter=lfs diff=lfs merge=lfs -text
- *.zst filter=lfs diff=lfs merge=lfs -text
- *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,12 +1,47 @@
- ---
- title: Zig
- emoji: 🏃
- colorFrom: green
- colorTo: purple
- sdk: gradio
- sdk_version: 6.2.0
- app_file: app.py
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Z-Image Hugging Face Space (No SD Fallback)
+
+ A Gradio Space built on the official Z-Image pipeline (`Tongyi-MAI/Z-Image-Turbo`) with optional LoRA injection (a LoRA downloaded from Civitai). There is **no SD1.5 fallback**: if the Z-Image model is unavailable, the Space fails to load.
+
+ ## Files
+ - `app.py`: Z-Image pipeline, FlowMatch scheduler, LoRA toggle/strength, simple gallery UI.
+ - `requirements.txt`: Python dependencies for Spaces and local runs.
+ - `lora/`: Place `zit-mystic-xxx.safetensors` here (or point `LORA_PATH` to your filename).
+ - `.gitattributes`: Tracks `.safetensors` via Git LFS for large LoRA files.
+
+ ## Using on Hugging Face Spaces
+ 1) Create a Space (Python) and select a GPU hardware type.
+ 2) Add/clone this repo into the Space.
+ 3) Manually download the LoRA file from https://civitai.com/models/2206377/zit-mystic-xxx to `lora/zit-mystic-xxx.safetensors` (or set `LORA_PATH`). The Space does not fetch from Civitai over the network.
+ 4) (Recommended) Set `HF_TOKEN` in the Space secrets if the base model requires auth, or to speed up downloads.
+ 5) (Optional) Adjust the environment variables below; the Space then launches `app.py`. The header shows whether the LoRA was detected and loaded.
+
+ ## Environment variables
+ - `MODEL_PATH` (default `Tongyi-MAI/Z-Image-Turbo`): HF repo or local path for the Z-Image model.
+ - `LORA_PATH` (default `lora/zit-mystic-xxx.safetensors`): Path to the LoRA file; loaded if present.
+ - `HF_TOKEN`: HF token for gated/private models or faster pulls.
+ - `ENABLE_COMPILE` (default `false`): Enable `torch.compile` on the transformer.
+ - `ENABLE_WARMUP` (default `false`): Run a quick warmup across resolutions after load (adds startup time).
+ - `ATTENTION_BACKEND` (default `flash_3`): Backend for transformer attention.
+ - `OFFLOAD_TO_CPU_AFTER_RUN` (default `true`): Move the model back to CPU after each generation to play nicer with ZeroGPU.
+ - `ENABLE_AOTI` (default `false`): Try to load ZeroGPU AoTI blocks via `spaces.aoti_blocks_load` for faster inference.
+ - `AOTI_REPO` (default `zerogpu-aoti/Z-Image`): AoTI blocks repo.
+ - `AOTI_VARIANT` (default `fa3`): AoTI variant.
+
+ ## Run locally
+ ```bash
+ python -m venv .venv
+ .venv\Scripts\activate  # Windows; on Linux/macOS: source .venv/bin/activate
+ pip install -r requirements.txt
+ python app.py
+ ```
+ Place the LoRA file under `lora/` first (or set `LORA_PATH`); otherwise the app runs the base Z-Image model without LoRA.
+
+ ## UI controls
+ - Prompt
+ - Resolution category + explicit WxH selection
+ - Seed (with random toggle)
+ - Steps, time shift, max sequence length
+ - LoRA toggle + strength (enabled only if the file is found)
+
+ ## Git LFS note
+ `.gitattributes` tracks `.safetensors` with LFS. If you commit the LoRA, run `git lfs install` once before pushing so large files go through LFS.
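
For local runs, the documented variables can be exported before launching the app. A minimal sketch; the values shown are illustrative, and every variable is optional with the defaults listed above:

```shell
# Illustrative local configuration (all variables optional).
export MODEL_PATH="Tongyi-MAI/Z-Image-Turbo"            # HF repo id or local path
export LORA_PATH="lora/zit-mystic-xxx.safetensors"      # loaded only if the file exists
export ENABLE_COMPILE=false
export OFFLOAD_TO_CPU_AFTER_RUN=true
python app.py
```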
app.py CHANGED
@@ -1,7 +1,452 @@
-
- def greet(name):
-     return "Hello " + name + "!!"
-
- demo = gr.Interface(fn=greet, inputs="text", outputs="text")
- demo.launch()
+ import os
+ import random
+ import re
+ import threading
+ import warnings
+ from typing import List, Tuple
+
  import gradio as gr
+ import spaces
+ import torch
+ from diffusers import AutoencoderKL, FlowMatchEulerDiscreteScheduler, ZImagePipeline
+ from diffusers.models.transformers.transformer_z_image import ZImageTransformer2DModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ MODEL_PATH = os.environ.get("MODEL_PATH", "Tongyi-MAI/Z-Image-Turbo")
+ LORA_PATH = os.environ.get("LORA_PATH", os.path.join("lora", "zit-mystic-xxx.safetensors"))
+ HF_TOKEN = os.environ.get("HF_TOKEN")
+ ENABLE_COMPILE = os.environ.get("ENABLE_COMPILE", "false").lower() == "true"
+ ENABLE_WARMUP = os.environ.get("ENABLE_WARMUP", "false").lower() == "true"
+ ATTENTION_BACKEND = os.environ.get("ATTENTION_BACKEND", "flash_3")
+ OFFLOAD_TO_CPU_AFTER_RUN = os.environ.get("OFFLOAD_TO_CPU_AFTER_RUN", "true").lower() == "true"
+ ENABLE_AOTI = os.environ.get("ENABLE_AOTI", "false").lower() == "true"
+ AOTI_REPO = os.environ.get("AOTI_REPO", "zerogpu-aoti/Z-Image")
+ AOTI_VARIANT = os.environ.get("AOTI_VARIANT", "fa3")
+
+ warnings.filterwarnings("ignore")
+ os.environ["TOKENIZERS_PARALLELISM"] = "false"
+
+ RES_CHOICES = {
+     "1024": [
+         "1024x1024 ( 1:1 )",
+         "1152x896 ( 9:7 )",
+         "896x1152 ( 7:9 )",
+         "1152x864 ( 4:3 )",
+         "864x1152 ( 3:4 )",
+         "1248x832 ( 3:2 )",
+         "832x1248 ( 2:3 )",
+         "1280x720 ( 16:9 )",
+         "720x1280 ( 9:16 )",
+         "1344x576 ( 21:9 )",
+         "576x1344 ( 9:21 )",
+     ],
+     "1280": [
+         "1280x1280 ( 1:1 )",
+         "1440x1120 ( 9:7 )",
+         "1120x1440 ( 7:9 )",
+         "1472x1104 ( 4:3 )",
+         "1104x1472 ( 3:4 )",
+         "1536x1024 ( 3:2 )",
+         "1024x1536 ( 2:3 )",
+         "1536x864 ( 16:9 )",
+         "864x1536 ( 9:16 )",
+         "1680x720 ( 21:9 )",
+         "720x1680 ( 9:21 )",
+     ],
+     "1536": [
+         "1536x1536 ( 1:1 )",
+         "1728x1344 ( 9:7 )",
+         "1344x1728 ( 7:9 )",
+         "1728x1296 ( 4:3 )",
+         "1296x1728 ( 3:4 )",
+         "1872x1248 ( 3:2 )",
+         "1248x1872 ( 2:3 )",
+         "2048x1152 ( 16:9 )",
+         "1152x2048 ( 9:16 )",
+         "2016x864 ( 21:9 )",
+         "864x2016 ( 9:21 )",
+     ],
+ }
+
+ RESOLUTION_SET: List[str] = []
+ for resolutions in RES_CHOICES.values():
+     RESOLUTION_SET.extend(resolutions)
+
+ EXAMPLE_PROMPTS = [
+     ["一位男士和他的贵宾犬穿着配套的服装参加狗狗秀,室内灯光,背景中有观众。"],
+     [
+         "极具氛围感的暗调人像,一位优雅的中国美女在黑暗的房间里。一束强光通过遮光板,在她的脸上投射出一个清晰的闪电形状的光影,正好照亮一只眼睛。高对比度,明暗交界清晰,神秘感,莱卡相机色调。"
+     ],
+     [
+         "一张中景手机自拍照片拍摄了一位留着长黑发的年轻东亚女子在灯光明亮的电梯内对着镜子自拍。她穿着一件带有白色花朵图案的黑色露肩短上衣和深色牛仔裤。她的头微微倾斜,嘴唇嘟起做亲吻状,非常可爱俏皮。她右手拿着一部深灰色智能手机,遮住了部分脸,后置摄像头镜头对着镜子"
+     ],
+     [
+         "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."
+     ],
+     [
+         '''A vertical digital illustration depicting a serene and majestic Chinese landscape, rendered in a style reminiscent of traditional Shanshui painting but with a modern, clean aesthetic. The scene is dominated by towering, steep cliffs in various shades of blue and teal, which frame a central valley. In the distance, layers of mountains fade into a light blue and white mist, creating a strong sense of atmospheric perspective and depth. A calm, turquoise river flows through the center of the composition, with a small, traditional Chinese boat, possibly a sampan, navigating its waters. The boat has a bright yellow canopy and a red hull, and it leaves a gentle wake behind it. It carries several indistinct figures of people. Sparse vegetation, including green trees and some bare-branched trees, clings to the rocky ledges and peaks. The overall lighting is soft and diffused, casting a tranquil glow over the entire scene. Centered in the image is overlaid text. At the top of the text block is a small, red, circular seal-like logo containing stylized characters. Below it, in a smaller, black, sans-serif font, are the words "Zao-Xiang * East Beauty & West Fashion * Z-Image". Directly beneath this, in a larger, elegant black serif font, is the word "SHOW & SHARE CREATIVITY WITH THE WORLD". Among them, there are "SHOW & SHARE", "CREATIVITY", and "WITH THE WORLD"'''
+     ],
+     [
+         """一张虚构的英语电影《回忆之味》(The Taste of Memory)的电影海报。场景设置在一个质朴的19世纪风格厨房里。画面中央,一位红棕色头发、留着小胡子的中年男子(演员阿瑟·彭哈利根饰)站在一张木桌后,他身穿白色衬衫、黑色马甲和米色围裙,正看着一位女士,手中拿着一大块生红肉,下方是一个木制切菜板。在他的右边,一位梳着高髻的黑发女子(演员埃莉诺·万斯饰)倚靠在桌子上,温柔地对他微笑。她穿着浅色衬衫和一条上白下蓝的长裙。桌上除了放有切碎的葱和卷心菜丝的切菜板外,还有一个白色陶瓷盘、新鲜香草,左侧一个木箱上放着一串深色葡萄。背景是一面粗糙的灰白色抹灰墙,墙上挂着一幅风景画。最右边的一个台面上放着一盏复古油灯。海报上有大量的文字信息。左上角是白色的无衬线字体"ARTISAN FILMS PRESENTS",其下方是"ELEANOR VANCE"和"ACADEMY AWARD® WINNER"。右上角写着"ARTHUR PENHALIGON"和"GOLDEN GLOBE® AWARD WINNER"。顶部中央是圣丹斯电影节的桂冠标志,下方写着"SUNDANCE FILM FESTIVAL GRAND JURY PRIZE 2024"。主标题"THE TASTE OF MEMORY"以白色的大号衬线字体醒目地显示在下半部分。标题下方注明了"A FILM BY Tongyi Interaction Lab"。底部区域用白色小字列出了完整的演职员名单,包括"SCREENPLAY BY ANNA REID"、"CULINARY DIRECTION BY JAMES CARTER"以及Artisan Films、Riverstone Pictures和Heritage Media等众多出品公司标志。整体风格是写实主义,采用温暖柔和的灯光方案,营造出一种亲密的氛围。色调以棕色、米色和柔和的绿色等大地色系为主。两位演员的身体都在腰部被截断。"""
+     ],
+     [
+         """一张方形构图的特写照片,主体是一片巨大的、鲜绿色的植物叶片,并叠加了文字,使其具有海报或杂志封面的外观。主要拍摄对象是一片厚实、有蜡质感的叶子,从左下角到右上角呈对角线弯曲穿过画面。其表面反光性很强,捕捉到一个明亮的直射光源,形成了一道突出的高光,亮面下显露出平行的精细叶脉。背景由其他深绿色的叶子组成,这些叶子轻微失焦,营造出浅景深效果,突出了前景的主叶片。整体风格是写实摄影,明亮的叶片与黑暗的阴影背景之间形成高对比度。图像上有多处渲染文字。左上角是白色的衬线字体文字"PIXEL-PEEPERS GUILD Presents"。右上角同样是白色衬线字体的文字"[Instant Noodle] 泡面调料包"。左侧垂直排列着标题"Render Distance: Max",为白色衬线字体。左下角是五个硕大的白色宋体汉字"显卡在...燃烧"。右下角是较小的白色衬线字体文字"Leica Glow™ Unobtanium X-1",其正上方是用白色宋体字书写的名字"蔡几"。识别出的核心实体包括品牌像素偷窥者协会、其产品线泡面调料包、相机型号买不到™ X-1以及摄影师名字造相。"""
+     ],
+ ]
+
+ pipe: ZImagePipeline | None = None
+ lora_loaded: bool = False
+ lora_error: str | None = None
+ pipe_lock = threading.Lock()
+ pipe_on_gpu: bool = False
+ aoti_loaded: bool = False
+
+
+ def parse_resolution(resolution: str) -> Tuple[int, int]:
+     match = re.search(r"(\d+)\s*[×x]\s*(\d+)", resolution)
+     if match:
+         return int(match.group(1)), int(match.group(2))
+     return 1024, 1024
+
+
+ def attach_lora(pipeline: ZImagePipeline) -> Tuple[bool, str | None]:
+     if not LORA_PATH or not os.path.isfile(LORA_PATH):
+         return False, "LoRA file not found"
+     try:
+         folder, weight_name = os.path.split(LORA_PATH)
+         folder = folder or "."
+         pipeline.load_lora_weights(folder, weight_name=weight_name)
+         set_lora_scale(pipeline, 1.0)
+         return True, None
+     except Exception as exc:  # noqa: BLE001
+         return False, f"Failed to load LoRA: {exc}"
+
+
+ def set_lora_scale(pipeline: ZImagePipeline, scale: float) -> None:
+     weight = max(float(scale), 0.0)
+     try:
+         pipeline.set_adapters(["default"], adapter_weights=[weight])
+     except TypeError:
+         pipeline.set_adapters(["default"], weights=[weight])
+
+
+ def load_models() -> Tuple[ZImagePipeline, bool, str | None]:
+     global pipe, lora_loaded, lora_error
+     if pipe is not None:
+         return pipe, lora_loaded, lora_error
+
+     use_auth_token = HF_TOKEN if HF_TOKEN else True
+     print(f"Loading Z-Image from {MODEL_PATH}...")
+
+     if not os.path.exists(MODEL_PATH):
+         vae = AutoencoderKL.from_pretrained(
+             MODEL_PATH,
+             subfolder="vae",
+             torch_dtype=torch.bfloat16,
+             use_auth_token=use_auth_token,
+         )
+         text_encoder = AutoModelForCausalLM.from_pretrained(
+             MODEL_PATH,
+             subfolder="text_encoder",
+             torch_dtype=torch.bfloat16,
+             use_auth_token=use_auth_token,
+         ).eval()
+         tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, subfolder="tokenizer", use_auth_token=use_auth_token)
+     else:
+         vae = AutoencoderKL.from_pretrained(os.path.join(MODEL_PATH, "vae"), torch_dtype=torch.bfloat16)
+         text_encoder = AutoModelForCausalLM.from_pretrained(
+             os.path.join(MODEL_PATH, "text_encoder"),
+             torch_dtype=torch.bfloat16,
+         ).eval()
+         tokenizer = AutoTokenizer.from_pretrained(os.path.join(MODEL_PATH, "tokenizer"))
+
+     tokenizer.padding_side = "left"
+
+     pipe = ZImagePipeline(scheduler=None, vae=vae, text_encoder=text_encoder, tokenizer=tokenizer, transformer=None)
+
+     if not os.path.exists(MODEL_PATH):
+         transformer = ZImageTransformer2DModel.from_pretrained(
+             MODEL_PATH,
+             subfolder="transformer",
+             use_auth_token=use_auth_token,
+             torch_dtype=torch.bfloat16,
+         )
+     else:
+         transformer = ZImageTransformer2DModel.from_pretrained(
+             os.path.join(MODEL_PATH, "transformer"),
+             torch_dtype=torch.bfloat16,
+         )
+
+     transformer.set_attention_backend(ATTENTION_BACKEND)
+
+     pipe.transformer = transformer
+
+     lora_loaded, lora_error = attach_lora(pipe)
+     if lora_error:
+         print(lora_error)
+     else:
+         print(f"LoRA loaded: {lora_loaded} ({LORA_PATH})")
+
+     return pipe, lora_loaded, lora_error
+
+
+ def ensure_models_loaded() -> Tuple[ZImagePipeline, bool, str | None]:
+     global pipe
+     if pipe is not None:
+         return pipe, lora_loaded, lora_error
+     with pipe_lock:
+         if pipe is not None:
+             return pipe, lora_loaded, lora_error
+         return load_models()
+
+
+ def ensure_on_gpu() -> None:
+     global pipe_on_gpu, aoti_loaded
+     if pipe is None:
+         raise gr.Error("Model not loaded.")
+     if not torch.cuda.is_available():
+         raise gr.Error("CUDA is not available. This Space requires a GPU.")
+     if pipe_on_gpu:
+         return
+
+     print("Moving model to GPU...")
+     pipe.to("cuda", torch.bfloat16)
+     pipe_on_gpu = True
+
+     if ENABLE_COMPILE:
+         print("Compiling transformer (torch.compile)...")
+         pipe.transformer = torch.compile(pipe.transformer, mode="max-autotune-no-cudagraphs", fullgraph=False)
+
+     if ENABLE_AOTI and not aoti_loaded:
+         try:
+             pipe.transformer.layers._repeated_blocks = ["ZImageTransformerBlock"]
+             spaces.aoti_blocks_load(pipe.transformer.layers, AOTI_REPO, variant=AOTI_VARIANT)
+             aoti_loaded = True
+             print(f"AoTI loaded: {AOTI_REPO} (variant={AOTI_VARIANT})")
+         except Exception as exc:  # noqa: BLE001
+             print(f"AoTI load failed (continuing without AoTI): {exc}")
+
+
+ def offload_to_cpu() -> None:
+     global pipe_on_gpu
+     if pipe is None:
+         return
+     if not pipe_on_gpu:
+         return
+     print("Offloading model to CPU...")
+     pipe.to("cpu")
+     pipe_on_gpu = False
+     if torch.cuda.is_available():
+         torch.cuda.empty_cache()
+
+
+ def set_scheduler(pipeline: ZImagePipeline, shift: float) -> None:
+     scheduler = FlowMatchEulerDiscreteScheduler(num_train_timesteps=1000, shift=shift)
+     pipeline.scheduler = scheduler
+
+
+ def generate_image(
+     pipeline: ZImagePipeline,
+     prompt: str,
+     resolution: str,
+     seed: int,
+     steps: int,
+     shift: float,
+     guidance_scale: float,
+     max_sequence_length: int,
+     use_lora: bool,
+     lora_scale: float,
+ ) -> Tuple[torch.Tensor, int]:
+     width, height = parse_resolution(resolution)
+     generator = torch.Generator("cuda").manual_seed(seed)
+     set_scheduler(pipeline, shift)
+
+     if lora_loaded:
+         if use_lora:
+             set_lora_scale(pipeline, lora_scale)
+         else:
+             set_lora_scale(pipeline, 0.0)
+
+     with torch.inference_mode():
+         image = pipeline(
+             prompt=prompt,
+             height=height,
+             width=width,
+             guidance_scale=guidance_scale,
+             num_inference_steps=steps,
+             generator=generator,
+             max_sequence_length=max_sequence_length,
+         ).images[0]
+     return image, seed
+
+
+ def warmup_model(pipeline: ZImagePipeline, resolutions: List[str]) -> None:
+     print("Warmup started...")
+     dummy_prompt = "warmup"
+     for res_str in resolutions:
+         try:
+             generate_image(
+                 pipeline,
+                 prompt=dummy_prompt,
+                 resolution=res_str,
+                 seed=42,
+                 steps=6,
+                 shift=3.0,
+                 guidance_scale=0.0,
+                 max_sequence_length=512,
+                 use_lora=False,
+                 lora_scale=0.0,
+             )
+         except Exception as exc:  # noqa: BLE001
+             print(f"Warmup failed for {res_str}: {exc}")
+     print("Warmup done.")
+
+
+ def init_app() -> None:
+     ensure_models_loaded()
+     if ENABLE_WARMUP and pipe is not None:
+         ensure_on_gpu()
+         try:
+             all_resolutions: List[str] = []
+             for cat in RES_CHOICES.values():
+                 all_resolutions.extend(cat)
+             warmup_model(pipe, all_resolutions)
+         finally:
+             if OFFLOAD_TO_CPU_AFTER_RUN:
+                 offload_to_cpu()
+
+
+ @spaces.GPU
+ def generate(
+     prompt: str,
+     resolution: str = "1024x1024 ( 1:1 )",
+     seed: int = 42,
+     steps: int = 9,
+     shift: float = 3.0,
+     random_seed: bool = True,
+     use_lora: bool = True,
+     lora_scale: float = 1.0,
+     max_sequence_length: int = 512,
+     gallery_images=None,
+     progress=gr.Progress(track_tqdm=True),
+ ):
+     ensure_models_loaded()
+     ensure_on_gpu()
+
+     new_seed = random.randint(1, 1_000_000) if random_seed or seed == -1 else int(seed)
+
+     try:
+         image = generate_image(
+             pipeline=pipe,
+             prompt=prompt,
+             resolution=resolution.split(" ")[0] if " " in resolution else resolution,
+             seed=new_seed,
+             steps=int(steps),
+             shift=float(shift),
+             guidance_scale=0.0,
+             max_sequence_length=int(max_sequence_length),
+             use_lora=use_lora,
+             lora_scale=float(lora_scale),
+         )[0]
+     finally:
+         if OFFLOAD_TO_CPU_AFTER_RUN:
+             offload_to_cpu()
+
+     if gallery_images is None:
+         gallery_images = []
+     gallery_images = [image] + gallery_images
+     return gallery_images, str(new_seed), int(new_seed)
+
+
+ init_app()
+
+ # Custom CSS must be passed to gr.Blocks; demo.launch() does not accept it.
+ css = """
+ .fillable{max-width: 1230px !important}
+ """
+
+ with gr.Blocks(title="Z-Image + LoRA", css=css) as demo:
+     pipe_status = "loaded (CPU)" if pipe else "not loaded"
+     lora_file_status = "found" if os.path.isfile(LORA_PATH) else "missing"
+     lora_status = f"LoRA file: {LORA_PATH} ({lora_file_status})"
+
+     gr.Markdown(
+         f"""<div align="center">
+
+ # Z-Image Generation (No SD fallback)
+
+ Model: `{MODEL_PATH}` | {pipe_status}
+ {lora_status}
+
+ </div>"""
+     )
+
+     with gr.Row():
+         with gr.Column(scale=1):
+             prompt_input = gr.Textbox(label="Prompt", lines=3, placeholder="请输入提示词")
+
+             with gr.Row():
+                 choices = [int(k) for k in RES_CHOICES.keys()]
+                 res_cat = gr.Dropdown(value=1024, choices=choices, label="Resolution Category")
+                 resolution = gr.Dropdown(
+                     value=RES_CHOICES["1024"][0],
+                     choices=RESOLUTION_SET,
+                     label="Width x Height (Ratio)",
+                 )
+
+             with gr.Row():
+                 seed = gr.Number(label="Seed", value=42, precision=0)
+                 random_seed = gr.Checkbox(label="Random Seed", value=True)
+
+             with gr.Row():
+                 steps = gr.Slider(label="Steps", minimum=1, maximum=100, value=9, step=1)
+                 shift = gr.Slider(label="Time Shift", minimum=1.0, maximum=10.0, value=3.0, step=0.1)
+
+             with gr.Row():
+                 max_seq = gr.Slider(label="Max Sequence Length", minimum=256, maximum=1024, value=512, step=16)
+
+             with gr.Row():
+                 use_lora = gr.Checkbox(label="Use LoRA", value=True, interactive=lora_loaded)
+                 lora_strength = gr.Slider(
+                     label="LoRA Strength",
+                     minimum=0.0,
+                     maximum=1.5,
+                     value=1.0,
+                     step=0.05,
+                     interactive=lora_loaded,
+                 )
+
+             generate_btn = gr.Button("Generate", variant="primary")
+
+             gr.Markdown("### 示例提示词")
+             gr.Examples(examples=EXAMPLE_PROMPTS, inputs=prompt_input, label=None)
+
+         with gr.Column(scale=1):
+             output_gallery = gr.Gallery(
+                 label="Generated Images",
+                 columns=2,
+                 rows=2,
+                 height=600,
+                 object_fit="contain",
+                 format="png",
+                 interactive=False,
+             )
+             used_seed = gr.Textbox(label="Seed Used", interactive=False)
+
+     def update_res_choices(_res_cat):
+         if str(_res_cat) in RES_CHOICES:
+             res_choices = RES_CHOICES[str(_res_cat)]
+         else:
+             res_choices = RES_CHOICES["1024"]
+         return gr.update(value=res_choices[0], choices=res_choices)
+
+     res_cat.change(update_res_choices, inputs=res_cat, outputs=resolution, api_visibility="private")
+
+     generate_btn.click(
+         generate,
+         inputs=[prompt_input, resolution, seed, steps, shift, random_seed, use_lora, lora_strength, max_seq, output_gallery],
+         outputs=[output_gallery, used_seed, seed],
+         api_visibility="public",
+     )
+
+ if __name__ == "__main__":
+     demo.launch()
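
As a quick sanity check outside the Space, the resolution parser used by `app.py` can be exercised standalone; the function below is copied from the diff, with the print result verified by hand:

```python
import re
from typing import Tuple

def parse_resolution(resolution: str) -> Tuple[int, int]:
    # Mirrors app.py: accepts "1024x1024 ( 1:1 )" or "864×1152" (x or ×),
    # falling back to 1024x1024 when no WxH pair is found.
    match = re.search(r"(\d+)\s*[×x]\s*(\d+)", resolution)
    if match:
        return int(match.group(1)), int(match.group(2))
    return 1024, 1024

print(parse_resolution("1280x720 ( 16:9 )"))  # (1280, 720)
```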
lora/.gitkeep ADDED
@@ -0,0 +1 @@
+ 
requirements.txt ADDED
@@ -0,0 +1,8 @@
+ accelerate>=0.30.0
+ diffusers>=0.32.0
+ gradio>=4.44.0
+ Pillow>=10.0.0
+ safetensors>=0.4.2
+ spaces>=0.27.0
+ torch>=2.1.0
+ transformers>=4.41.0