---
license: apache-2.0
language:
- en
- zh
base_model:
- Tongyi-MAI/Z-Image
base_model_relation: quantized
pipeline_tag: text-to-image
library_name: diffusers
tags:
- diffusion-single-file
---
|
|
This is a losslessly compressed version of [Tongyi-MAI/Z-Image](https://huggingface.co/Tongyi-MAI/Z-Image): DFloat11 re-encodes the model's BFloat16 weights to roughly 70% of their original size while producing bit-identical outputs. For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11.
|
|
|
|
|
Feel free to request compression of other models as well (whether for the `diffusers` library, ComfyUI, or anything else), although models with architectures that are unfamiliar to me may be more difficult.
|
|
|
|
|
### How to Use |
|
|
|
|
|
#### `diffusers` |
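The example below assumes a CUDA-capable GPU and that `torch`, `diffusers`, and the `dfloat11` package are installed; see the DFloat11 GitHub repository linked above for installation instructions.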
|
|
|
|
|
```python
import torch
from diffusers import ZImagePipeline, ZImageTransformer2DModel
from dfloat11 import DFloat11Model
from transformers.modeling_utils import no_init_weights

# Load the DFloat11-compressed Qwen3-4B text encoder onto the CPU.
text_encoder = DFloat11Model.from_pretrained("DFloat11/Qwen3-4B-DF11", device="cpu")

# Build the transformer from its config without initializing weights;
# the compressed weights are injected by DFloat11Model below.
with no_init_weights():
    transformer = ZImageTransformer2DModel.from_config(
        ZImageTransformer2DModel.load_config(
            "Tongyi-MAI/Z-Image", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16,
    ).to(torch.bfloat16)

# Inject the DFloat11-compressed weights into the transformer.
DFloat11Model.from_pretrained("mingyi456/Z-Image-DF11", device="cpu", bfloat16_model=transformer)

# Assemble the pipeline around the compressed text encoder and transformer.
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image",
    text_encoder=text_encoder,
    transformer=transformer,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,  # load the remaining components eagerly
)
pipe.to("cuda")

# Example prompt in Chinese (Z-Image accepts both English and Chinese prompts).
prompt = "两名年轻亚裔女性紧密站在一起,背景为朴素的灰色纹理墙面,可能是室内地毯地面。左侧女性留着长卷发,身穿藏青色毛衣,左袖有奶油色褶皱装饰,内搭白色立领衬衫,下身白色裤子;佩戴小巧金色耳钉,双臂交叉于背后。右侧女性留直肩长发,身穿奶油色卫衣,胸前印有“Tun the tables”字样,下方为“New ideas”,搭配白色裤子;佩戴银色小环耳环,双臂交叉于胸前。两人均面带微笑直视镜头。照片,自然光照明,柔和阴影,以藏青、奶油白为主的中性色调,休闲时尚摄影,中等景深,面部和上半身对焦清晰,姿态放松,表情友好,室内环境,地毯地面,纯色背景。"
negative_prompt = ""  # Optional, but useful for suppressing unwanted content

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1280,
    width=720,
    cfg_normalization=False,
    num_inference_steps=50,
    guidance_scale=4,
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

image.save("example.png")
```
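As an optional sanity check (standard PyTorch, not part of the DFloat11 API), you can print the peak GPU memory allocated during generation to verify the savings from the compressed weights:

```python
# Report peak GPU memory allocated on the current device during generation.
peak_gib = torch.cuda.max_memory_allocated() / 1024**3
print(f"Peak GPU memory: {peak_gib:.2f} GiB")
```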
|
|
|
|
|
#### ComfyUI |
|
|
Refer to this [model](https://huggingface.co/mingyi456/Z-Image-DF11-ComfyUI) instead. |
|
|
|
|
|
### Compression details |
|
|
|
|
|
This is the `pattern_dict` used for compression:
|
|
|
|
|
```python
pattern_dict = {
    r"noise_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaLN_modulation.0",
    ),
    r"context_refiner\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
    ),
    r"layers\.\d+": (
        "attention.to_q",
        "attention.to_k",
        "attention.to_v",
        "attention.to_out.0",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
    ),
    r"cap_embedder": (
        "1",
    ),
}
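As a rough illustration of how these patterns relate to module names (this is an assumption about the matching semantics, not the actual DFloat11 internals): each regex key selects a family of transformer blocks, and the tuple lists the linear submodules inside each matched block whose BFloat16 weights get losslessly compressed. Below is a minimal sketch using only the standard library and the `pattern_dict` above; `is_compressed` is a hypothetical helper, not part of the DFloat11 API:

```python
import re

# Hypothetical helper, for illustration only: decide whether a submodule path
# such as "layers.11.attention.to_q" is covered by pattern_dict above.
def is_compressed(module_path: str, pattern_dict: dict) -> bool:
    for block_pattern, leaf_names in pattern_dict.items():
        match = re.match(block_pattern, module_path)
        if match is None:
            continue
        # Strip the matched block prefix, then check the remaining leaf name.
        rest = module_path[match.end():].lstrip(".")
        if rest in leaf_names:
            return True
    return False

print(is_compressed("layers.11.attention.to_q", pattern_dict))   # True
print(is_compressed("layers.11.attention.norm_q", pattern_dict)) # False: left uncompressed
```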
|
|
|