	---
	license: apache-2.0
	language:
	- en
	- zh
	base_model:
	- meituan-longcat/LongCat-Image
	base_model_relation: quantized
	pipeline_tag: text-to-image
	library_name: diffusers
	tags:
	- diffusion-single-file
	---
	For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11

	Feel free to request compression of other models as well (for the `diffusers` library, ComfyUI, or anything else), although models with architectures I am unfamiliar with may be more difficult.

	### How to Use

	#### `diffusers`

	```python
	import torch
	from dfloat11 import DFloat11Model
	from diffusers import LongCatImagePipeline, LongCatImageTransformer2DModel
	from transformers.modeling_utils import no_init_weights

	# Build the transformer from its config only, skipping weight initialization;
	# the actual weights come from the DF11 checkpoint below.
	with no_init_weights():
	    transformer = LongCatImageTransformer2DModel.from_config(
	        LongCatImageTransformer2DModel.load_config(
	            "meituan-longcat/LongCat-Image", subfolder="transformer"
	        ),
	        torch_dtype=torch.bfloat16,
	    ).to(torch.bfloat16)

	# Load the DF11-compressed transformer weights into the empty model.
	DFloat11Model.from_pretrained(
	    "mingyi456/LongCat-Image-DF11",
	    device="cpu",
	    bfloat16_model=transformer,
	)

	pipe = LongCatImagePipeline.from_pretrained(
	    "meituan-longcat/LongCat-Image",
	    transformer=transformer,
	    torch_dtype=torch.bfloat16,
	)

	# Apply the DF11-compressed weights to the pipeline's text encoder as well.
	DFloat11Model.from_pretrained(
	    "mingyi456/Qwen2.5-VL-7B-Instruct-DF11",
	    device="cpu",
	    bfloat16_model=pipe.text_encoder,
	)

	pipe.enable_model_cpu_offload()

	# Chinese prompt: a young Asian woman in a yellow knit sweater and white necklace,
	# hands on her knees, serene expression, rough brick wall, warm afternoon sunlight.
	prompt = '一个年轻的亚裔女性，身穿黄色针织衫，搭配白色项链。她的双手放在膝盖上，表情恬静。背景是一堵粗糙的砖墙，午后的阳光温暖地洒在她身上，营造出一种宁静而温馨的氛围。镜头采用中距离视角，突出她的神态和服饰的细节。光线柔和地打在她的脸上，强调她的五官和饰品的质感，增加画面的层次感与亲和力。整个画面构图简洁，砖墙的纹理与阳光的光影效果相得益彰，突显出人物的优雅与从容。'

	image = pipe(
	    prompt,
	    height=768,
	    width=1344,
	    guidance_scale=4.0,
	    num_inference_steps=50,
	    num_images_per_prompt=1,
	    generator=torch.Generator("cpu").manual_seed(43),
	    enable_cfg_renorm=True,
	    enable_prompt_rewrite=True,
	).images[0]
	image.save('image longcat-image.png')
	```

	#### ComfyUI
	This model is not currently supported natively in ComfyUI. Do let me know if it gains native support, and I will work on supporting it.

	### Compression details

	This is the `pattern_dict` for compression:

	```python
	pattern_dict = {
	    r"transformer_blocks\.\d+": (
	        "norm1.linear",
	        "norm1_context.linear",
	        "attn.to_q",
	        "attn.to_k",
	        "attn.to_v",
	        "attn.to_out.0",
	        "attn.add_q_proj",
	        "attn.add_k_proj",
	        "attn.add_v_proj",
	        "attn.to_add_out",
	        "ff.net.0.proj",
	        "ff.net.2",
	        "ff_context.net.0.proj",
	        "ff_context.net.2",
	    ),
	    r"single_transformer_blocks\.\d+": (
	        "norm.linear",
	        "proj_mlp",
	        "proj_out",
	        "attn.to_q",
	        "attn.to_k",
	        "attn.to_v",
	    ),
	}
	```
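
To get a feel for which layers a `pattern_dict` like the one above selects, here is a minimal sketch. It assumes each key is a regex that must match a module's full name (via `re.fullmatch`) and that each tuple entry is a submodule path relative to the matched module. This is an illustration of that reading, not the DFloat11 library's actual code; the helper `compressed_layers` and the sample module names are hypothetical.

```python
import re

# Abbreviated copy of the pattern_dict (only a few entries, for illustration).
pattern_dict = {
    r"transformer_blocks\.\d+": ("attn.to_q", "ff.net.0.proj"),
    r"single_transformer_blocks\.\d+": ("proj_mlp", "attn.to_q"),
}


def compressed_layers(module_names, pattern_dict):
    """Return the full names of layers selected under the assumed matching rule."""
    selected = []
    for name in module_names:
        for pattern, sub_names in pattern_dict.items():
            # Assumption: the whole module name must match the regex, so
            # "single_transformer_blocks.12" does not match the first key.
            if re.fullmatch(pattern, name):
                selected.extend(f"{name}.{sub}" for sub in sub_names)
    return selected


# Hypothetical module names, shaped like those of a diffusers transformer.
names = ["transformer_blocks.0", "single_transformer_blocks.12", "x_embedder"]
layers = compressed_layers(names, pattern_dict)
print(layers)
```

Under this reading, modules that match no key (such as `x_embedder` in the sample) are left out of the selection.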