---
license: apache-2.0
language:
- en
- zh
base_model:
- meituan-longcat/LongCat-Image
base_model_relation: quantized
pipeline_tag: text-to-image
library_name: diffusers
tags:
- diffusion-single-file
---

For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11.

Feel free to request compression of other models as well (for the `diffusers` library, ComfyUI, or anything else), although models with architectures that are unfamiliar to me might be more difficult.

### How to Use

#### `diffusers`

```python
import torch
from dfloat11 import DFloat11Model
from diffusers import LongCatImagePipeline, LongCatImageTransformer2DModel
from transformers.modeling_utils import no_init_weights

# Build the transformer from its config without initializing weights;
# the DF11 weights are loaded into it below.
with no_init_weights():
    transformer = LongCatImageTransformer2DModel.from_config(
        LongCatImageTransformer2DModel.load_config(
            "meituan-longcat/LongCat-Image", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16
    ).to(torch.bfloat16)

# Load the DF11-compressed transformer weights into the empty model.
DFloat11Model.from_pretrained(
    "mingyi456/LongCat-Image-DF11",
    device="cpu",
    bfloat16_model=transformer,
)

pipe = LongCatImagePipeline.from_pretrained(
    "meituan-longcat/LongCat-Image",
    transformer=transformer,
    torch_dtype=torch.bfloat16
)

# Load the DF11-compressed text encoder weights into the pipeline's text encoder.
DFloat11Model.from_pretrained(
    "mingyi456/Qwen2.5-VL-7B-Instruct-DF11",
    device="cpu",
    bfloat16_model=pipe.text_encoder,
)

pipe.enable_model_cpu_offload()

# Prompt (in Chinese; the model accepts both English and Chinese). In English, roughly:
# "A young Asian woman in a yellow knit sweater and a white necklace, hands resting on
# her knees, with a serene expression. Behind her is a rough brick wall; warm afternoon
# sunlight falls on her, creating a calm, cozy atmosphere. Medium-distance shot that
# highlights her expression and the details of her clothing, with soft light on her face.
# Simple composition; the brick texture and sunlight complement each other."
prompt = '一个年轻的亚裔女性,身穿黄色针织衫,搭配白色项链。她的双手放在膝盖上,表情恬静。背景是一堵粗糙的砖墙,午后的阳光温暖地洒在她身上,营造出一种宁静而温馨的氛围。镜头采用中距离视角,突出她的神态和服饰的细节。光线柔和地打在她的脸上,强调她的五官和饰品的质感,增加画面的层次感与亲和力。整个画面构图简洁,砖墙的纹理与阳光的光影效果相得益彰,突显出人物的优雅与从容。'

image = pipe(
    prompt,
    height=768,
    width=1344,
    guidance_scale=4.0,
    num_inference_steps=50,
    num_images_per_prompt=1,
    generator=torch.Generator("cpu").manual_seed(43),
    enable_cfg_renorm=True,
    enable_prompt_rewrite=True
).images[0]

image.save('image longcat-image.png')
```

#### ComfyUI

Currently, this model is not supported natively in ComfyUI.
Let me know if it receives native support, and I will work on supporting it.

### Compression details

This is the `pattern_dict` for compression:

```python
pattern_dict = {
    r"transformer_blocks\.\d+": (
        "norm1.linear",
        "norm1_context.linear",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.to_out.0",
        "attn.add_q_proj",
        "attn.add_k_proj",
        "attn.add_v_proj",
        "attn.to_add_out",
        "ff.net.0.proj",
        "ff.net.2",
        "ff_context.net.0.proj",
        "ff_context.net.2",
    ),
    r"single_transformer_blocks\.\d+": (
        "norm.linear",
        "proj_mlp",
        "proj_out",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
    ),
}
```
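
To make the `pattern_dict` easier to read, here is a small sketch of how such a dictionary could be expanded into full layer names. It is an illustration only and not DFloat11's actual code: I am assuming that each key is a regular expression that must match a block's module name in full, and that each tuple entry is a path relative to the matched block. The helper `expand_patterns` and the block names below are hypothetical.

```python
import re

# A shortened copy of the pattern_dict above, kept small for illustration.
pattern_dict = {
    r"transformer_blocks\.\d+": ("attn.to_q", "ff.net.2"),
    r"single_transformer_blocks\.\d+": ("proj_mlp", "attn.to_q"),
}


def expand_patterns(block_names, pattern_dict):
    """Return the full dotted names of the layers selected by pattern_dict.

    Assumption: a key must match the whole block name (re.fullmatch), and
    each tuple entry is a submodule path relative to the matched block.
    """
    selected = []
    for name in block_names:
        for pattern, sublayers in pattern_dict.items():
            if re.fullmatch(pattern, name):
                selected.extend(f"{name}.{sub}" for sub in sublayers)
    return selected


# Hypothetical module names, in the style of torch.nn.Module.named_modules().
block_names = ["transformer_blocks.0", "single_transformer_blocks.3", "x_embedder"]
selected = expand_patterns(block_names, pattern_dict)
print(selected)
```

Under these assumptions, full matching matters: a substring search for `transformer_blocks\.\d+` would also hit `single_transformer_blocks.3`, so the two block types would no longer be kept apart. Modules that match no key (such as the hypothetical `x_embedder`) are simply not selected.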