---
license: apache-2.0
language:
- en
- zh
pipeline_tag: image-to-image
library_name: transformers
---
# LongCat-Image

## Introduction

We introduce **LongCat-Image-Edit-Turbo**, the distilled version of LongCat-Image-Edit. It achieves high-quality image editing with only 8 NFEs (Number of Function Evaluations), offering extremely low inference latency.
*LongCat-Image-Edit model overview figure*
### Installation

```shell
pip install git+https://github.com/huggingface/diffusers
```

### Run Image Editing

> [!CAUTION]
> **📝 Special Handling for Text Rendering**
>
> For both Text-to-Image and Image Editing tasks involving text generation, **you must enclose the target text within single or double quotation marks** (both English '...' / "..." and Chinese ‘...’ / “...” styles are supported).
>
> **Reasoning:** The model uses a specialized **character-level encoding** strategy for quoted content. Without explicit quotation marks this mechanism is not triggered, which severely degrades the text rendering quality.

```python
import torch
from PIL import Image

from diffusers import LongCatImageEditPipeline

if __name__ == '__main__':
    device = torch.device('cuda')
    pipe = LongCatImageEditPipeline.from_pretrained(
        "meituan-longcat/LongCat-Image-Edit-Turbo",
        torch_dtype=torch.bfloat16,
    )
    # pipe.to(device, torch.bfloat16)  # Uncomment for high-VRAM devices (faster inference)
    pipe.enable_model_cpu_offload()  # Offload to CPU to save VRAM (~18 GB required); slower but prevents OOM

    img = Image.open('assets/test.png').convert('RGB')
    prompt = '将猫变成狗'  # "Turn the cat into a dog"
    image = pipe(
        img,
        prompt,
        negative_prompt='',
        guidance_scale=1,
        num_inference_steps=8,
        num_images_per_prompt=1,
        generator=torch.Generator("cpu").manual_seed(43),
    ).images[0]
    image.save('./edit_example.png')
```
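Since unquoted target text disables the character-level encoding described in the caution above, it can be convenient to build prompts with a small helper that always wraps the text to render in quotation marks. The `quote_render_text` function below is a hypothetical convenience sketch, not part of the diffusers API:

```python
def quote_render_text(prompt_template: str, text: str) -> str:
    """Insert `text` into the template wrapped in double quotation marks,
    so the model's character-level encoding for quoted content is triggered."""
    return prompt_template.format(f'"{text}"')

# Example: an editing instruction that replaces the text on a sign.
prompt = quote_render_text('Replace the sign text with {}', 'OPEN 24 HOURS')
print(prompt)  # Replace the sign text with "OPEN 24 HOURS"
```

The resulting `prompt` string can be passed to the pipeline exactly as in the example above; the only requirement from the model card is that the target text arrives enclosed in quotation marks.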