Introduction

We introduce LongCat-Image-Edit-Turbo, the distilled version of LongCat-Image-Edit. It achieves high-quality image editing with only 8 NFEs (Number of Function Evaluations) , offering extremely low inference latency.

Installation

pip install git+https://github.com/huggingface/diffusers

Run Image Editing

📝 Special Handling for Text Rendering

For both Text-to-Image and Image Editing tasks involving text generation, you must enclose the target text within single or double quotation marks (both English '...' / "..." and Chinese ‘...’ / “...” styles are supported).

Reasoning: The model utilizes a specialized character-level encoding strategy specifically for quoted content. Failure to use explicit quotation marks prevents this mechanism from triggering, which will severely compromise the text rendering capability.

import torch
from PIL import Image
from diffusers import LongCatImageEditPipeline

if __name__ == '__main__':
    device = torch.device('cuda')
    pipe = LongCatImageEditPipeline.from_pretrained("meituan-longcat/LongCat-Image-Edit-Turbo", torch_dtype= torch.bfloat16 )
    # pipe.to(device, torch.bfloat16)  # Uncomment for high VRAM devices (Faster inference)
    pipe.enable_model_cpu_offload()  # Offload to CPU to save VRAM (Required ~18 GB); slower but prevents OOM
    img = Image.open('assets/test.png').convert('RGB')
    prompt = '将猫变成狗'
    image = pipe(
        img,
        prompt,
        negative_prompt='',
        guidance_scale=1,
        num_inference_steps=8,
        num_images_per_prompt=1,
        generator=torch.Generator("cpu").manual_seed(43)
    ).images[0]
    image.save('./edit_example.png')

Downloads last month: 22,514

Model tree for meituan-longcat/LongCat-Image-Edit-Turbo

Finetunes

1 model

Quantizations

2 models

Spaces using meituan-longcat/LongCat-Image-Edit-Turbo 3

Paper for meituan-longcat/LongCat-Image-Edit-Turbo

LongCat-Image Technical Report

Paper • 2512.07584 • Published Dec 8, 2025 • 25