RSEdit-DiT

RSEdit is a unified framework for instruction-based remote sensing image editing. This repository contains the DiT variant, built on PixArt-α, presented in the paper RSEdit: Text-Guided Image Editing for Remote Sensing.

Project Page | Code | Paper

Model Description

General-domain text-guided image editors often introduce artifacts or break the orthographic constraints of remote sensing (RS) imagery. RSEdit addresses these challenges by adapting pretrained diffusion models into instruction-following editors via channel concatenation and in-context token concatenation.

The DiT-based variant leverages a transformer-based backbone to learn precise, physically coherent edits (e.g., flooding, urban growth, seasonal shifts) while preserving the geospatial content of the original image.
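
The two conditioning schemes named above differ in where the source image enters the model. The sketch below only tracks tensor shapes to make that difference concrete. The function names and the example sizes (4-channel 64x64 latents, patch size 2, 1152-dimensional tokens) are illustrative assumptions, not values taken from the RSEdit code, and this card does not state which scheme this checkpoint uses.

```python
from typing import Tuple

Shape = Tuple[int, ...]


def channel_concat_shape(noisy: Shape, source: Shape) -> Shape:
    """Shape after stacking source latents onto noisy latents along channels.

    Both inputs are (batch, channels, height, width); only channels may differ.
    """
    if len(noisy) != 4 or len(source) != 4:
        raise ValueError("expected (batch, channels, height, width)")
    if noisy[0] != source[0] or noisy[2:] != source[2:]:
        raise ValueError("batch and spatial sizes must match")
    return (noisy[0], noisy[1] + source[1], noisy[2], noisy[3])


def token_concat_shape(noisy_tokens: Shape, source_tokens: Shape) -> Shape:
    """Shape after appending source tokens to the noisy token sequence.

    Both inputs are (batch, tokens, dim); only the token count may differ.
    """
    if len(noisy_tokens) != 3 or len(source_tokens) != 3:
        raise ValueError("expected (batch, tokens, dim)")
    if noisy_tokens[0] != source_tokens[0] or noisy_tokens[2] != source_tokens[2]:
        raise ValueError("batch and embedding sizes must match")
    return (noisy_tokens[0], noisy_tokens[1] + source_tokens[1], noisy_tokens[2])


# Hypothetical sizes, chosen only for illustration.
latent = (1, 4, 64, 64)
tokens = (1, (64 // 2) * (64 // 2), 1152)

print(channel_concat_shape(latent, latent))  # (1, 8, 64, 64)
print(token_concat_shape(tokens, tokens))    # (1, 2048, 1152)
```

Channel concatenation keeps the sequence length fixed but widens the input layer, while token concatenation leaves the input layer unchanged and lengthens the sequence the transformer attends over.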

Quick Start (Inference)

To run inference with RSEdit-DiT, load the model through DiffusionPipeline.from_pretrained and pass the custom pipeline pipeline_rsedit_dit provided in the repository. The example below assumes a CUDA-capable GPU.

import torch
from PIL import Image
from diffusers import DiffusionPipeline
from diffusers.models.attention_processor import AttnProcessor

# Load model with custom pipeline
model_id = "BiliSakura/RSEdit-DiT"
pipe = DiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    custom_pipeline="pipeline_rsedit_dit"
).to("cuda")

# Replace the default attention processor with the basic AttnProcessor (required for RSEdit-DiT)
pipe.transformer.set_attn_processor(AttnProcessor())

# Load source image
source_image = Image.open("satellite_image.png").convert("RGB")

# Edit with instruction
prompt = "Flood the coastal area"
edited_image = pipe(
    prompt=prompt,
    source_image=source_image,
    num_inference_steps=50,
    guidance_scale=4.5,
    image_guidance_scale=1.5,
    height=512,
    width=512,
).images[0]

# Save result
edited_image.save("edited_image.png")
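
The call above takes two scales, guidance_scale and image_guidance_scale. Those parameter names match the InstructPix2Pix convention in diffusers, where three noise predictions are combined at each denoising step. The sketch below shows that combination on plain Python lists. It assumes the RSEdit custom pipeline follows the same rule, which this card does not confirm, and combine_guidance is a hypothetical helper name.

```python
from typing import List, Sequence


def combine_guidance(
    uncond: Sequence[float],
    image_only: Sequence[float],
    full: Sequence[float],
    guidance_scale: float,
    image_guidance_scale: float,
) -> List[float]:
    """InstructPix2Pix-style dual classifier-free guidance.

    uncond:     prediction with neither image nor text conditioning
    image_only: prediction conditioned on the source image only
    full:       prediction conditioned on the source image and the instruction
    """
    if not (len(uncond) == len(image_only) == len(full)):
        raise ValueError("all predictions must have the same length")
    return [
        u + image_guidance_scale * (i - u) + guidance_scale * (f - i)
        for u, i, f in zip(uncond, image_only, full)
    ]


# Toy one-element predictions with the scales used in the example above.
print(combine_guidance([0.0], [1.0], [2.0], 4.5, 1.5))  # [6.0]
```

Under this rule, raising image_guidance_scale pulls the result toward the source image and raising guidance_scale pushes it toward the instruction. With both scales set to 1.0 the output equals the fully conditioned prediction.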

Citation

@misc{zhenyuan2026rsedittextguidedimageediting,
      title={RSEdit: Text-Guided Image Editing for Remote Sensing}, 
      author={Chen Zhenyuan and Zhang Zechuan and Zhang Feng},
      year={2026},
      eprint={2603.13708},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.13708}, 
}


Paper: RSEdit: Text-Guided Image Editing for Remote Sensing (arXiv:2603.13708, published Mar 14)