Image-to-Image · Diffusers · TensorBoard · Safetensors · RSEditDiTPipeline · remote-sensing · image-editing · diffusion
Improve model card: add metadata, links and sample usage
Hi! I'm Niels from the Hugging Face community science team. I've opened this PR to improve the model card for RSEdit-DiT. The updates include:
- Adding YAML metadata for `library_name` and `pipeline_tag`.
- Adding tags for `remote-sensing` and `image-editing` to help with discoverability.
- Including links to the research paper, code repository, and project page.
- Refining the sample usage snippet based on the official README.
These changes will help users better understand the model and how to use it with the `diffusers` library.
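To make the metadata change concrete, here is a minimal sketch that reads the front matter this PR adds and checks the `library_name`, `pipeline_tag`, and `tags` fields. It uses only the standard library and handles just the flat shape shown in the diff (scalar values and simple lists); anything more complex would call for a real YAML parser. The names `parse_front_matter` and `README_HEAD` are illustrative and not part of the repository or any library.

```python
def parse_front_matter(readme_text):
    """Parse a flat YAML front-matter block (scalars and simple lists only).

    Returns an empty dict when the text has no complete '---' ... '---' block.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}

    # Locate the closing delimiter; without it there is no front matter.
    closing = None
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            closing = index
            break
    if closing is None:
        return {}

    metadata = {}
    current_key = None
    for line in lines[1:closing]:
        stripped = line.strip()
        if not stripped:
            continue
        is_list_item = stripped.startswith("- ")
        if is_list_item and isinstance(metadata.get(current_key), list):
            # List item belonging to the most recent "key:" line.
            metadata[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # An empty value means a list follows on the next lines.
            metadata[current_key] = value if value else []
    return metadata


# Head of the README as it appears on the "+" side of the diff.
README_HEAD = """---
library_name: diffusers
pipeline_tag: image-to-image
tags:
- remote-sensing
- image-editing
- diffusion
---

# RSEdit-DiT
"""

metadata = parse_front_matter(README_HEAD)
print(metadata["library_name"], metadata["pipeline_tag"], metadata["tags"])
```

A check like this could run before merging to catch a missing delimiter or a misspelled key.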
README.md
CHANGED
@@ -1,26 +1,42 @@
-
+---
+library_name: diffusers
+pipeline_tag: image-to-image
+tags:
+- remote-sensing
+- image-editing
+- diffusion
+---
 
-
+# RSEdit-DiT
 
-
+RSEdit is a unified framework for instruction-based remote sensing image editing. This repository contains the DiT-based variant (based on PixArt-α) presented in the paper [RSEdit: Text-Guided Image Editing for Remote Sensing](https://huggingface.co/papers/2603.13708).
 
-
+[**Project Page**](https://bili-sakura.github.io/RSEdit-Preview/) | [**Code**](https://github.com/Bili-Sakura/RSEdit-Preview) | [**Paper**](https://huggingface.co/papers/2603.13708)
+
+## Model Description
+General-domain text-guided image editors often introduce artifacts or break the orthographic constraints of remote sensing (RS) imagery. RSEdit addresses these challenges by adapting pretrained diffusion models into instruction-following editors via channel concatenation and in-context token concatenation.
+
+The DiT-based variant leverages a transformer-based backbone to learn precise, physically coherent edits (e.g., flooding, urban growth, seasonal shifts) while preserving the geospatial content of the original image.
+
+## Quick Start (Inference)
+
+To run inference with the RSEdit-DiT model, use the `DiffusionPipeline` with the custom pipeline provided in the repository.
 
 ```python
 import torch
 from PIL import Image
 from diffusers import DiffusionPipeline
+from diffusers.models.attention_processor import AttnProcessor
 
 # Load model with custom pipeline
-
+model_id = "BiliSakura/RSEdit-DiT"
 pipe = DiffusionPipeline.from_pretrained(
-
+    model_id,
     torch_dtype=torch.bfloat16,
     custom_pipeline="pipeline_rsedit_dit"
 ).to("cuda")
 
 # Switch to AttnProcessor (required for RSEdit DiT)
-from diffusers.models.attention_processor import AttnProcessor
 pipe.transformer.set_attn_processor(AttnProcessor())
 
 # Load source image
@@ -41,3 +57,17 @@ edited_image = pipe(
 # Save result
 edited_image.save("edited_image.png")
 ```
+
+## Citation
+
+```bibtex
+@misc{zhenyuan2026rsedittextguidedimageediting,
+      title={RSEdit: Text-Guided Image Editing for Remote Sensing},
+      author={Chen Zhenyuan and Zhang Zechuan and Zhang Feng},
+      year={2026},
+      eprint={2603.13708},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2603.13708},
+}
+```
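The Model Description in this diff names two conditioning mechanisms, channel concatenation and in-context token concatenation, without saying how they differ. The shape-level sketch below shows the usual meaning of those terms in diffusion editors. It is not taken from the RSEdit code, and the model card does not say which mechanism the DiT variant uses. The latent shape `(4, 64, 64)`, the patch size `2`, and all function names are assumptions chosen for illustration.

```python
def channel_concat_shape(noisy, source):
    """Shape after stacking two (C, H, W) latents along the channel axis."""
    c_noisy, h_noisy, w_noisy = noisy
    c_source, h_source, w_source = source
    if (h_noisy, w_noisy) != (h_source, w_source):
        raise ValueError("channel concatenation needs matching spatial sizes")
    return (c_noisy + c_source, h_noisy, w_noisy)


def token_count(shape, patch):
    """Number of patch tokens a (C, H, W) latent yields for a square patch."""
    _, height, width = shape
    if height % patch or width % patch:
        raise ValueError("latent size must be divisible by the patch size")
    return (height // patch) * (width // patch)


def token_concat_length(noisy, source, patch):
    """Sequence length after appending source-image tokens to noisy tokens."""
    return token_count(noisy, patch) + token_count(source, patch)


# Assumed example sizes (not from the model card).
noisy_latent = (4, 64, 64)
source_latent = (4, 64, 64)
patch_size = 2

# Channel concatenation: more input channels, same number of tokens.
print(channel_concat_shape(noisy_latent, source_latent))
print(token_count(noisy_latent, patch_size))

# Token concatenation: same channels per token, longer sequence.
print(token_concat_length(noisy_latent, source_latent, patch_size))
```

Under these assumed sizes, channel concatenation widens the input from 4 to 8 channels while the sequence stays at 1024 tokens, whereas token concatenation keeps 4 channels and doubles the sequence to 2048 tokens, which increases attention cost.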