---
license: cc
tags:
- image-to-image
---

# REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations
Project page: https://peter-sushko.github.io/RealEdit/  
Data: https://huggingface.co/datasets/peter-sushko/RealEdit

<img src="https://peter-sushko.github.io/RealEdit/static/images/teaser.svg"/>  



**There are two ways to run inference: via 🧨 Diffusers or via the original InstructPix2Pix pipeline.**

## Option 1: With 🧨Diffusers:

Install the `diffusers`, `accelerate`, `safetensors`, and `transformers` libraries:

```bash
pip install diffusers accelerate safetensors transformers
```

The weights adapted for Diffusers are downloaded automatically by `from_pretrained` on first run:

```python
import requests
import torch
from PIL import Image, ImageOps
from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler

# RealEdit weights in the Diffusers format (same repo that hosts realedit_model.ckpt)
model_id = "peter-sushko/RealEdit"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16, safety_checker=None)
pipe.to("cuda")
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

url = "https://raw.githubusercontent.com/timothybrooks/instruct-pix2pix/main/imgs/example.jpg"

def download_image(url):
    image = Image.open(requests.get(url, stream=True).raw)
    image = ImageOps.exif_transpose(image)  # respect EXIF orientation
    image = image.convert("RGB")
    return image

image = download_image(url)

prompt = "turn him into cyborg"
images = pipe(prompt, image=image, num_inference_steps=10, image_guidance_scale=1).images
images[0]
```


## Option 2: With the original InstructPix2Pix pipeline:

Clone the repository and set up the directory structure:

```bash
git clone https://github.com/timothybrooks/instruct-pix2pix.git
cd instruct-pix2pix
mkdir checkpoints
```

Download the fine-tuned checkpoint into the `checkpoints` directory:

```bash
cd checkpoints
wget https://huggingface.co/peter-sushko/RealEdit/resolve/main/realedit_model.ckpt
```

Return to the repo root and follow the [InstructPix2Pix installation guide](https://github.com/timothybrooks/instruct-pix2pix) to set up the environment.
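For reference, the upstream setup boils down to creating the repo's conda environment. This is a sketch based on the InstructPix2Pix README and assumes its `environment.yaml` is unchanged:

```shell
# From the instruct-pix2pix repo root (assumes conda is installed)
conda env create -f environment.yaml
conda activate ip2p
```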

Edit a single image:

```bash
python edit_cli.py \
  --input [YOUR_IMG_PATH] \
  --output imgs/output.jpg \
  --edit "YOUR EDIT INSTRUCTION" \
  --ckpt checkpoints/realedit_model.ckpt
```

Launch the Gradio interface:

```bash
python edit_app.py --ckpt checkpoints/realedit_model.ckpt
```

## Citation

If you find this checkpoint helpful, please cite:

```bibtex
@misc{sushko2025realeditredditeditslargescale,
      title={REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations}, 
      author={Peter Sushko and Ayana Bharadwaj and Zhi Yang Lim and Vasily Ilin and Ben Caffee and Dongping Chen and Mohammadreza Salehi and Cheng-Yu Hsieh and Ranjay Krishna},
      year={2025},
      eprint={2502.03629},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.03629}, 
}
```