---
license: cc
tags:
- image-to-image
---
# REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations
Project page: https://peter-sushko.github.io/RealEdit/
Data: https://huggingface.co/datasets/peter-sushko/RealEdit
<img src="https://peter-sushko.github.io/RealEdit/static/images/teaser.svg"/>
**There are two ways to run inference: via 🧨 Diffusers or via the original InstructPix2Pix pipeline.**
## Option 1: With 🧨 Diffusers
Install the `diffusers` and `transformers` libraries:
```bash
pip install diffusers accelerate safetensors transformers
```
The weights adapted for diffusers are hosted on the Hugging Face Hub and are downloaded automatically by `from_pretrained`:
```python
import requests
import torch
from PIL import Image, ImageOps
from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler

# Assumes the diffusers-format weights live in the same Hub repo as the checkpoint below
model_id = "peter-sushko/RealEdit"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16, safety_checker=None
)
pipe.to("cuda")
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

def download_image(url):
    image = Image.open(requests.get(url, stream=True).raw)
    image = ImageOps.exif_transpose(image)  # respect EXIF orientation
    return image.convert("RGB")

url = "https://raw.githubusercontent.com/timothybrooks/instruct-pix2pix/main/imgs/example.jpg"
image = download_image(url)

prompt = "turn him into a cyborg"
images = pipe(prompt, image=image, num_inference_steps=10, image_guidance_scale=1).images
images[0]
```
## Option 2: With the original InstructPix2Pix pipeline
Clone the repository and set up the directory structure:
```bash
git clone https://github.com/timothybrooks/instruct-pix2pix.git
cd instruct-pix2pix
mkdir checkpoints
```
Download the fine-tuned checkpoint into the `checkpoints` directory:
```bash
cd checkpoints
wget https://huggingface.co/peter-sushko/RealEdit/resolve/main/realedit_model.ckpt
```
Return to the repo root and follow the [InstructPix2Pix installation guide](https://github.com/timothybrooks/instruct-pix2pix) to set up the environment.
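If you use conda, the setup sketched below is typical. The environment file and environment name (`ip2p`) are taken from the InstructPix2Pix repository; check its README for the current instructions.

```shell
cd instruct-pix2pix
# Create and activate the conda environment shipped with the repo
conda env create -f environment.yaml
conda activate ip2p
```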
Edit a single image:
```bash
python edit_cli.py \
--input [YOUR_IMG_PATH] \
--output imgs/output.jpg \
--edit "YOUR EDIT INSTRUCTION" \
--ckpt checkpoints/realedit_model.ckpt
```
Launch the Gradio interface:
```bash
python edit_app.py --ckpt checkpoints/realedit_model.ckpt
```
## Citation
If you find this checkpoint helpful, please cite:
```
@misc{sushko2025realeditredditeditslargescale,
title={REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations},
author={Peter Sushko and Ayana Bharadwaj and Zhi Yang Lim and Vasily Ilin and Ben Caffee and Dongping Chen and Mohammadreza Salehi and Cheng-Yu Hsieh and Ranjay Krishna},
year={2025},
eprint={2502.03629},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2502.03629},
}
```