---
library_name: diffusers
license: cc-by-4.0
pipeline_tag: image-to-image
---

# Model Card for RefEdit-SD3

This is the model card of a 🧨 diffusers model that has been pushed to the Hub and presented in the paper [RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions](https://huggingface.co/papers/2506.03448).
It can be used for instruction-based image-to-image editing.

## Model Details

### Model Description

This is the model card of a 🧨 diffusers model that has been automatically generated.

- **Developed by:** Bimsara Pathiraja, Maitreya Patel, Shivam Singh, Yezhou Yang, Chitta Baral
- **Model type:** Diffusers
- **Language(s) (NLP):** English
- **License:** CC-BY-4.0
- **Finetuned from model:** InstructPix2Pix, UltraEdit-freeform

### Model Sources

- **Repository:** [https://huggingface.co/bpathir1/RefEdit-SD3](https://huggingface.co/bpathir1/RefEdit-SD3)
- **Paper:** [RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions](https://huggingface.co/papers/2506.03448)
- **Project Page:** [https://refedit.vercel.app](https://refedit.vercel.app)
- **Github Repository:** [https://github.com/OSU-NLP-Group/RefEdit/](https://github.com/OSU-NLP-Group/RefEdit/)

## Uses

### Direct Use

The model can be used directly for instruction-based image editing, in particular for edits that identify the target object through a referring expression (e.g. "the person with a red jacket").

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# Editing with RefEdit-SD3
import torch
from diffusers import StableDiffusion3InstructPix2PixPipeline
from diffusers.utils import load_image

# Load the pipeline in half precision and move it to the GPU
pipe = StableDiffusion3InstructPix2PixPipeline.from_pretrained(
    "bpathir1/RefEdit-SD3", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# The instruction identifies the edit target via a referring expression
prompt = "Add a flower bunch to the person with a red jacket"
img = load_image("RefEdit/imgs/person_with_red_jacket.jpg").resize((512, 512))

image = pipe(
    prompt,
    image=img,
    mask_img=None,
    num_inference_steps=50,
    image_guidance_scale=1.5,  # faithfulness to the input image
    guidance_scale=7.5,        # adherence to the edit instruction
).images[0]

image.save("RefEdit/imgs/edited_image.png")
```
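The two scales passed to the pipeline correspond to the two-term classifier-free guidance introduced by InstructPix2Pix, which the model above is finetuned from: `image_guidance_scale` weights the image-conditioned prediction against the unconditional one, and `guidance_scale` weights the fully conditioned (image + instruction) prediction against the image-conditioned one. As a minimal sketch (dummy tensors stand in for the three denoiser predictions; the SD3 pipeline's internals may differ in detail):

```python
import torch

def combine_guidance(noise_uncond, noise_img_cond, noise_full_cond,
                     image_guidance_scale, guidance_scale):
    """Two-term classifier-free guidance as in InstructPix2Pix."""
    return (noise_uncond
            + image_guidance_scale * (noise_img_cond - noise_uncond)
            + guidance_scale * (noise_full_cond - noise_img_cond))

# Dummy tensors standing in for the three denoiser outputs
u = torch.randn(1, 4, 8, 8)   # unconditional prediction
i = torch.randn(1, 4, 8, 8)   # image-conditioned prediction
f = torch.randn(1, 4, 8, 8)   # image- and instruction-conditioned prediction

out = combine_guidance(u, i, f, image_guidance_scale=1.5, guidance_scale=7.5)
```

With both scales set to 1.0 the combination reduces to the fully conditioned prediction; raising `guidance_scale` pushes the edit harder, while raising `image_guidance_scale` keeps the output closer to the input image.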

## Citation

```bibtex
@article{pathiraja2025refedit,
    title={RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions},
    author={Pathiraja, Bimsara and Patel, Maitreya and Singh, Shivam and Yang, Yezhou and Baral, Chitta},
    journal={arXiv preprint arXiv:2506.03448},
    year={2025}
}
```