---
library_name: diffusers
license: cc-by-4.0
pipeline_tag: image-to-image
---

# Model Card for RefEdit-SD3

This is the model card of a 🧨 diffusers model that has been pushed to the Hub and presented in the paper [RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions](https://huggingface.co/papers/2506.03448).

It can be used for instruction-based image-to-image editing.

## Model Details

### Model Description

RefEdit-SD3 is an instruction-based image editing model fine-tuned to better follow referring expressions (e.g. "the person with a red jacket") when deciding which part of the image to edit.

- **Developed by:** Bimsara Pathiraja, Maitreya Patel, Shivam Singh, Yezhou Yang, Chitta Baral
- **Model type:** Diffusion-based instruction-guided image editing model
- **Language(s) (NLP):** English
- **License:** CC-BY-4.0
- **Finetuned from model:** InstructPix2Pix, UltraEdit-freeform

### Model Sources

- **Repository:** [https://huggingface.co/bpathir1/RefEdit-SD3](https://huggingface.co/bpathir1/RefEdit-SD3)
- **Paper:** [RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions](https://huggingface.co/papers/2506.03448)
- **Project Page:** [https://refedit.vercel.app](https://refedit.vercel.app)
- **GitHub Repository:** [https://github.com/OSU-NLP-Group/RefEdit/](https://github.com/OSU-NLP-Group/RefEdit/)

## Uses

### Direct Use

The model can be used directly for instruction-based image editing, including edits whose target is specified by a referring expression (e.g. "add a flower bunch to the person with a red jacket").

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# Editing with RefEdit-SD3
import torch
from diffusers import StableDiffusion3InstructPix2PixPipeline
from diffusers.utils import load_image

# Load the pipeline in half precision and move it to the GPU.
pipe = StableDiffusion3InstructPix2PixPipeline.from_pretrained(
    "bpathir1/RefEdit-SD3", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

prompt = "Add a flower bunch to the person with a red jacket"
img = load_image("RefEdit/imgs/person_with_red_jacket.jpg").resize((512, 512))

image = pipe(
    prompt,
    image=img,
    mask_img=None,  # free-form editing: no region mask
    num_inference_steps=50,
    image_guidance_scale=1.5,
    guidance_scale=7.5,
).images[0]

image.save("RefEdit/imgs/edited_image.png")
```
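
The bare `resize((512, 512))` above distorts non-square inputs. If that matters for your images, a small helper (not part of the RefEdit codebase; a sketch using only Pillow) can pad the image to a square before resizing:

```python
import PIL.Image


def pad_to_square(img: PIL.Image.Image, fill=(0, 0, 0)) -> PIL.Image.Image:
    """Pad an image with a solid color so it becomes square, content centered."""
    w, h = img.size
    side = max(w, h)
    canvas = PIL.Image.new("RGB", (side, side), fill)
    canvas.paste(img, ((side - w) // 2, (side - h) // 2))
    return canvas


# Example: a 640x480 image is padded to 640x640, then resized to 512x512
# without stretching the content.
img = PIL.Image.new("RGB", (640, 480), (255, 255, 255))
square = pad_to_square(img).resize((512, 512))
print(square.size)  # (512, 512)
```

The edited output can then be cropped back to the original aspect ratio if needed.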

## Citation

```
@article{pathiraja2025refedit,
  title={RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model for Referring Expression},
  author={Pathiraja, Bimsara and Patel, Maitreya and Singh, Shivam and Yang, Yezhou and Baral, Chitta},
  journal={arXiv preprint arXiv:2506.03448},
  year={2025}
}
```