Image-to-Image
Diffusers
Safetensors
English
model_hub_mixin
pytorch_model_hub_mixin
Instruct-CLIP / README.md
nielsr's picture
nielsr HF Staff
Improve model card: Fix pipeline tag, add library name and improve content
b8ef3ef verified
|
raw
history blame
2.79 kB
---
base_model:
- SherryXTChen/LatentDiffusionDINOv2
datasets:
- timbrooks/instructpix2pix-clip-filtered
- SherryXTChen/InstructCLIP-InstructPix2Pix-Data
language:
- en
license: apache-2.0
pipeline_tag: image-to-image
library_name: diffusers
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
---
# InstructCLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning (CVPR 2025)
This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration.
The model is based on the paper [Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning](https://huggingface.co/papers/2503.18406).
[Arxiv](http://arxiv.org/abs/2503.18406) | [Image Editing Model](https://huggingface.co/SherryXTChen/InstructCLIP-InstructPix2Pix) | [Data Refinement Model](https://huggingface.co/SherryXTChen/Instruct-CLIP) | [Data](https://huggingface.co/datasets/SherryXTChen/InstructCLIP-InstructPix2Pix-Data)
## Capabilities
<p align="center">
<img src="https://github.com/SherryXTChen/Instruct-CLIP/blob/main/assets/teaser_1.png" alt="Figure 1" width="43%">
<img src="https://github.com/SherryXTChen/Instruct-CLIP/blob/main/assets/teaser_2.png" alt="Figure 2" width="50%">
</p>
## Installation
```
pip install -r requirements.txt
```
## Inference
```python
import PIL
import requests
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler
model_id = "timbrooks/instruct-pix2pix"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.load_lora_weights("SherryXTChen/InstructCLIP-InstructPix2Pix")
pipe.to("cuda")
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
url = "https://raw.githubusercontent.com/SherryXTChen/Instruct-CLIP/refs/heads/main/assets/1_input.jpg"
def download_image(url):
image = PIL.Image.open(requests.get(url, stream=True).raw)
image = PIL.ImageOps.exif_transpose(image)
image = image.convert("RGB")
return image
image = download_image(url)
prompt = "as a 3 d sculpture"
images = pipe(prompt, image=image, num_inference_steps=20).images
images[0].save("output.jpg")
```
## Citation
```bibtex
@misc{chen2025instructclipimprovinginstructionguidedimage,
title={Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning},
author={Sherry X. Chen and Misha Sra and Pradeep Sen},
year={2025},
eprint={2503.18406},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.18406},
}
```