	---
	license: mit
	language:
	- en
	base_model:
	- stable-diffusion-v1-5/stable-diffusion-v1-5
	datasets:
	- timbrooks/instructpix2pix-clip-filtered
	- Aleksandar/Top-Bench-X
	---
	# EditCLIP: Representation Learning for Image Editing
	[![Paper](https://img.shields.io/badge/arXiv-2503.20318-b31b1b)](https://arxiv.org/abs/2503.20318)
	[![Project Page](https://img.shields.io/badge/🌐-Project_Page-blue)](https://qianwangx.github.io/EditCLIP/)
	[![GitHub](https://img.shields.io/badge/GitHub-Repository-black?logo=github)](https://github.com/QianWangX/EditCLIP)
	[![ICCV 2025](https://img.shields.io/badge/📷-Published_at_ICCV_2025-blue)](https://iccv2025.thecvf.com/)

	## 💡 Abstract

	We introduce EditCLIP, a novel representation-learning approach for image editing. Our method learns a unified representation of edits by jointly encoding an input image and its edited counterpart, effectively capturing their transformation. To evaluate its effectiveness, we employ EditCLIP to solve two tasks: exemplar-based image editing and automated edit evaluation. In exemplar-based image editing, we replace text-based instructions in InstructPix2Pix with EditCLIP embeddings computed from a reference exemplar image pair. Experiments demonstrate that our approach outperforms state-of-the-art methods while being more efficient and versatile. For automated evaluation, EditCLIP assesses image edits by measuring the similarity between the EditCLIP embedding of a given image pair and either a textual editing instruction or the EditCLIP embedding of another reference image pair. Experiments show that EditCLIP aligns more closely with human judgments than existing CLIP-based metrics, providing a reliable measure of edit quality and structural preservation.
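The evaluation idea described above, scoring an edit by comparing the embedding of an image pair against a reference embedding, can be illustrated with a small sketch. It assumes the embeddings have already been computed and that the similarity is cosine similarity, the usual choice for CLIP-style embeddings; the abstract only says "similarity". The function name `cosine_similarity` and the placeholder vectors are hypothetical and are not part of the released code.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for real embeddings (assumption):
# one for the (input, edited) pair under evaluation, and one for the
# reference, which could be a text instruction or another image pair.
edit_embedding = [0.20, 0.70, 0.10]
reference_embedding = [0.25, 0.65, 0.05]

score = cosine_similarity(edit_embedding, reference_embedding)
print(f"edit similarity score: {score:.3f}")
```

Under this assumption, a higher score means the evaluated edit is closer to the reference edit. See the GitHub repository for how the embeddings themselves are computed.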

	## 📊 Benchmark
	We evaluate EditCLIP using Top-Bench-X, a benchmark for image editing evaluation:
	- Dataset: Top-Bench-X
	- Link: https://huggingface.co/datasets/Aleksandar/Top-Bench-X
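
A minimal sketch of fetching the benchmark with the Hugging Face `datasets` library is shown below. It assumes the `datasets` package is installed and that the repository loads with its default configuration; the available splits and column names are not stated in this card, so inspect the returned object before relying on any of them. The helper name `load_top_bench_x` is hypothetical.

```python
def load_top_bench_x():
    """Download Top-Bench-X from the Hugging Face Hub (requires network access)."""
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # Assumption: the default configuration is sufficient. If the repository
    # defines several configurations, pass the configuration name as the
    # second positional argument to load_dataset.
    return load_dataset("Aleksandar/Top-Bench-X")


if __name__ == "__main__":
    benchmark = load_top_bench_x()
    # Printing the object lists the available splits and their columns.
    print(benchmark)
```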


	## 🌟 Citation
	```bibtex
	@inproceedings{wang2025editclip,
	title={EditCLIP: Representation Learning for Image Editing},
	author={Wang, Qian and Cveji{\'c}, Aleksandar and Eldesokey, Abdelrahman and Wonka, Peter},
	booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
	pages={15960--15970},
	year={2025}
	}
	```