---
license: mit
language:
- en
base_model:
- stable-diffusion-v1-5/stable-diffusion-v1-5
datasets:
- timbrooks/instructpix2pix-clip-filtered
- Aleksandar/Top-Bench-X
---
# EditCLIP: Representation Learning for Image Editing
[Paper](https://arxiv.org/abs/2503.20318)
[Project Page](https://qianwangx.github.io/EditCLIP/)
[Code](https://github.com/QianWangX/EditCLIP)
[ICCV 2025](https://iccv2025.thecvf.com/)
## Abstract
We introduce EditCLIP, a novel representation-learning approach for image editing. Our method learns a unified representation of edits by jointly encoding an input image and its edited counterpart, effectively capturing their transformation. To evaluate its effectiveness, we employ EditCLIP to solve two tasks: exemplar-based image editing and automated edit evaluation. In exemplar-based image editing, we replace text-based instructions in InstructPix2Pix with EditCLIP embeddings computed from a reference exemplar image pair. Experiments demonstrate that our approach outperforms state-of-the-art methods while being more efficient and versatile. For automated evaluation, EditCLIP assesses image edits by measuring the similarity between the EditCLIP embedding of a given image pair and either a textual editing instruction or the EditCLIP embedding of another reference image pair. Experiments show that EditCLIP aligns more closely with human judgments than existing CLIP-based metrics, providing a reliable measure of edit quality and structural preservation.
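For the automated-evaluation use case described above, the score reduces to a cosine similarity between an EditCLIP image-pair embedding and a reference embedding. The sketch below illustrates only that scoring step; the `edit_similarity` helper, the embedding dimension, and the dummy tensors standing in for encoder outputs are assumptions made for illustration, not the released EditCLIP API.

```python
# Minimal sketch of the automated edit-evaluation scoring step (illustrative only).
# The embeddings here are dummy tensors; in practice they would come from the
# EditCLIP image-pair encoder and a text (or reference-pair) encoder.
import torch
import torch.nn.functional as F

def edit_similarity(pair_emb: torch.Tensor, ref_emb: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between an EditCLIP embedding of an (input, edited) image
    pair and a reference embedding (a textual instruction or another image pair)."""
    return F.cosine_similarity(pair_emb, ref_emb, dim=-1)

# Dummy stand-ins for encoder outputs (the 768-dim size is an assumption):
pair_emb = torch.randn(1, 768)  # embedding of the (input, edited) image pair
text_emb = torch.randn(1, 768)  # embedding of the textual editing instruction
print(edit_similarity(pair_emb, text_emb).item())  # higher = closer match
```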
## Benchmark
We evaluate EditCLIP using **Top-Bench-X**, a benchmark for image editing evaluation:
- **Dataset:** Top-Bench-X
- **Link:** https://huggingface.co/datasets/Aleksandar/Top-Bench-X
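To load the benchmark yourself, a minimal sketch with the Hugging Face `datasets` library follows; the split and column names are not documented in this card, so inspect the returned object (or the dataset card) before relying on them.

```python
# Hedged sketch: loading Top-Bench-X with the Hugging Face `datasets` library.
# Split and column names are assumptions to verify against the dataset card.
from datasets import load_dataset

bench = load_dataset("Aleksandar/Top-Bench-X")
print(bench)  # shows the available splits and their columns
```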
## Citation
```bibtex
@inproceedings{wang2025editclip,
  title={EditCLIP: Representation Learning for Image Editing},
  author={Wang, Qian and Cveji{\'c}, Aleksandar and Eldesokey, Abdelrahman and Wonka, Peter},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={15960--15970},
  year={2025}
}
```