omeregev committed on
Commit · 97ac306
1 Parent(s): eef0ef6
Upload AlphaCLIP weights, License and README
- LICENSE +50 -0
- README.md +94 -5
- clip_l14_336_grit1m_fultune_8xe.pth +3 -0
LICENSE
CHANGED
@@ -0,0 +1,50 @@
+Licensing
+
+Copyright (c) 2024 Omer Regev
+
+Licensed under the MIT License (the "License") and the Commons
+Clause Restriction; you may not use this file except in compliance with the
+License and the Commons Clause Restriction.
+
+
+MIT License
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
+
+Commons Clause Restriction
+“Commons Clause” License Condition v1.0
+
+The Software is provided to you by the Licensor under the License, as defined below,
+subject to the following condition. Without limiting other conditions in the License,
+the grant of rights under the License will not include, and the License does not grant to you,
+the right to Sell the Software. For purposes of the foregoing, “Sell” means practicing
+any or all of the rights granted to you under the License to provide to third parties,
+for a fee or other consideration (including without limitation fees for hosting or
+consulting/ support services related to the Software), a product or service whose value derives,
+entirely or substantially, from the functionality of the Software.
+Any license notice or attribution required by the License must also include this
+Commons Clause License Condition notice.
+
+
+
+
+
+
+
README.md
CHANGED
@@ -1,5 +1,94 @@
-
-
-
-
-
+# [AAAI 2025] Click2Mask: Local Editing with Dynamic Mask Generation
+
+Official PyTorch Implementation for
+["Click2Mask: Local Editing with Dynamic Mask Generation"](https://omeregev.github.io/click2mask/).
+
+[](https://omeregev.github.io/click2mask/)
+[](https://arxiv.org/abs/2409.08272)
+[](https://omeregev.github.io/click2mask/static/paper/Click2Mask.pdf)
+[](https://github.com/omeregev/click2mask)[](https://github.com/omeregev/click2mask)
+[](https://youtu.be/A0ZEVTm9SLw?si=_coDIWRXa8Wo-2na)
+[](https://huggingface.co/spaces/omeregev/click2mask)
+[](https://colab.research.google.com/github/omeregev/click2mask/blob/main/demo.ipynb)
+<br><br>
+
+<img src="imgs/teaser.gif"/>
+
+<a href="https://omeregev.github.io/click2mask/">**[AAAI 2025] Click2Mask: Local Editing with Dynamic Mask Generation**</a>
+<br>
+<a href="https://www.linkedin.com/in/omeregev/">Omer Regev</a>,
+<a href="https://omriavrahami.com/">Omri Avrahami</a>,
+<a href="https://www.cs.huji.ac.il/~danix/">Dani Lischinski</a>
+
+Given an image, a <span style="white-space: nowrap;">
+<b>Click</b> <img src="imgs/point.png" alt="alt text" width="10" style="margin-right: 2px;">
+</span>, and a prompt for an added object, a **Mask** is generated dynamically,
+simultaneously with the object generation throughout the diffusion process.
+
+Current methods rely on existing objects/segments, or user effort (masks/detailed text),
+to localize object additions. Our approach enables free-form editing,
+where the manipulated area is not well-defined, using just a <span style="white-space: nowrap;">
+<b>Click</b> <img src="imgs/point.png" alt="alt text" width="10" style="margin-right: 2px;">
+</span> for localization.
+
+## 🚀 New! Try Click2Mask Online
+
+### 🤗 Hugging Face Demo
+Try it instantly in your browser - no setup required.
+[**Launch Demo →**](https://huggingface.co/spaces/omeregev/click2mask)
+
+### <img src="https://colab.research.google.com/img/colab_favicon_256px.png" width="25" height="25" align="center"> Google Colab Demo
+Includes both Gradio interface and command line for advanced usage.
+[**Open in Colab →**](https://colab.research.google.com/github/omeregev/click2mask/blob/main/demo.ipynb)
+
+**For additional usage options** - please refer to our [**repository**](https://github.com/omeregev/click2mask).
+
+
+## Results
+
+### Output Examples
+Each example includes an input image with a <span style="white-space: nowrap;">
+<b>Click</b> <img src="imgs/point.png" alt="alt text" width="10" style="margin-right: 2px;">
+</span>,
+followed by outputs corresponding to the prompts below.
+<img src="imgs/results.jpg">
+
+### Qualitative Comparisons with SoTA Methods
+A brief glimpse into the qualitative comparison between the SoTA methods —
+[Emu Edit](https://emu-edit.metademolab.com),
+[MagicBrush](https://osu-nlp-group.github.io/MagicBrush/)
+and [InstructPix2Pix](https://timothybrooks.com/instruct-pix2pix)
+— against our model, [**Click2Mask**](https://omeregev.github.io/click2mask/).
+<br>
+Upper prompts were given to baselines, and lower (shorter) ones to **Click2Mask**.
+Inputs contain the <span style="white-space: nowrap;">
+<b>Click</b> <img src="imgs/point.png" alt="alt text" width="10" style="margin-right: 2px;">
+</span> given to **Click2Mask**.
+<img src="imgs/compare.png">
+
+## Evaluating Edited Regions in Maskless Methods
+We introduce **Edited Alpha-CLIP** to evaluate mask-free methods by extracting a mask of the edited region
+and using [Alpha-CLIP](https://aleafy.github.io/alpha-clip/) to assess its alignment with the prompt.
+<br>
+Examples of mask extractions: outputs are on the left, extracted masks (green overlay) on the right.
+<img src="imgs/edited_alphaclip.png">
+
+
+## Citation
+If you find this helpful for your research, please reference the following:
+```bibtex
+@misc{regev2024click2masklocaleditingdynamic,
+title={Click2Mask: Local Editing with Dynamic Mask Generation},
+author={Omer Regev and Omri Avrahami and Dani Lischinski},
+year={2024},
+eprint={2409.08272},
+archivePrefix={arXiv},
+primaryClass={cs.CV},
+url={https://arxiv.org/abs/2409.08272},
+}
+```
+
+## Acknowledgements
+This code is based on
+[Blended Latent Diffusion](https://github.com/omriav/blended-latent-diffusion/tree/master)
+and on [Stable Diffusion](https://github.com/CompVis/stable-diffusion).
clip_l14_336_grit1m_fultune_8xe.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fa701cf18a3e19a1e5e81668885b15bf75337e59b0788405999e889b7aa775d0
+size 1218070895
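The three lines added for `clip_l14_336_grit1m_fultune_8xe.pth` are a Git LFS pointer, not the weights themselves: they record the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of checking a downloaded copy against that pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative and are not part of this repository or of any Git LFS tooling.

```python
import hashlib
from pathlib import Path

# Pointer text copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:fa701cf18a3e19a1e5e81668885b15bf75337e59b0788405999e889b7aa775d0
size 1218070895
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    # Each pointer line is "<key> <value>"; the oid value is "<algo>:<hex digest>".
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest the pointer records."""
    path = Path(path)
    # Cheap check first: a size mismatch means there is no need to hash the file.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so a large checkpoint is never fully held in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
```

Assuming the checkpoint has been downloaded to the working directory under its repository name, `matches_pointer("clip_l14_336_grit1m_fultune_8xe.pth", pointer)` would report whether the local file is the one this commit refers to.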