Linoy Tsaban
committed on
Commit
·
2412a2b
1
Parent(s):
086a6c0
Update index.html
- index.html +42 -6
index.html
CHANGED
|
@@ -208,9 +208,21 @@
|
|
| 208 |
<section class="section">
|
| 209 |
<div class="container is-max-desktop">
|
| 210 |
<div class="columns is-centered has-text-centered">
|
| 211 |
-
<img src="static/images/
|
| 212 |
class="interpolation-image"
|
| 213 |
-
style="max-height:800px; max-width:
|
| 214 |
alt="examples"/>
|
| 215 |
</div>
|
| 216 |
</div>
|
|
@@ -221,13 +233,12 @@
|
|
| 221 |
</div>
|
| 222 |
<p>
|
| 223 |
The methodology of LEDITS++ can be broken down into three components: (1) efficient image
|
| 224 |
-
inversion, (2) versatile textual editing, and (3) semantic grounding of image changes.
|
| 225 |
-
details and mathematical derivations of each component can be found in App
|
| 226 |
</p>
|
| 227 |
|
| 228 |
<div class="columns is-centered has-text-centered">
|
| 229 |
-
<img src="static/images/
|
| 230 |
-
style="max-height:620px; max-width:
|
| 231 |
alt="diagram"/>
|
| 232 |
</div>
|
| 233 |
|
|
@@ -250,6 +261,18 @@
|
|
| 250 |
<div class="content">
|
| 251 |
<h2 class="title is-4">Component 1: Perfect Inversion</h2>
|
| 252 |
<p>
|
| 253 |
|
| 254 |
</p>
|
| 255 |
|
|
@@ -262,6 +285,13 @@
|
|
| 262 |
<div class="columns is-centered">
|
| 263 |
<div class="column content">
|
| 264 |
<p>
|
| 265 |
|
| 266 |
</p>
|
| 267 |
|
|
@@ -275,6 +305,12 @@
|
|
| 275 |
<div class="columns is-centered">
|
| 276 |
<div class="column content">
|
| 277 |
<p>
|
| 278 |
|
| 279 |
</p>
|
| 280 |
|
|
|
|
| 208 |
<section class="section">
|
| 209 |
<div class="container is-max-desktop">
|
| 210 |
<div class="columns is-centered has-text-centered">
|
| 211 |
+
<img src="static/images/removal.png"
|
| 212 |
class="interpolation-image"
|
| 213 |
+
style="max-height:800px; max-width:800px"
|
| 214 |
+
alt="examples"/>
|
| 215 |
+
</div>
|
| 216 |
+
<div class="columns is-centered has-text-centered">
|
| 217 |
+
<img src="static/images/replacement.png"
|
| 218 |
+
class="interpolation-image"
|
| 219 |
+
style="max-height:800px; max-width:800px"
|
| 220 |
+
alt="examples"/>
|
| 221 |
+
</div>
|
| 222 |
+
<div class="columns is-centered has-text-centered">
|
| 223 |
+
<img src="static/images/style_transfer.png"
|
| 224 |
+
class="interpolation-image"
|
| 225 |
+
style="max-height:800px; max-width:800px"
|
| 226 |
alt="examples"/>
|
| 227 |
</div>
|
| 228 |
</div>
|
|
|
|
| 233 |
</div>
|
| 234 |
<p>
|
| 235 |
The methodology of LEDITS++ can be broken down into three components: (1) efficient image
|
| 236 |
+
inversion, (2) versatile textual editing, and (3) semantic grounding of image changes.
|
|
|
|
| 237 |
</p>
|
| 238 |
|
| 239 |
<div class="columns is-centered has-text-centered">
|
| 240 |
+
<img src="static/images/ledits_teaser.jpg"
|
| 241 |
+
style="max-height:620px; max-width:1000px"
|
| 242 |
alt="diagram"/>
|
| 243 |
</div>
|
| 244 |
|
|
|
|
| 261 |
<div class="content">
|
| 262 |
<h2 class="title is-4">Component 1: Perfect Inversion</h2>
|
| 263 |
<p>
|
| 264 |
+
Utilizing text-to-image models for editing real images is usually done by inverting the sampling process to
|
| 265 |
+
identify a noisy xT that will be denoised to the input image x0.
|
| 266 |
+
We propose an efficient inversion method that greatly reduces the required number
|
| 267 |
+
of steps while still incurring no reconstruction error.
|
| 268 |
+
First, DDPM can be viewed as a first-order
|
| 269 |
+
stochastic differential
|
| 270 |
+
equation
|
| 271 |
+
(SDE) solver when
|
| 272 |
+
formulating the reverse diffusion process as an SDE. This
|
| 273 |
+
SDE can be solved more efficiently—in fewer steps—
|
| 274 |
+
using a higher-order differential equation solver; hence we present DPM-Solver++
|
| 275 |
+
Inversion.
|
| 276 |
|
| 277 |
</p>
|
| 278 |
|
|
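The inversion idea in the "Component 1: Perfect Inversion" paragraph added above can be illustrated with a small sketch. It shows why recovering the per-step noise gives zero reconstruction error: each `x_t` is sampled from the forward process, and every reverse step is then solved for the noise `z_t` that maps `x_t` exactly onto `x_{t-1}`. This is only a first-order, DDIM-style stochastic step, not DPM-Solver++, and `toy_eps`, `make_schedule`, and the constant `sigma` are hypothetical stand-ins, not part of the LEDITS++ code.

```python
import math
import random

def toy_eps(x, t):
    # Hypothetical stand-in for the learned noise predictor eps_theta(x_t, t).
    return [0.5 * math.tanh(v) for v in x]

def step_mean(x_t, t, abar, sigma):
    # Deterministic part of one reverse step (first-order, DDIM-style mean).
    eps = toy_eps(x_t, t)
    dir_coef = math.sqrt(max(1.0 - abar[t - 1] - sigma[t] ** 2, 0.0))
    out = []
    for v, e in zip(x_t, eps):
        x0_pred = (v - math.sqrt(1.0 - abar[t]) * e) / math.sqrt(abar[t])
        out.append(math.sqrt(abar[t - 1]) * x0_pred + dir_coef * e)
    return out

def invert(x0, abar, sigma, rng):
    T = len(abar) - 1
    # Sample each x_t independently from q(x_t | x_0); xs[0] is the input itself.
    xs = [list(x0)]
    for t in range(1, T + 1):
        xs.append([math.sqrt(abar[t]) * v
                   + math.sqrt(1.0 - abar[t]) * rng.gauss(0.0, 1.0)
                   for v in x0])
    # Solve each reverse step for the noise z_t that maps x_t onto x_{t-1}.
    zs = [None] * (T + 1)
    for t in range(T, 0, -1):
        mu = step_mean(xs[t], t, abar, sigma)
        zs[t] = [(prev - m) / sigma[t] for prev, m in zip(xs[t - 1], mu)]
    return xs[T], zs

def reconstruct(x_T, zs, abar, sigma):
    # Replay the reverse process with the stored noise terms.
    x = list(x_T)
    for t in range(len(abar) - 1, 0, -1):
        mu = step_mean(x, t, abar, sigma)
        x = [m + sigma[t] * z for m, z in zip(mu, zs[t])]
    return x

def make_schedule(T):
    # Hypothetical schedule: abar[0] = 1, then a product of (1 - beta_t).
    abar = [1.0]
    for t in range(1, T + 1):
        beta = 0.02 + 0.1 * t / T
        abar.append(abar[-1] * (1.0 - beta))
    sigma = [0.0] + [0.3] * T  # assumed constant noise scale per step
    return abar, sigma

rng = random.Random(0)
abar, sigma = make_schedule(10)
x0 = [0.2, -1.3, 0.7, 0.05]
x_T, zs = invert(x0, abar, sigma, rng)
x_rec = reconstruct(x_T, zs, abar, sigma)
max_err = max(abs(a - b) for a, b in zip(x0, x_rec))
```

Because the stored `zs` absorb whatever error the noise predictor makes, the replay matches `x0` up to floating-point rounding regardless of how good `toy_eps` is. A higher-order solver changes the form of `step_mean`, not this bookkeeping.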
|
|
| 285 |
<div class="columns is-centered">
|
| 286 |
<div class="column content">
|
| 287 |
<p>
|
| 288 |
+
After creating our reconstruction sequence, we can edit the image by manipulating
|
| 289 |
+
the noise estimate εθ based on a set of edit instructions. We devise a dedicated
|
| 290 |
+
guidance term for each concept based on the conditioned and unconditioned noise estimates. We
|
| 291 |
+
define LEDITS++ guidance such that it both reflects the direction of the edit (if we
|
| 292 |
+
want
|
| 293 |
+
to push away from/towards the edit concept) and maximizes fine-grained control over
|
| 294 |
+
the effect of the desired edit.
|
| 295 |
|
| 296 |
</p>
|
| 297 |
|
|
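The guidance paragraph added above can be sketched as follows. This is a minimal illustration, assuming each concept contributes a scaled difference between its conditioned estimate and the unconditioned one, with the sign selecting whether the edit pushes towards or away from the concept. The function names, the `scale` parameter, and the `push_towards` flag are hypothetical and not taken from the LEDITS++ code.

```python
def guidance_term(eps_uncond, eps_cond, scale, push_towards=True):
    # Direction of the edit: +1 moves towards the concept, -1 away from it.
    sign = 1.0 if push_towards else -1.0
    return [sign * scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

def edited_estimate(eps_uncond, concepts):
    # concepts: list of (eps_cond, scale, push_towards) tuples, one per edit.
    out = list(eps_uncond)
    for eps_cond, scale, push_towards in concepts:
        term = guidance_term(eps_uncond, eps_cond, scale, push_towards)
        out = [o + g for o, g in zip(out, term)]
    return out

eps_uncond = [0.0, 1.0]
towards = guidance_term(eps_uncond, [1.0, 1.0], scale=2.0, push_towards=True)
away = guidance_term(eps_uncond, [1.0, 1.0], scale=2.0, push_towards=False)
combined = edited_estimate(
    eps_uncond,
    [([1.0, 1.0], 2.0, True), ([0.0, 3.0], 0.5, False)],
)
```

Keeping one term per concept is what allows each edit to carry its own scale and direction, which is the fine-grained control the paragraph refers to.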
|
|
| 305 |
<div class="columns is-centered">
|
| 306 |
<div class="column content">
|
| 307 |
<p>
|
| 308 |
+
With LEDITS++, we empirically demonstrate that these maps can also capture regions
|
| 309 |
+
of an image relevant to an editing concept that is not already present.
|
| 310 |
+
Specifically for multiple edits, calculating a
|
| 311 |
+
dedicated mask for each edit prompt ensures that the corresponding
|
| 312 |
+
guidance terms remain largely isolated, limiting
|
| 313 |
+
interference between them.
|
| 314 |
|
| 315 |
</p>
|
| 316 |
|
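The per-edit masking described in the last paragraph above can be illustrated with a short sketch. It assumes each edit prompt comes with a per-location relevance map and that a binary mask keeps only the most relevant fraction of locations. The thresholding rule, `keep_fraction`, and all function names are assumptions made for illustration, not the masking procedure of LEDITS++ itself.

```python
def top_fraction_mask(relevance, keep_fraction):
    # Keep the locations whose absolute relevance falls in the top fraction.
    # Ties at the threshold may keep slightly more than the requested share.
    k = max(1, round(keep_fraction * len(relevance)))
    threshold = sorted((abs(v) for v in relevance), reverse=True)[k - 1]
    return [1.0 if abs(v) >= threshold else 0.0 for v in relevance]

def apply_mask(term, mask):
    # Zero out the guidance term wherever the mask is off.
    return [t * m for t, m in zip(term, mask)]

def combine(masked_terms):
    # Sum the masked guidance terms of all edits, location by location.
    return [sum(values) for values in zip(*masked_terms)]

# Two edits with hypothetical relevance maps over four locations.
mask_a = top_fraction_mask([0.9, 0.1, 0.05, 0.8], keep_fraction=0.5)
mask_b = top_fraction_mask([0.0, 0.7, 0.6, 0.1], keep_fraction=0.5)

term_a = apply_mask([1.0, 1.0, 1.0, 1.0], mask_a)
term_b = apply_mask([-2.0, -2.0, -2.0, -2.0], mask_b)
total = combine([term_a, term_b])
overlap = sum(a * b for a, b in zip(mask_a, mask_b))
```

In this toy case the two masks select disjoint locations, so each guidance term acts only where its own concept is relevant and the edits do not interfere.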