Spaces: leditsplusplus/project

Linoy Tsaban committed on Nov 28, 2023

Commit

f1c4123

1 Parent(s): 6e7303e

Update index.html

Files changed (1)

index.html +15 -12

@@ -207,7 +207,7 @@
             </p>
             <section class="section">
                 <div class="container is-max-desktop">
                     <div class="columns is-centered has-text-centered">
                         <img src="static/images/variations.png"
                              class="interpolation-image"
@@ -256,18 +256,16 @@
                     <div class="content">
                         <h2 class="title is-4">Component 1: Perfect Inversion</h2>
                         <p>
-                            Utilizing text-to-image models for editing real images is usually done by inverting the sampling process to
-                            identify a noisy xT that will be denoised to the input image x0.
-                            We propose an efficient inversion method that greatly reduces the required number
                             of steps while maintaining no reconstruction error.
-                            First, DDPM can be viewed as a first-order
-                            stochastic differential
-                            equation
-                            (SDE) solver when
-                            formulating the reverse diffusion process as an SDE. This
                             SDE can be solved more efficiently—in fewer steps—
-                            using a higher-order differential equation solver, hence we present here dpm-solver++
-                            Inversion.
                         </p>
@@ -300,7 +298,12 @@
                     <div class="columns is-centered">
                         <div class="column content">
                             <p>
-                                With LEDITS++, we empirically demonstrate that these maps can also capture regions 290
                                 of an image relevant to an editing concept that is not already present.
                                 Specifically for multiple edits, calculating a
                                 dedicated mask for each edit prompt ensures that the corresponding

             </p>
             <section class="section">
                 <div class="container is-max-desktop">
                     <div class="columns is-centered has-text-centered">
                         <img src="static/images/variations.png"
                              class="interpolation-image"
                     <div class="content">
                         <h2 class="title is-4">Component 1: Perfect Inversion</h2>
                         <p>
+                            Utilizing T2I models for editing real images is usually done by inverting the sampling
+                            process to identify a noisy xT that will be denoised to the input image x0.
+                            We draw characteristics from edit friendly DDPM inversion [] and propose an efficient
+                            inversion method that greatly reduces the required number
                             of steps while maintaining no reconstruction error.
+                            DDPM can be viewed as a first-order
+                            SDE solver when formulating the reverse diffusion process as an SDE. This
                             SDE can be solved more efficiently—in fewer steps—
+                            using a higher-order differential equation solver, hence we derive a new, faster
+                            technique - dpm-solver++ Inversion.
                         </p>
                     <div class="columns is-centered">
                         <div class="column content">
                             <p>
+                                In our defined LEDITS++ guidance, we include a masking term composed of the
+                                intersection between the mask generated from
+                                the U-Net’s cross-attention layers and a mask derived from
+                                the noise estimate - yielding a mask both focused on relevant image
+                                regions and of fine granularity.
+                                We empirically demonstrate that these maps can also capture regions 290
                                 of an image relevant to an editing concept that is not already present.
                                 Specifically for multiple edits, calculating a
                                 dedicated mask for each edit prompt ensures that the corresponding
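
Note on the masking paragraph added in this commit: the text describes the final mask as the intersection of a cross-attention mask and a noise-estimate mask. A minimal Python sketch of that intersection follows. It is not the authors' implementation. The function names, the toy 3x3 scores, and the fixed thresholds are all assumptions made for illustration; the commit does not say how either mask is binarized.

```python
def binarize(scores, threshold):
    """Turn a 2D grid of scores into a 0/1 mask (1 where score >= threshold)."""
    return [[1 if value >= threshold else 0 for value in row] for row in scores]


def intersect_masks(mask_a, mask_b):
    """Element-wise AND of two equally sized 0/1 masks."""
    return [[a & b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


# Toy inputs with made-up numbers; real values would come from the model.
attention_scores = [[0.9, 0.8, 0.1],
                    [0.7, 0.2, 0.1],
                    [0.1, 0.1, 0.0]]
noise_magnitude = [[0.5, 0.1, 0.0],
                   [0.6, 0.4, 0.0],
                   [0.0, 0.1, 0.0]]

# Hypothetical fixed thresholds, chosen only to make the example readable.
attention_mask = binarize(attention_scores, threshold=0.5)
noise_mask = binarize(noise_magnitude, threshold=0.3)

# A pixel survives only if both masks select it.
edit_mask = intersect_masks(attention_mask, noise_mask)

for row in edit_mask:
    print(row)
```

In this toy case the pixel at row 0, column 1 is selected only by the attention mask and the pixel at row 1, column 1 only by the noise mask, so both are dropped from the intersection.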