Title: SAR Despeckling using a Denoising Diffusion Probabilistic Model
URL Source: https://arxiv.org/html/2206.04514
Markdown Content: Malsha V. Perera, Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, and Vishal M. Patel. This work was supported by the NSF CAREER Award under Grant 2045489. M. V. Perera, N. G. Nair, W. G. C. Bandara, and V. M. Patel are with the Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218 (e-mail: {jperera4, ngopala2, wbandar1, vpatel36}@jhu.edu).
Abstract
Speckle is a multiplicative noise which affects all coherent imaging modalities, including Synthetic Aperture Radar (SAR) images. The presence of speckle degrades the image quality and adversely affects the performance of SAR image understanding applications such as automatic target recognition and change detection. Thus, SAR despeckling is an important problem in remote sensing. In this paper, we introduce SAR-DDPM, a denoising diffusion probabilistic model for SAR despeckling. The proposed method comprises a Markov chain that transforms clean images to white Gaussian noise by repeatedly adding random noise. The despeckled image is recovered by a reverse process which iteratively predicts the added noise using a noise predictor conditioned on the speckled image. In addition, we propose a new inference strategy based on cycle spinning to improve the despeckling performance. Our experiments on both synthetic and real SAR images demonstrate that the proposed method achieves significant improvements in both quantitative and qualitative results over the state-of-the-art despeckling methods. The code is available at: https://github.com/malshaV/SAR_DDPM
Index Terms:
Synthetic Aperture Radar, diffusion models, speckle, denoising
I Introduction
Synthetic Aperture Radar (SAR) is a coherent imaging modality which is strongly invariant to different environmental conditions. Therefore, SAR is widely used in applications such as landscape classification, disaster monitoring, and surface change detection. Nonetheless, SAR images are often affected by speckle, a signal-dependent, spatially correlated noise. Besides degrading the SAR images themselves, speckle adversely affects downstream tasks. Hence, over the past decades, several methods for speckle removal have been proposed in the literature for enhancing SAR images.
For a given SAR intensity $Y$, the mathematical model of multiplicative speckle noise $N$ can be expressed as follows:

$$Y = XN, \tag{1}$$

where $X$ is the speckle-free or clean image. Generally, it is assumed that $N$ follows a Gamma distribution with unit mean and variance $1/L$, with the following probability density function:

$$p(N) = \frac{1}{\Gamma(L)} L^{L} N^{L-1} e^{-LN}, \tag{2}$$

where $\Gamma(\cdot)$ is the Gamma function and $L$ is the number of looks in multilook processing.
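As a concrete illustration, the speckle model of Eqs. (1) and (2) can be simulated in a few lines of NumPy. This is a minimal sketch under our own naming, not the paper's released code:

```python
import numpy as np

def add_speckle(x, L=1, rng=None):
    """Apply the multiplicative model Y = X * N of Eq. (1), where N follows
    a Gamma distribution with shape L and scale 1/L, i.e. unit mean and
    variance 1/L as in Eq. (2). L=1 (single-look) gives the strongest speckle."""
    rng = np.random.default_rng(rng)
    noise = rng.gamma(shape=L, scale=1.0 / L, size=x.shape)
    return x * noise

clean = np.full((256, 256), 0.5)
speckled = add_speckle(clean, L=1, rng=0)  # unit-mean noise: image mean stays near 0.5
```

Since the noise has unit mean, despeckling aims to suppress its variance (which shrinks as $1/L$) without smoothing away true image structure.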
The first attempts at despeckling employed spatial domain filtering-based approaches such as the Lee filter [1], the Kuan filter [2] and the Gamma maximum a posteriori (MAP) filter [3]. These approaches generally exploit the spatial correlation of the image to filter noise, using a sliding window to compute the pixel value at the center of the window. However, these methods result in intense smoothing, and most edges and texture details in the SAR images are lost. In 2009, Deledalle et al. proposed a probabilistic patch-based (PPB) [4] filtering method which combines non-local means filtering with the 3-D block matching (BM3D) approach [5]. Parrilli et al. [6] introduced SAR-BM3D for SAR despeckling by extending BM3D. Sparse dictionary-based despeckling methods have also been proposed in the literature [7]. A detailed review of SAR despeckling approaches can be found in [8] and [9].
More recently, deep learning algorithms have gained popularity and achieved state-of-the-art performance in a wide range of computer vision and image processing tasks. Following this trend, several studies have attempted to apply deep networks to SAR despeckling. SAR-CNN [10] is one of the earliest Convolutional Neural Network (CNN) based despeckling methods. This method transforms the multiplicative speckle noise to additive noise by applying a homomorphic transformation to the SAR images. In order to train the network, SAR-CNN uses multi-temporal fusion to generate clean reference images. Wang et al. [11] proposed ID-CNN, which is trained on synthetically speckled optical images. ID-CNN uses a residual network architecture which obtains the despeckled image by dividing the input speckled image by the estimated speckle. ID-GAN [12] uses a Generative Adversarial Network (GAN) [13] for despeckling, trained using a combination of Euclidean, perceptual and adversarial losses. In addition to training deep learning models using spatial loss functions, the Multi-Objective network (MONet) [14] incorporates a loss function that computes the Kullback–Leibler divergence between the predicted and simulated speckle probability distributions. Studies such as the multiscale residual dense dual attention network (MRDDANet) [15] and SAR-CAM [16] incorporate various attention modules in the network architecture for despeckling. In [17], an overcomplete CNN architecture (SAR-ON), which focuses on learning low-level features by restricting the receptive field, is used for SAR despeckling. Apart from CNN architectures, [18] introduces a Transformer-based architecture (Trans-SAR) for despeckling of SAR images.
Recently, Denoising Diffusion Probabilistic Models (DDPM) [19] have emerged as an alternative approach for generative modelling. Dhariwal and Nichol [20] showed that DDPMs can outperform state-of-the-art GAN-based generative models [13] in image synthesis. A DDPM is parameterized by a Markov chain that gradually adds noise to the data until the signal is destroyed. During inference, a sample belonging to the training distribution can be generated by starting with randomly sampled Gaussian noise and iteratively applying a reverse diffusion step. A DDPM is trained by optimizing the variational lower bound of the negative log-likelihood of the data, and it avoids the mode collapse often encountered by GANs. With the success in image synthesis tasks, DDPMs have been adopted into a variety of vision tasks such as super-resolution [21], inpainting [22] and image restoration [23]. To the best of our knowledge, SAR despeckling based on Denoising Diffusion Probabilistic Models has not been studied in the literature.
Figure 1: Overview of the forward and reverse diffusion process
Inspired by these studies, we propose SAR-DDPM, a Denoising Diffusion Probabilistic Model-based approach for SAR despeckling. In our approach, we iteratively recover the clean image by employing a noise predictor conditioned on the speckled image. We train the proposed DDPM model with synthetically speckled optical images and test our approach on both synthetic and real SAR images. During inference, we incorporate a novel strategy based on cycle spinning [24] to improve the despeckling performance. Finally, we demonstrate the effectiveness of our method on synthetic and real SAR images by comparing it with several recent despeckling approaches.
II Proposed Method
Figure 2: Overview of the conditional noise predictor network architecture in SAR-DDPM. Here, 2D convolution blocks and residual blocks are denoted as "Conv Block" and "Res Block", respectively.
II-A Denoising Diffusion Probabilistic Models
Denoising Diffusion Probabilistic Models define a Markov chain that transforms an image $\mathbf{x}_0$ to white Gaussian noise $\mathbf{x}_T \sim \mathcal{N}(0, \mathbf{I})$ by adding random noise over $T$ diffusion time steps. During inference, a random noise vector $\mathbf{x}_T$ is sampled and gradually denoised until it reaches the desired image $\mathbf{x}_0$. As illustrated in Fig. 1, a DDPM comprises two processes: a forward diffusion process and a reverse diffusion process.
In the forward diffusion process, $\mathbf{x}_0 \sim q(\mathbf{x})$ sampled from the real data distribution is converted to $\mathbf{x}_T$ by gradually adding Gaussian noise $\epsilon$ according to a variance schedule $\beta_1, \cdots, \beta_T$:

$$q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathcal{N}\big(\mathbf{x}_t;\, \sqrt{1-\beta_t}\,\mathbf{x}_{t-1},\, \beta_t \mathbf{I}\big), \tag{3}$$

$$q(\mathbf{x}_{1:T} \mid \mathbf{x}_0) = \prod_{t=1}^{T} q(\mathbf{x}_t \mid \mathbf{x}_{t-1}). \tag{4}$$
In the forward process, $\mathbf{x}_t$ can be sampled at any arbitrary time step $t$ by setting $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{i=1}^{t} \alpha_i$:

$$q(\mathbf{x}_t \mid \mathbf{x}_0) = \mathcal{N}\big(\mathbf{x}_t;\, \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0,\, (1-\bar{\alpha}_t)\mathbf{I}\big). \tag{5}$$
This can be further reparameterized as follows:

$$\mathbf{x}_t = \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon, \quad \epsilon \sim \mathcal{N}(0, \mathbf{I}). \tag{6}$$
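Equations (5) and (6) mean that training never has to run the chain step by step: $\mathbf{x}_t$ can be drawn from $\mathbf{x}_0$ in one shot. A small NumPy sketch, assuming the linear variance schedule of Ho et al. [19]:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # linear schedule beta_1..beta_T (Ho et al.)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)       # \bar{alpha}_t = prod_{i<=t} alpha_i

def q_sample(x0, t, rng=None):
    """Draw x_t ~ q(x_t | x_0) in closed form via Eq. (6)."""
    rng = np.random.default_rng(rng)
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps
```

By the final step, $\bar{\alpha}_T$ is vanishingly small, so $\mathbf{x}_T$ is essentially pure Gaussian noise regardless of $\mathbf{x}_0$.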
The reverse diffusion process is modeled by a neural network trained to predict the parameters $\mu_\theta(\mathbf{x}_t, t)$ and $\Sigma_\theta(\mathbf{x}_t, t)$ of a Gaussian distribution:

$$p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t) = \mathcal{N}\big(\mathbf{x}_{t-1};\, \mu_\theta(\mathbf{x}_t, t),\, \Sigma_\theta(\mathbf{x}_t, t)\big). \tag{7}$$
As reported by Ho et al. [19], the learning objective for the above model is derived by considering the variational lower bound:

$$\mathbb{E}[-\log p_\theta(\mathbf{x}_0)] \leq L = \mathbb{E}_q\Big[\underbrace{D_{KL}\big(q(\mathbf{x}_T \mid \mathbf{x}_0) \,\|\, p(\mathbf{x}_T)\big)}_{L_T} + \sum_{t>1} \underbrace{D_{KL}\big(q(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \mathbf{x}_0) \,\|\, p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t)\big)}_{L_{t-1}} \underbrace{-\log p_\theta(\mathbf{x}_0 \mid \mathbf{x}_1)}_{L_0}\Big]. \tag{8}$$
Note that the term $L_{t-1}$ is used to train the neural network; it can be computed in closed form since it compares two Gaussian distributions. As $\mathbf{x}_t$ is available as an input during training, the predicted mean $\mu_\theta(\mathbf{x}_t, t)$ can be reparameterized as follows:

$$\mu_\theta(\mathbf{x}_t, t) = \frac{1}{\sqrt{\alpha_t}}\bigg(\mathbf{x}_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(\mathbf{x}_t, t)\bigg). \tag{9}$$
For simplicity, Ho et al. [19] derive the following training objective from $L_{t-1}$ in Eq. (8):

$$L_{simple} = \mathbb{E}_{t, \mathbf{x}_0, \epsilon}\big[\|\epsilon - \epsilon_\theta(\mathbf{x}_t, t)\|^2\big]. \tag{10}$$
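A Monte-Carlo estimate of Eq. (10) for one training image can be sketched as follows; the zero-output predictor and the schedule are illustrative stand-ins for the trained network, not part of the proposed method:

```python
import numpy as np

def l_simple(eps_theta, x0, alpha_bar, rng=None):
    """One sample of L_simple (Eq. 10): draw a timestep t and noise eps,
    form x_t with the closed-form forward process (Eq. 6), and penalise
    the squared error of the noise prediction."""
    rng = np.random.default_rng(rng)
    t = rng.integers(len(alpha_bar))
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return np.mean((eps - eps_theta(xt, t)) ** 2)

alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))
# A predictor that always outputs zeros scores roughly E||eps||^2 = 1.
loss = l_simple(lambda xt, t: np.zeros_like(xt), np.zeros((64, 64)), alpha_bar, rng=0)
```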
II-B SAR-DDPM
The proposed method is based on a $T$-step DDPM with forward and reverse diffusion processes, as shown in Fig. 1. During the forward diffusion process, we use the clean image $\mathbf{x}_C$ as the input image $\mathbf{x}_0$, which is converted into $\mathbf{x}_T$ by gradually adding Gaussian noise. The reverse process uses the conditional noise predictor $\epsilon_\theta$ to recover the clean image $\mathbf{x}_C$ from $\mathbf{x}_T$ by denoising iteratively in $T$ steps.
In the proposed method, the conditional noise predictor $\epsilon_\theta$ is trained to predict the noise added in each diffusion step, conditioned on the speckled image $\mathbf{x}_S$. The network architecture of the proposed conditional noise predictor is illustrated in Fig. 2. The proposed network follows a U-Net [25]-like architecture comprising convolutional residual blocks, attention blocks, downsampling convolutions, upsampling convolutions, and skip connections between the contracting and expansive pathways. First, the speckled image $\mathbf{x}_S$ and $\mathbf{x}_t$ are concatenated and passed through a convolutional layer, while the timestep $t$ is transformed into the timestep embedding $t_e$ using the transformer sinusoidal positional encoding [26]. The timestep embedding $t_e$ is given as an input to each residual block as depicted in Fig. 2. Similar to [20], we use self-attention blocks at multiple resolutions, and BigGAN [27] residual blocks for upsampling and downsampling. Finally, the output from the last residual block of the expansive pathway is passed through a convolutional block to predict the noise $\hat{\epsilon}$ at time step $t-1$. The predicted noise is then used to obtain $\mathbf{x}_{t-1}$ using Eqs. (7) and (9).
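The sinusoidal timestep embedding used to condition the residual blocks can be sketched as below. In practice the raw encoding is usually passed through a small MLP before reaching the blocks; the function name here is ours:

```python
import numpy as np

def timestep_embedding(t, dim):
    """Transformer-style sinusoidal positional encoding of diffusion step t [26].
    Returns a vector of length dim: dim/2 sines and dim/2 cosines evaluated
    at geometrically spaced frequencies."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    args = t * freqs
    return np.concatenate([np.sin(args), np.cos(args)])

t_e = timestep_embedding(500, 128)  # one embedding vector, fed to every residual block
```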
Algorithm 1 summarises the training procedure of the proposed method, where synthetically speckled images and their corresponding clean images are used to train SAR-DDPM over $T$ diffusion steps. During inference, we adopt the idea of cycle spinning [24] to improve the performance of SAR-DDPM. The function $f_{cs}(\cdot, u, v)$ shifts an image cyclically by $u$ rows and $v$ columns, as depicted in Fig. 3. We despeckle the cyclically shifted images using the DDPM model, apply the inverse cyclic shift, and average the results to obtain the final despeckled image. The complete inference procedure is given in Algorithm 2.
Algorithm 1 Training
1: Input: speckled image and clean image pairs $P = \{(\mathbf{x}_S^k, \mathbf{x}_C^k)\}_{k=1}^{K}$
2: repeat
3: Sample $(\mathbf{x}_S, \mathbf{x}_C) \sim P$
4: $t \sim \mathrm{Uniform}(\{1, \ldots, T\})$
5: $\epsilon \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$
6: Take gradient descent step on $\nabla_\theta \|\epsilon - \epsilon_\theta(\mathbf{x}_t, \mathbf{x}_S, t)\|^2$, where $\mathbf{x}_t = \sqrt{\bar{\alpha}_t}\,\mathbf{x}_C + \sqrt{1-\bar{\alpha}_t}\,\epsilon$
7: until converged
Algorithm 2 Inference
1: Input: speckled image $\mathbf{x}_S$, shifts $\{u_i\}_{i=1}^{M}$ and $\{v_i\}_{i=1}^{M}$
2: for $i = 1, \ldots, M$ do
3: $\mathbf{x}_i = f_{cs}(\mathbf{x}_S, u_i, v_i)$
4: Sample $\mathbf{x}_T \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$
5: for $t = T, \ldots, 1$ do
6: Sample $\mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ if $t > 1$, else $\mathbf{z} = \mathbf{0}$
7: Compute $\mathbf{x}_{t-1} = \frac{1}{\sqrt{\alpha_t}}\Big(\mathbf{x}_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(\mathbf{x}_t, \mathbf{x}_i, t)\Big) + \sigma_t \mathbf{z}$
8: end for
9: $\mathbf{x}_{C_i} = f_{cs}^{-1}(\mathbf{x}_0, u_i, v_i)$
10: $\mathbf{x}_{sum} = \mathbf{x}_{sum} + \mathbf{x}_{C_i}$ if $i > 1$, else $\mathbf{x}_{sum} = \mathbf{x}_{C_i}$
11: end for
12: return $\mathbf{x}_C = \mathbf{x}_{sum} / M$
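The cycle-spinning wrapper of Algorithm 2 reduces to a few lines once the reverse diffusion is abstracted away. In this sketch, `despeckle_fn` stands in for the full $T$-step denoising loop (steps 4-8), and the function names are ours:

```python
import numpy as np

def f_cs(x, u, v):
    """Cyclically shift an image by u rows and v columns (Fig. 3).
    The inverse shift is f_cs(x, -u, -v)."""
    return np.roll(x, (u, v), axis=(0, 1))

def cycle_spin(x_s, despeckle_fn, shifts):
    """Algorithm 2: despeckle each cyclically shifted copy of the speckled
    image x_s, undo the shift, and average the M estimates."""
    acc = np.zeros(x_s.shape, dtype=float)
    for u, v in shifts:
        est = despeckle_fn(f_cs(x_s, u, v))   # reverse diffusion on the shifted input
        acc += f_cs(est, -u, -v)              # inverse cyclic shift before averaging
    return acc / len(shifts)
```

Because each diffusion run starts from independent noise and sees a differently aligned input, averaging the shifted-back estimates suppresses run-to-run distortions while keeping structure that all estimates agree on.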
Figure 3: Cyclically shifted images with: (a) $u=0$, $v=0$ (original speckled image), (b) $u=100$, $v=200$, and (c) $u=200$, $v=200$.
Figure 4: Results on a synthetically speckled image: (a) Speckled image, (b) Ground Truth, (c) PPB, (d) SAR-BM3D, (e) ID-CNN, (f) Trans-SAR, (g) SAR-ON, (h) SAR-CAM, (i) SAR-DDPM, (j) SAR-DDPM + Cycle Spinning.
Figure 5: Results on real SAR images: (a) SAR image, (b) PPB, (c) SAR-BM3D, (d) ID-CNN, (e) Trans-SAR, (f) SAR-ON, (g) SAR-CAM, (h) SAR-DDPM, (i) SAR-DDPM + Cycle Spinning.
III Experiments and Results
In this section, we present the experiments and results of our proposed method on both synthetically speckled images and real SAR images. We compare the performance of SAR-DDPM with the following despeckling algorithms: PPB [4], SAR-BM3D [6], ID-CNN [11], Trans-SAR [18], SAR-ON [17], and SAR-CAM [16]. Following Eqs. (1) and (2), we created single-look ($L=1$) synthetically speckled images from 15k remote sensing images of the publicly available DSIFN [28] dataset in order to train SAR-DDPM. The network was trained on an NVIDIA RTX 2080Ti GPU for 30k iterations, with the network weights initialized from the pretrained ImageNet weights of [20]. In this work, we set $T=1000$ and $\{u_i\} = \{v_i\} = \{0, 100, 200\}$.
Table I shows the performance of the proposed algorithm in terms of average Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) on 10 synthetically speckled images from the DSIFN dataset. Note that ID-CNN, Trans-SAR, SAR-ON, and SAR-CAM were also trained on the same synthetic data as the proposed method. As can be seen from Table I, SAR-DDPM significantly outperforms both classical and deep learning-based despeckling methods in terms of PSNR and SSIM. Moreover, from the results in Table I and Fig. 4, we can observe that cycle spinning significantly improves the despeckling performance of SAR-DDPM. Fig. 4 also shows that, compared to the other despeckling methods, our proposed method provides better despeckling while preserving the fine structural details present in the speckled images.
TABLE I: Results on synthetically speckled images of DSIFN.
| Method | PSNR | SSIM |
| --- | --- | --- |
| PPB [4] | 23.96 dB | 0.62 |
| SAR-BM3D [6] | 25.69 dB | 0.75 |
| ID-CNN [11] | 27.30 dB | 0.72 |
| Trans-SAR [18] | 27.08 dB | 0.72 |
| SAR-ON [17] | 27.16 dB | 0.73 |
| SAR-CAM [16] | 27.96 dB | 0.76 |
| SAR-DDPM | 27.99 dB | 0.77 |
| SAR-DDPM with cycle spinning | 29.42 dB | 0.81 |
In order to further evaluate the despeckling ability of our proposed method, the above despeckling methods were tested on 2 single-look SAR images of size $256 \times 256$ acquired by Sentinel-1 [29]. The Equivalent Number of Looks (ENL) is an index suitable for evaluating the level of smoothing in homogeneous areas of SAR images when clean ground truth images are unavailable [30]. ENL is defined as the ratio between the square of the mean and the variance of a homogeneous region. Table II summarizes the quantitative comparison of despeckling results on SAR images in terms of ENL. The regions selected for the ENL calculation are marked with red and blue boxes in Fig. 5. The proposed method yields the highest ENL values in all 4 regions, indicating the best despeckling performance among the considered methods. From Fig. 5, we can observe that the quantitative ENL results are consistent with the visual results.
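The ENL definition above is a one-liner in practice. As a sanity check on the definition (this is a sketch, not the paper's evaluation code), unit-mean $L$-look Gamma speckle over a flat region yields an ENL close to $L$:

```python
import numpy as np

def enl(region):
    """Equivalent Number of Looks: mean^2 / variance over a homogeneous
    region. Higher ENL indicates stronger speckle suppression."""
    m = region.mean()
    return (m * m) / region.var()

# Unit-mean Gamma(L, 1/L) speckle on a flat region has variance 1/L,
# so ENL = 1 / (1/L) = L; here L = 4.
rng = np.random.default_rng(0)
region = rng.gamma(shape=4.0, scale=0.25, size=100_000)
```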
In Fig. 5, the PPB despeckling approach over-smooths the SAR image, destroying edges and structural details. SAR-BM3D preserves significantly more details than PPB; however, as indicated by the ENL values, it performs poorly in terms of smoothing. ID-CNN, Trans-SAR, and SAR-ON preserve more textural details than PPB and remove more speckle than SAR-BM3D. Compared to ID-CNN, Trans-SAR, and SAR-ON, SAR-CAM retains more textural details while minimizing distortions in homogeneous regions. Even though SAR-CAM shows a better despeckling ability than the previous methods, it blurs out some fine edges and structural details. SAR-DDPM recovers these finer details better than SAR-CAM, but it creates slight distortions in the homogeneous areas, resulting in lower ENL values. As evident from Fig. 5 (i), using cycle spinning along with SAR-DDPM reduces these distortions while recovering finer structural details than SAR-CAM.
TABLE II: Estimated ENL values on real SAR images
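The cycle-spinning inference strategy can be illustrated with a minimal sketch: despeckle circularly shifted copies of the input, undo each shift, and average the estimates. Here `despeckle` is a placeholder for the diffusion-model denoiser, and the shift set is illustrative; neither is taken from the paper's released implementation.

```python
import numpy as np


def cycle_spin(noisy, despeckle, shifts=((0, 0), (64, 0), (0, 64), (64, 64))):
    """Cycle-spinning ensemble: run the (shift-variant) despeckler on
    circularly shifted copies of the input, shift each estimate back,
    and average the aligned estimates."""
    acc = np.zeros_like(noisy, dtype=np.float64)
    for dy, dx in shifts:
        shifted = np.roll(noisy, (dy, dx), axis=(0, 1))
        estimate = despeckle(shifted)
        acc += np.roll(estimate, (-dy, -dx), axis=(0, 1))
    return acc / len(shifts)


# Smoke test with an identity "despeckler": the ensemble must return the input.
speckled = np.random.default_rng(1).random((128, 128))
restored = cycle_spin(speckled, lambda z: z)
```

Averaging over shifts suppresses shift-dependent artifacts of the denoiser, which is consistent with the reduced distortions observed in Fig. 5 (i).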
IV Conclusion
We proposed SAR-DDPM, the first diffusion-based model for SAR despeckling. Our method performs despeckling through a novel strategy that ensembles multiple cycle-spinning-based estimates of the despeckled image generated by the diffusion model. Each estimate is obtained by starting from random Gaussian noise and iteratively denoising the latent variables using a noise predictor conditioned on the corresponding transformed speckled image. Our experiments on synthetic and real SAR images show promising quantitative and qualitative improvements over several existing despeckling methods. The proposed method proved effective in removing speckle while retaining fine structural details in SAR images.
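For illustration, the conditioned reverse process summarized above can be sketched as a standard DDPM sampling loop [19]. Everything here is an assumption standing in for the trained system: `eps_model` represents the conditional noise predictor, and the beta schedule, step count, and shapes are illustrative, not those of the released SAR-DDPM code.

```python
import numpy as np


def ddpm_sample(cond, eps_model, betas, rng):
    """Minimal DDPM reverse process conditioned on a speckled image `cond`.
    Starts from pure Gaussian noise and iteratively removes the noise
    predicted by eps_model(x_t, cond, t)."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(cond.shape)  # x_T ~ N(0, I)
    for t in range(len(betas) - 1, -1, -1):
        eps = eps_model(x, cond, t)  # predicted noise at step t
        # Posterior mean: (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_t)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


# Smoke test with a dummy predictor that always outputs zero noise.
betas = np.linspace(1e-4, 0.02, 50)
rng = np.random.default_rng(0)
cond = np.zeros((32, 32))
sample = ddpm_sample(cond, lambda x, c, t: np.zeros_like(x), betas, rng)
```

Under cycle spinning, this loop would be run once per shifted copy of the speckled input, and the unshifted outputs averaged.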
References
- [1] J.-S. Lee, “Speckle analysis and smoothing of synthetic aperture radar images,” Computer Graphics and Image Processing, vol. 17, no. 1, pp. 24–32, 1981. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0146664X81800056
- [2] D. T. Kuan, A. A. Sawchuk, T. C. Strand, and P. Chavel, “Adaptive noise smoothing filter for images with signal-dependent noise,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-7, no. 2, pp. 165–177, 1985.
- [3] A. Lopes, E. Nezry, R. Touzi, and H. Laur, “Maximum a posteriori speckle filtering and first order texture models in SAR images,” in 10th Annual International Symposium on Geoscience and Remote Sensing, 1990, pp. 2409–2412.
- [4] C.-A. Deledalle, L. Denis, and F. Tupin, “Iterative weighted maximum likelihood denoising with probabilistic patch-based weights,” IEEE Transactions on Image Processing, vol. 18, no. 12, pp. 2661–2672, 2009.
- [5] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.
- [6] S. Parrilli, M. Poderico, C. V. Angelino, and L. Verdoliva, “A nonlocal SAR image denoising algorithm based on LLMMSE wavelet shrinkage,” IEEE Transactions on Geoscience and Remote Sensing, vol. 50, no. 2, pp. 606–616, 2012.
- [7] V. M. Patel, G. R. Easley, R. Chellappa, and N. M. Nasrabadi, “Separated component-based restoration of speckled SAR images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 2, pp. 1019–1029, 2014.
- [8] R. Touzi, “A review of speckle filtering in the context of estimation theory,” IEEE Transactions on Geoscience and Remote Sensing, vol. 40, no. 11, pp. 2392–2404, 2002.
- [9] C.-A. Deledalle, L. Denis, G. Poggi, F. Tupin, and L. Verdoliva, “Exploiting patch similarity for SAR image processing: The nonlocal paradigm,” IEEE Signal Processing Magazine, vol. 31, no. 4, pp. 69–78, 2014.
- [10] G. Chierchia, D. Cozzolino, G. Poggi, and L. Verdoliva, “SAR image despeckling through convolutional neural networks,” in 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2017, pp. 5438–5441.
- [11] P. Wang, H. Zhang, and V. M. Patel, “SAR image despeckling using a convolutional neural network,” IEEE Signal Processing Letters, vol. 24, no. 12, pp. 1763–1767, 2017.
- [12] ——, “Generative adversarial network-based restoration of speckled SAR images,” in 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2017, pp. 1–5.
- [13] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems, Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Weinberger, Eds., vol. 27. Curran Associates, Inc., 2014. [Online]. Available: https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf
- [14] S. Vitale, G. Ferraioli, and V. Pascazio, “Multi-objective CNN-based algorithm for SAR despeckling,” IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 11, pp. 9336–9349, 2021.
- [15] S. Liu, Y. Lei, L. Zhang, B. Li, W. Hu, and Y.-D. Zhang, “MRDDANet: A multiscale residual dense dual attention network for SAR image denoising,” IEEE Transactions on Geoscience and Remote Sensing, pp. 1–13, 2021.
- [16] J. Ko and S. Lee, “SAR image despeckling using continuous attention module,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 15, pp. 3–19, 2022.
- [17] M. V. Perera, W. G. C. Bandara, J. M. J. Valanarasu, and V. M. Patel, “SAR despeckling using overcomplete convolutional networks,” 2022. [Online]. Available: https://arxiv.org/abs/2205.15906
- [18] ——, “Transformer-based SAR image despeckling,” 2022. [Online]. Available: https://arxiv.org/abs/2201.09355
- [19] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” in Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, Eds., vol. 33. Curran Associates, Inc., 2020, pp. 6840–6851. [Online]. Available: https://proceedings.neurips.cc/paper/2020/file/4c5bcfec8584af0d967f1ab10179ca4b-Paper.pdf
- [20] P. Dhariwal and A. Nichol, “Diffusion models beat GANs on image synthesis,” in Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, Eds., vol. 34. Curran Associates, Inc., 2021, pp. 8780–8794. [Online]. Available: https://proceedings.neurips.cc/paper/2021/file/49ad23d1ec9fa4bd8d77d02681df5cfa-Paper.pdf
- [21] H. Li, Y. Yang, M. Chang, S. Chen, H. Feng, Z. Xu, Q. Li, and Y. Chen, “SRDiff: Single image super-resolution with diffusion probabilistic models,” Neurocomputing, vol. 479, pp. 47–59, 2022. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0925231222000522
- [22] A. Lugmayr, M. Danelljan, A. Romero, F. Yu, R. Timofte, and L. Van Gool, “RePaint: Inpainting using denoising diffusion probabilistic models,” 2022. [Online]. Available: https://arxiv.org/abs/2201.09865
- [23] B. Kawar, M. Elad, S. Ermon, and J. Song, “Denoising diffusion restoration models,” 2022. [Online]. Available: https://arxiv.org/abs/2201.11793
- [24] R. R. Coifman and D. L. Donoho, “Translation-invariant de-noising,” A. Antoniadis and G. Oppenheim, Eds. New York, NY: Springer New York, 1995. [Online]. Available: https://doi.org/10.1007/978-1-4612-2544-7_9
- [25] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in MICCAI, 2015.
- [26] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., vol. 30. Curran Associates, Inc., 2017. [Online]. Available: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
- [27] A. Brock, J. Donahue, and K. Simonyan, “Large scale GAN training for high fidelity natural image synthesis,” in International Conference on Learning Representations, 2019. [Online]. Available: https://openreview.net/forum?id=B1xsqj09Fm
- [28] C. Zhang, P. Yue, D. Tapete, L. Jiang, B. Shangguan, L. Huang, and G. Liu, “A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 166, pp. 183–200, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0924271620301532
- [29] M. Schmitt, L. Hughes, and X. Zhu, “The SEN1-2 dataset for deep learning in SAR-optical data fusion,” in ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. IV-1, Sep. 2018, pp. 141–146.
- [30] F. Argenti, A. Lapini, T. Bianchi, and L. Alparone, “A tutorial on speckle reduction in synthetic aperture radar images,” IEEE Geoscience and Remote Sensing Magazine, vol. 1, no. 3, pp. 6–35, 2013.