---
datasets:
- FFHQ256
- Satellite_PCRS
- BSDS500
- CelebA
library_name: PyTorch
tags:
- compression
- compressAI
- VAE
pipeline_tag: image-to-image
license: apache-2.0
---
# Description of available models
The models are variational autoencoders (VAEs) and compressive autoencoders (CAEs), each equipped with an additional variance decoder, that can be used to restore images with the
Variational Bayes Latent Estimation (VBLE) algorithm.
- **Associated GitHub Repository:** [Github Repo](https://github.com/MaudBqrd/VBLExz)
- **Associated Papers:** [Deep Priors for satellite image restoration with accurate uncertainties](https://huggingface.co/papers/2412.04130),
[Variational Bayes image restoration with compressive autoencoders](https://ieeexplore.ieee.org/abstract/document/10982450)
## Model Details
The models are simple VAEs trained on CelebA, and the CAEs mbt [1] and cheng [2] trained on several datasets: FFHQ [3], BSDS500 [4],
and a realistic satellite dataset simulated from PCRS [5].
## Quick description of all models
<ins>1lvae-fcb_gamma-variable_M-64_celeba-wb_std-diagonal</ins>:
- Architecture: VAE with a fully connected bottleneck, ```latent_dimension = 64```.
- Dataset: CelebA (grayscale)
<ins>1lvae-fcb_gamma-variable_M-256_celeba-wb_std-diagonal</ins>:
- Architecture: VAE with a fully connected bottleneck, ```latent_dimension = 256```.
- Dataset: CelebA (grayscale)
<ins>1lvae-light_gamma-variable_M-64_celeba-wb_std-diagonal</ins>:
- Architecture: VAE with a fully convolutional bottleneck, ```latent_dimension = 64```.
- Dataset: CelebA (grayscale)
<ins>cheng_0.0483_bsd_std-diagonal</ins>
- Architecture: cheng [2] model with latent dimension ```M = 192```
- Dataset: BSDS500 (RGB)
- Bitrate parameter ```alpha = 0.0483``` (medium bitrate model)
<ins>cheng_0.0483_ffhq_std-diagonal</ins>
- Architecture: cheng [2] model with latent dimension ```M = 192```
- Dataset: FFHQ256 (RGB)
- Bitrate parameter ```alpha = 0.0483``` (medium bitrate model)
<ins>cheng_0.1800_bsd_std-diagonal</ins>
- Architecture: cheng [2] model with latent dimension ```M = 192```
- Dataset: BSDS500 (RGB)
- Bitrate parameter ```alpha = 0.1800``` (high bitrate model)
<ins>cheng_0.1800_ffhq_std-diagonal</ins>
- Architecture: cheng [2] model with latent dimension ```M = 192```
- Dataset: FFHQ256 (RGB)
- Bitrate parameter ```alpha = 0.1800``` (high bitrate model)
<ins>mbt_0.0483_bsd_std-diagonal</ins>
- Architecture: mbt [1] model with latent dimension ```M = 320```
- Dataset: BSDS500 (RGB)
- Bitrate parameter ```alpha = 0.0483``` (medium bitrate model)
<ins>mbt_0.0483_ffhq_std-diagonal</ins>
- Architecture: mbt [1] model with latent dimension ```M = 320```
- Dataset: FFHQ256 (RGB)
- Bitrate parameter ```alpha = 0.0483``` (medium bitrate model)
<ins>mbt_0.1800_bsd_std-diagonal</ins>
- Architecture: mbt [1] model with latent dimension ```M = 320```
- Dataset: BSDS500 (RGB)
- Bitrate parameter ```alpha = 0.1800``` (high bitrate model)
<ins>mbt_0.1800_ffhq_std-diagonal</ins>
- Architecture: mbt [1] model with latent dimension ```M = 320```
- Dataset: FFHQ256 (RGB)
- Bitrate parameter ```alpha = 0.1800``` (high bitrate model)
<ins>mbt_25cm_PCRS_0.3600_std-diagonal</ins>:
- Architecture: mbt [1] model with latent dimension ```M = 320```.
- Dataset: PCRS (satellite images downsampled to 25 cm resolution, grayscale)
- Bitrate parameter ```alpha = 0.3600``` (very high bitrate model)
<ins>mbt_50cm_PCRS_0.3600_std-diagonal</ins>:
- Architecture: mbt [1] model with latent dimension ```M = 320```.
- Dataset: PCRS (satellite images downsampled to 50 cm resolution, grayscale)
- Bitrate parameter ```alpha = 0.3600``` (very high bitrate model)
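The checkpoint names above follow a regular pattern, so the architecture, bitrate parameter, and dataset can be recovered programmatically. A minimal sketch (the ```parse_cae_name``` helper is hypothetical, not part of the repository):

```python
def parse_cae_name(name: str) -> dict:
    """Parse a CAE checkpoint name such as 'mbt_0.1800_ffhq_std-diagonal'.

    Hypothetical helper for illustration only; it is not part of the VBLE
    repository, and the satellite checkpoints (e.g.
    'mbt_25cm_PCRS_0.3600_std-diagonal') use a different field order that
    this sketch does not handle.
    """
    arch, alpha, dataset, _std = name.split("_")
    # Latent dimension is fixed by the architecture (see the list above).
    latent_dim = {"mbt": 320, "cheng": 192}[arch]
    return {"arch": arch, "alpha": float(alpha), "dataset": dataset, "M": latent_dim}
```

For example, ```parse_cae_name("cheng_0.0483_bsd_std-diagonal")``` yields the cheng architecture with ```M = 192``` and ```alpha = 0.0483```.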
## Training Details
### Training Procedure
**Pretraining**: None for VAEs, pretrained model from [CompressAI](https://interdigitalinc.github.io/CompressAI/) for CAEs.
**Two-stage training**:
- Encoder and decoder finetuning.
- Variance decoder training (from scratch), with an MLE loss and a diagonal Gaussian decoder model.
<ins>VAE specificities</ins>: Isotropic Gaussian decoder model with standard deviation ```gamma```, optimized as a neural network parameter during the first stage.
<ins>CAE specificities</ins>: Fixed bitrate parameter ```alpha``` (the CAE equivalent of ```gamma```) during the first stage.
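With a diagonal Gaussian decoder model, the second-stage MLE loss reduces to the per-pixel Gaussian negative log-likelihood. A minimal pure-Python sketch (function name and flat-list shapes are illustrative, not the repository's implementation, where ```mu``` comes from the decoder and ```var``` from the variance decoder):

```python
import math

def diag_gaussian_nll(x, mu, var):
    """Negative log-likelihood of x under N(mu, diag(var)).

    Illustrative sketch of the second-stage MLE objective; in practice the
    arguments are per-pixel tensors rather than flat lists.
    """
    nll = 0.0
    for xi, mi, vi in zip(x, mu, var):
        nll += 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return nll

# A unit-variance, zero-error pixel contributes 0.5 * log(2*pi) nats.
```

Minimizing this loss over the variance decoder's parameters lets the model predict a per-pixel uncertainty alongside the reconstruction.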
#### Training Hyperparameters
- **Training regime:** fp32
**First stage**
- ```lr=1e-4```
- ```optimizer=Adam```
- ```batch_size=256``` for VAEs, ```batch_size=16``` for CAEs
- ```patch_size=64``` for CelebA, ```patch_size=256``` otherwise
- ```clip_max_norm=20``` (gradient clipping)
- ```parameters=autoencoder```
- ```loss_type=elbo```
- ```sample_rate_scale=false``` (whether to modulate the bitrate for CAEs during training)
**Second stage**
- ```lr=1e-4```
- ```optimizer=Adam```
- ```batch_size=256``` for VAEs, ```batch_size=16``` for CAEs
- ```patch_size=64``` for CelebA, ```patch_size=256``` otherwise
- ```clip_max_norm=1``` (gradient clipping)
- ```parameters=dec_variance```
- ```loss_type=elbo```
- ```sample_rate_scale=true``` for CAEs, ```false``` for VAEs (whether to modulate the bitrate during training)
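For reference, the two stages can be summarized as plain config dictionaries. The values are copied from the lists above; the dictionary layout and function names are illustrative, not the repository's actual configuration format:

```python
# Shared across both stages (from the hyperparameter lists above).
COMMON = {"lr": 1e-4, "optimizer": "Adam", "loss_type": "elbo"}

STAGE_1 = {
    **COMMON,
    "parameters": "autoencoder",   # finetune encoder + decoder
    "clip_max_norm": 20,           # gradient clipping
    "sample_rate_scale": False,
}

def stage2_config(model_kind: str) -> dict:
    """Second-stage config; bitrate modulation is enabled only for CAEs."""
    return {
        **COMMON,
        "parameters": "dec_variance",  # train only the variance decoder
        "clip_max_norm": 1,
        "sample_rate_scale": model_kind == "cae",
    }

def batch_size(model_kind: str) -> int:
    """256 for VAEs, 16 for CAEs."""
    return 256 if model_kind == "vae" else 16
```

The patch size follows the dataset rather than the stage: 64 for CelebA, 256 otherwise.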
### Model Architecture and Objective
**VAEs**
1lvae-fcb: 4 convolutional layers in each module (encoder, decoder, variance decoder), fully connected bottleneck, ```latent_dimension = M``` specified in the name.
1lvae-light: 4 convolutional layers in each module (encoder, decoder, variance decoder), fully convolutional bottleneck,
```latent_dimension = M``` specified in the name.
**CAEs**
CAE with a hyperprior (i.e., two latent variables) and an autoregressive module. See [1] and [2] for the mbt and cheng architectures.
## Citation
**APA:**
Biquard, M., Chabert, M., Genin, F., Latry, C., & Oberlin, T. (2025). Deep priors for satellite image restoration with accurate uncertainties. IEEE Transactions on Geoscience and Remote Sensing, 63, 1-16.
**BibTeX:**
```bibtex
@ARTICLE{11258607,
  author={Biquard, Maud and Chabert, Marie and Genin, Florence and Latry, Christophe and Oberlin, Thomas},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  title={Deep Priors for Satellite Image Restoration With Accurate Uncertainties},
  year={2025},
  volume={63},
  pages={1-16},
  keywords={Image restoration;Inverse problems;Uncertainty;Satellites;Satellite images;Optical imaging;Image resolution;Optical sensors;Image coding;Autoencoders;Deep regularization (DR);latent optimization;plug-and-play (PnP) methods;posterior sampling;satellite image restoration (IR);uncertainty quantification (UQ)},
  doi={10.1109/TGRS.2025.3633774}
}
```
## Model Card Contact
Contact: maud.biquard@laposte.net
## References
[1] Minnen, D., Ballé, J., & Toderici, G. D. (2018). Joint autoregressive and hierarchical priors for learned image compression. Advances in neural information processing systems, 31.
[2] Cheng, Z., Sun, H., Takeuchi, M., & Katto, J. (2020). Learned image compression with discretized gaussian mixture likelihoods and attention modules. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 7939-7948).
[3] Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 4401-4410).
[4] Martin, D., Fowlkes, C., Tal, D., & Malik, J. (2001, July). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings eighth IEEE international conference on computer vision. ICCV 2001 (Vol. 2, pp. 416-423). IEEE.
[5] Institut Géographique National (IGN), PCRS dataset: https://www.data.gouv.fr/datasets/pcrs/