---
license: mit
language:
- en
base_model:
- ByteDance/sd2.1-base-zsnr-laionaes5
pipeline_tag: image-text-to-image
tags:
- SPAD
- Photons
- Generative
- ISP
- Vision
- https://arxiv.org/abs/2602.20417
---

## gQIR: Generative Quanta Image Reconstruction

[Aryan Garg](https://aryan-garg.github.io/)<sup>1</sup>, [Sizhuo Ma](https://sizhuoma.netlify.app/)<sup>2</sup>, [Mohit Gupta](https://wisionlab.com/people/mohit-gupta/)<sup>1</sup>

<sup>1</sup> University of Wisconsin-Madison <sup>2</sup> Snap Inc.<br>

![image/gif](https://cdn-uploads.huggingface.co/production/uploads/64ff54f389e1453bf0fba8b9/yMScv7Njv9xJKyGvPkBNb.gif)

Abstract:
Capturing high-quality images from only a few detected photons is a fundamental challenge in computational imaging. Single-photon avalanche diode (SPAD) sensors promise high-quality imaging in regimes where conventional cameras fail, but raw *quanta frames* contain only sparse, noisy, binary photon detections. Recovering a coherent image from a burst of such frames requires handling alignment, denoising, and demosaicing (for color) under noise statistics far outside those assumed by standard restoration pipelines or modern generative models. We present an approach that adapts large text-to-image latent diffusion models to the photon-limited domain of quanta burst imaging. Our method leverages the structural and semantic priors of internet-scale diffusion models while introducing mechanisms to handle Bernoulli photon statistics. By integrating latent-space restoration with burst-level spatio-temporal reasoning, our approach produces reconstructions that are both photometrically faithful and perceptually pleasing, even under high-speed motion. We evaluate the method on synthetic benchmarks and new real-world datasets, including the first color SPAD burst dataset and a challenging *Extreme-Deforming (XD)* video benchmark. Across all settings, the approach substantially improves perceptual quality over classical and modern learning-based baselines, demonstrating the promise of adapting large generative priors to extreme photon-limited sensing.
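The Bernoulli photon statistics mentioned above can be illustrated with a toy simulation (this is only a sketch of the standard quanta image-formation model, not the paper's pipeline; the flux values, frame count, and maximum-likelihood inversion here are our own assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth photon flux per frame for a toy 64x64 scene (photons/frame).
flux = rng.uniform(0.05, 2.0, size=(64, 64))

# Each quanta frame is binary: under Poisson photon arrivals, a SPAD pixel
# fires in a frame with probability 1 - exp(-flux).
num_frames = 500
p_detect = 1.0 - np.exp(-flux)
frames = rng.random((num_frames, 64, 64)) < p_detect

# Maximum-likelihood flux estimate from the mean binary detection rate,
# inverting the Bernoulli observation model.
rate = frames.mean(axis=0).clip(1e-6, 1.0 - 1e-6)
flux_mle = -np.log1p(-rate)

print(float(np.abs(flux_mle - flux).mean()))  # small; shrinks with more frames
```

A single frame is extremely noisy; the reconstruction problem the paper tackles is that real bursts also involve motion, color mosaics, and far fewer usable frames than this clean averaging assumes.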

Code at [github.com/Aryan-Garg/gQIR](https://github.com/Aryan-Garg/gQIR).

#### Cite Us:
```bibtex
@InProceedings{garg_2026_gqir,
    author    = {Garg, Aryan and Ma, Sizhuo and Gupta, Mohit},
    title     = {gQIR: Generative Quanta Image Reconstruction},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
}
```