---
license: mit
language:
- en
base_model:
- ByteDance/sd2.1-base-zsnr-laionaes5
pipeline_tag: image-text-to-image
tags:
- SPAD
- Photons
- Generative
- ISP
- Vision
- https://arxiv.org/abs/2602.20417
---

## gQIR: Generative Quanta Image Reconstruction

[Aryan Garg](https://aryan-garg.github.io/)<sup>1</sup>, [Sizhuo Ma](https://sizhuoma.netlify.app/)<sup>2</sup>, [Mohit Gupta](https://wisionlab.com/people/mohit-gupta/)<sup>1</sup>

<sup>1</sup> University of Wisconsin-Madison <sup>2</sup> Snap Inc.<br>

![image/gif](https://cdn-uploads.huggingface.co/production/uploads/64ff54f389e1453bf0fba8b9/yMScv7Njv9xJKyGvPkBNb.gif)

Abstract:
Capturing high-quality images from only a few detected photons is a fundamental challenge in computational imaging. Single-photon avalanche diode (SPAD) sensors promise high-quality imaging in regimes where conventional cameras fail, but raw *quanta frames* contain only sparse, noisy, binary photon detections. Recovering a coherent image from a burst of such frames requires handling alignment, denoising, and demosaicing (for color) under noise statistics far outside those assumed by standard restoration pipelines or modern generative models. We present an approach that adapts large text-to-image latent diffusion models to the photon-limited domain of quanta burst imaging. Our method leverages the structural and semantic priors of internet-scale diffusion models while introducing mechanisms to handle Bernoulli photon statistics. By integrating latent-space restoration with burst-level spatio-temporal reasoning, our approach produces reconstructions that are both photometrically faithful and perceptually pleasing, even under high-speed motion. We evaluate the method on synthetic benchmarks and new real-world datasets, including the first color SPAD burst dataset and a challenging *Extreme-Deforming (XD)* video benchmark. Across all settings, the approach substantially improves perceptual quality over classical and modern learning-based baselines, demonstrating the promise of adapting large generative priors to extreme photon-limited sensing.
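The Bernoulli photon statistics mentioned above can be illustrated with a toy simulation (this is only a sketch of the standard quanta image-formation model, not the paper's pipeline; the flux values, frame count, and maximum-likelihood inversion here are our own assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth photon flux per frame for a toy 64x64 scene (photons/frame).
flux = rng.uniform(0.05, 2.0, size=(64, 64))

# Each quanta frame is binary: under Poisson photon arrivals, a SPAD pixel
# fires in a frame with probability 1 - exp(-flux).
num_frames = 500
p_detect = 1.0 - np.exp(-flux)
frames = rng.random((num_frames, 64, 64)) < p_detect

# Maximum-likelihood flux estimate from the mean binary detection rate,
# inverting the Bernoulli observation model.
rate = frames.mean(axis=0).clip(1e-6, 1.0 - 1e-6)
flux_mle = -np.log1p(-rate)

print(float(np.abs(flux_mle - flux).mean()))  # small; shrinks with more frames
```

A single frame is extremely noisy; the reconstruction problem the paper tackles is that real bursts also involve motion, color mosaics, and far fewer usable frames than this clean averaging assumes.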

Code at [github.com/Aryan-Garg/gQIR](https://github.com/Aryan-Garg/gQIR).

#### Cite Us:
```bibtex
@InProceedings{garg_2026_gqir,
    author    = {Garg, Aryan and Ma, Sizhuo and Gupta, Mohit},
    title     = {gQIR: Generative Quanta Image Reconstruction},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
}
```