aRy4n committed (verified) commit 9719e2e · 1 parent: 4f43f99

Update README.md

Files changed (1): README.md +51 -3
README.md CHANGED
@@ -1,3 +1,51 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - en
+ base_model:
+ - ByteDance/sd2.1-base-zsnr-laionaes5
+ pipeline_tag: image-text-to-image
+ tags:
+ - SPAD
+ - Photons
+ - Generative
+ - ISP
+ - Vision
+ - https://arxiv.org/abs/2602.20417
+ ---
+
+ ## gQIR: Generative Quanta Image Reconstruction
+
+ [Aryan Garg](https://aryan-garg.github.io/)<sup>1</sup>, [Sizhuo Ma](https://sizhuoma.netlify.app/)<sup>2</sup>, [Mohit Gupta](https://wisionlab.com/people/mohit-gupta/)<sup>1</sup>
+
+ <sup>1</sup> University of Wisconsin-Madison <sup>2</sup> Snap Inc.<br>
+
+ ![color_spads](assets/README_teaser_color_SPAD.png)
+
28
+
29
+ Abstract:
30
+ Capturing high-quality images from only a few detected photons is a fundamental challenge in computational imaging.
31
+ Single-photon avalanche diode (SPAD) sensors promise high-quality imaging in regimes where conventional cameras fail, but raw *quanta frames* contain only sparse, noisy, binary photon detections.
32
+ Recovering a coherent image from a burst of such frames requires handling alignment, denoising, and demosaicing (for color) under noise statistics far outside those assumed by standard restoration pipelines or modern generative models.
33
+ We present an approach that adapts large text-to-image latent diffusion models to the photon-limited domain of quanta burst imaging.
34
+ Our method leverages the structural and semantic priors of internet-scale diffusion models while introducing mechanisms to handle Bernoulli photon statistics.
35
+ By integrating latent-space restoration with burst-level spatio-temporal reasoning, our approach produces reconstructions that are both photometrically faithful and perceptually pleasing, even under high-speed motion.
36
+ We evaluate the method on synthetic benchmarks and new real-world datasets, including the first color SPAD burst dataset and a challenging *Extreme-Deforming (XD)* video benchmark.
37
+ Across all settings, the approach substantially improves perceptual quality over classical and modern learning-based baselines, demonstrating the promise of adapting large generative priors to extreme photon-limited sensing.
38
+
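To make the Bernoulli photon statistics mentioned above concrete, here is a minimal NumPy sketch of the standard SPAD quanta-imaging model: each pixel of a binary frame fires with probability `1 - exp(-flux)`, and the flux can be recovered from the mean of a burst via the maximum-likelihood inverse. This is a generic illustration of the sensing model, not code from the gQIR release; the function names `simulate_quanta_burst` and `mle_flux` are hypothetical.

```python
import numpy as np

def simulate_quanta_burst(flux, n_frames, rng=None):
    """Simulate a burst of binary SPAD (quanta) frames from a normalized
    photon-flux image using the Bernoulli model P(detect) = 1 - exp(-flux)."""
    rng = np.random.default_rng() if rng is None else rng
    p_detect = 1.0 - np.exp(-flux)  # per-pixel detection probability
    # Each frame is an independent Bernoulli draw at every pixel.
    return (rng.random((n_frames, *flux.shape)) < p_detect).astype(np.uint8)

def mle_flux(frames):
    """Maximum-likelihood flux estimate from binary frames:
    flux_hat = -log(1 - mean), clipped away from the saturation point."""
    mean = frames.mean(axis=0)
    return -np.log1p(-np.clip(mean, 0.0, 1.0 - 1e-6))
```

With enough frames the MLE recovers the flux well, but the single-frame input to a reconstruction network remains extremely sparse and binary, which is the regime the abstract describes.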
+
+ Code at [github.com/Aryan-Garg/gQIR](https://github.com/Aryan-Garg/gQIR).
+
+ #### Cite Us:
+ ```bibtex
+ @InProceedings{garg_2026_gqir,
+     author    = {Garg, Aryan and Ma, Sizhuo and Gupta, Mohit},
+     title     = {gQIR: Generative Quanta Image Reconstruction},
+     booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+     month     = {June},
+     year      = {2026},
+ }
+ ```