---
license: cc-by-4.0
language:
- en
base_model:
- ByteDance/sd2.1-base-zsnr-laionaes5
pipeline_tag: image-text-to-image
tags:
- SPAD
- Photons
- Generative
- ISP
datasets:
- aRy4n/eXtreme-Deformable
- aRy4n/real-color-SPAD-indoor6
metrics:
- type: accuracy
  split: test
  task:
    type: video-to-image
    name: Burst Reconstruction
---

## gQIR: Generative Quanta Image Reconstruction

[Aryan Garg](https://aryan-garg.github.io/)<sup>1</sup>, [Sizhuo Ma](https://sizhuoma.netlify.app/)<sup>2</sup>, [Mohit Gupta](https://wisionlab.com/people/mohit-gupta/)<sup>1</sup>

<sup>1</sup> University of Wisconsin-Madison, <sup>2</sup> Snap, Inc.

![color_spads](assets/README_teaser_color_SPAD.png)

## All model weights are available here now!

| Color-Model Name | Stage | Bit Depth | 🤗 Download Link |
|:---|:---:|:---:|:---|
| qVAE | Stage 1 | 1-bit | [1965000.pt](https://huggingface.co/aRy4n/gQIR/resolve/main/1-bit/1965000.pt) |
| Adversarial Diffusion LoRA-UNet | Stage 2 | 1-bit | [state_dict.pth](https://huggingface.co/aRy4n/gQIR/resolve/main/1-bit/state_dict.pth) |
| qVAE | Stage 1 | 3-bit | [0105000.pt](https://huggingface.co/aRy4n/gQIR/resolve/main/0105000.pt) |
| Adversarial Diffusion LoRA-UNet | Stage 2 | 3-bit | [state_dict.pth](https://huggingface.co/aRy4n/gQIR/resolve/main/state_dict.pth) |
| FusionViT | Stage 3 | 3-bit | [fusion_vit_0050000.pt](https://huggingface.co/aRy4n/gQIR/resolve/main/fusion_vit_0050000.pt) |

Code: [github.com/Aryan-Garg/gQIR](https://github.com/Aryan-Garg/gQIR)

ArXiv version: [arxiv.org/abs/2602.20417](https://arxiv.org/abs/2602.20417)

#### Cite Us:

```bibtex
@InProceedings{garg_2026_gqir,
    author    = {Garg, Aryan and Ma, Sizhuo and Gupta, Mohit},
    title     = {gQIR: Generative Quanta Image Reconstruction},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
}
```
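
#### Downloading the weights from Python (sketch)

The checkpoints in the weights table can also be fetched programmatically. The snippet below is a minimal sketch, not part of the gQIR codebase: it copies the table's URLs verbatim, splits each one into a repo id, revision, and file path, and passes those to `hf_hub_download` from the `huggingface_hub` package. The dictionary keys (`"qVAE"`, `"LoRA-UNet"`, `"FusionViT"`) and the helper names `split_hub_url` and `download_checkpoint` are hypothetical labels chosen for illustration.

```python
from urllib.parse import urlparse

# Checkpoint URLs copied verbatim from the weights table.
# The (model, bit_depth) keys are illustrative labels, not official names.
CHECKPOINT_URLS = {
    ("qVAE", "1-bit"): "https://huggingface.co/aRy4n/gQIR/resolve/main/1-bit/1965000.pt",
    ("LoRA-UNet", "1-bit"): "https://huggingface.co/aRy4n/gQIR/resolve/main/1-bit/state_dict.pth",
    ("qVAE", "3-bit"): "https://huggingface.co/aRy4n/gQIR/resolve/main/0105000.pt",
    ("LoRA-UNet", "3-bit"): "https://huggingface.co/aRy4n/gQIR/resolve/main/state_dict.pth",
    ("FusionViT", "3-bit"): "https://huggingface.co/aRy4n/gQIR/resolve/main/fusion_vit_0050000.pt",
}


def split_hub_url(url):
    """Split a huggingface.co 'resolve' URL into (repo_id, revision, filename)."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected layout: <user>/<repo>/resolve/<revision>/<path inside repo>
    if len(parts) < 5 or parts[2] != "resolve":
        raise ValueError(f"not a Hub 'resolve' URL: {url}")
    repo_id = "/".join(parts[:2])
    revision = parts[3]
    filename = "/".join(parts[4:])
    return repo_id, revision, filename


def download_checkpoint(model, bit_depth, cache_dir=None):
    """Download one checkpoint and return its local file path."""
    # Imported lazily so the URL helpers work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    repo_id, revision, filename = split_hub_url(CHECKPOINT_URLS[(model, bit_depth)])
    return hf_hub_download(
        repo_id=repo_id,
        filename=filename,
        revision=revision,
        cache_dir=cache_dir,
    )
```

For example, `download_checkpoint("qVAE", "1-bit")` would return the local cache path of `1-bit/1965000.pt`. How each checkpoint is then loaded into the Stage 1, 2, and 3 models is defined by the code in the GitHub repository and is not assumed here.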