metadata
license: cc-by-4.0
language:
  - en
base_model:
  - ByteDance/sd2.1-base-zsnr-laionaes5
pipeline_tag: image-text-to-image
tags:
  - SPAD
  - Photons
  - Generative
  - ISP
datasets:
  - aRy4n/eXtreme-Deformable
  - aRy4n/real-color-SPAD-indoor6
metrics:
  - type: accuracy
    split: test
    task:
      type: video-to-image
      name: Burst Reconstruction

gQIR: Generative Quanta Image Reconstruction

Aryan Garg¹, Sizhuo Ma², Mohit Gupta¹

¹ University of Wisconsin-Madison, ² Snap, Inc.

[Image: color_spads]

All model weights are now available for download:

| Color-Model Name | Stage | Bit Depth | 🤗 Download Link |
|---|---|---|---|
| qVAE | Stage 1 | 1-bit | 1965000.pt |
| Adversarial Diffusion LoRA-UNet | Stage 2 | 1-bit | state_dict.pth |
| qVAE | Stage 1 | 3-bit | 0105000.pt |
| Adversarial Diffusion LoRA-UNet | Stage 2 | 3-bit | state_dict.pth |
| FusionViT | Stage 3 | 3-bit | fusion_vit_0050000.pt |
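As a rough illustration of how the checkpoints listed above group by bit depth, here is a minimal Python sketch. The `Checkpoint` tuple and the `checkpoints_for` and `download` helpers are hypothetical names introduced only for this example and are not part of the gQIR code. The rows are copied from the list above. The repository ID and the in-repo file paths are not stated in this card, so `download` takes both as arguments rather than guessing them (the two Stage 2 files share the name `state_dict.pth`, so they presumably live in different folders). `hf_hub_download` is the standard `huggingface_hub` call and is imported lazily so the lookup part runs without that package installed.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    model: str
    stage: int
    bit_depth: int
    filename: str  # name as listed in the card; the in-repo path may differ


# Rows copied from the checkpoint list in this card.
CHECKPOINTS = [
    Checkpoint("qVAE", 1, 1, "1965000.pt"),
    Checkpoint("Adversarial Diffusion LoRA-UNet", 2, 1, "state_dict.pth"),
    Checkpoint("qVAE", 1, 3, "0105000.pt"),
    Checkpoint("Adversarial Diffusion LoRA-UNet", 2, 3, "state_dict.pth"),
    Checkpoint("FusionViT", 3, 3, "fusion_vit_0050000.pt"),
]


def checkpoints_for(bit_depth: int) -> list[Checkpoint]:
    """Return the listed checkpoints for one bit depth, ordered by stage."""
    rows = [c for c in CHECKPOINTS if c.bit_depth == bit_depth]
    return sorted(rows, key=lambda c: c.stage)


def download(repo_id: str, path_in_repo: str) -> str:
    """Fetch one file from the Hugging Face Hub and return its local path.

    Both arguments must be supplied by the caller (assumption: this card
    does not state the repository ID or the folder layout).
    """
    from huggingface_hub import hf_hub_download  # lazy import

    return hf_hub_download(repo_id=repo_id, filename=path_in_repo)


if __name__ == "__main__":
    for ckpt in checkpoints_for(3):
        print(f"Stage {ckpt.stage}: {ckpt.model} -> {ckpt.filename}")
```

Calling `checkpoints_for(1)` returns only the Stage 1 and Stage 2 entries, since the list above includes a FusionViT checkpoint for 3-bit only.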

Code at github.com/Aryan-Garg/gQIR

arXiv version: arxiv.org/abs/2602.20417

Cite Us:

@InProceedings{garg_2026_gqir,
    author    = {Garg, Aryan and Ma, Sizhuo and Gupta, Mohit},
    title     = {gQIR: Generative Quanta Image Reconstruction},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
}