gblsr / README.md

Add arXiv + DOI badges to the model card

c11b2ea verified about 6 hours ago

4.7 kB

	---
	license: cc-by-nc-4.0
	tags:
	- super-resolution
	- arbitrary-scale-super-resolution
	- image-restoration
	- implicit-neural-representation
	- gblsr
	pipeline_tag: image-to-image
	metrics:
	- psnr
	- ssim
	- lpips
	---

	# GB-LSR: Global-Bandwidth Local Spectral Representation

	[![arXiv](https://img.shields.io/badge/arXiv-2606.19617-b31b1b.svg)](https://arxiv.org/abs/2606.19617) [![DOI](https://img.shields.io/badge/DOI-10.48550%2FarXiv.2606.19617-blue.svg)](https://doi.org/10.48550/arXiv.2606.19617)

	Trained weights for GB-LSR, a fixed-grid local spectral image
	representation: the image is partitioned into non-overlapping patches, each
	carrying a small block of truncated-Fourier coefficients predicted from a
	shared convolutional encoder, with a single trainable scalar bandwidth shared
	globally across all patches. Reconstruction at any continuous coordinate is a
	fixed-size basis contraction whose cost is independent of image size.

	Code: <https://github.com/KempnerInstitute/gblsr> · Paper:
	[arXiv:2606.19617](https://arxiv.org/abs/2606.19617).

	## Models

	\| Folder \| Model \| Params \| Notes \|
	\|---\|---\|---\|---\|
	\| `gblsr-scalar` \| GB-LSR-Scalar (native reconstruction) \| 0.99M \| main native-benchmark variant \|
	\| `gblsr-scalar-asr` \| GB-LSR-Scalar-ASR (base) \| 22.02M \| arbitrary-scale SR, RDN encoder \|
	\| `gblsr-scalar-asr-noLE` \| ASR, no local ensemble \| 22.02M \| faster; trained + eval'd without 4-corner LE \|
	\| `gblsr-scalar-asr-nf96-noLE` \| ASR, wider encoder + noLE \| 24.93M \| small quality gain \|
	\| `gblsr-scalar-asr-nf48-noLE` \| ASR, narrower encoder + noLE \| 20.61M \| aggressive-efficiency \|

	One representative seed (seed 0) is released per model; per-seed variance is
	negligible (e.g. ASR PSNR-Y std about 0.004 dB).

	## Usage

	Install the package (provides the architecture), then load the weights:

	```bash
	pip install git+https://github.com/KempnerInstitute/gblsr
	```

	Native reconstruction (GB-LSR-Scalar):

	```python
	import torch
	from safetensors.torch import load_file
	from gblsr import build_model, ModelConfig, BasisConfig, EncoderConfig

	model = build_model(
	ModelConfig(arm="local_spectral", image_size=256, patch_size=32,
	basis=BasisConfig(patch_size=32, p_max=16, s_e_range=(0.25, 2.0)),
	encoder=EncoderConfig()),
	bandwidth_mode="global_scalar", adapt_order=False).eval()
	model.load_state_dict(load_file("gblsr-scalar/model.safetensors"))

	out = model(torch.rand(1, 3, 256, 256)) # dict of outputs
	recon = out["recon"] # (1, 3, 256, 256) reconstruction
	```

	Arbitrary-scale SR (any GB-LSR-Scalar-ASR variant):

	```python
	import torch
	from safetensors.torch import load_file
	from gblsr import GBLSRScalarASR

	# match encoder_cfg / decoder_cfg to the folder's config.json
	model = GBLSRScalarASR(encoder_cfg={"num_features": 96},
	decoder_cfg={"local_ensemble": False}).eval() # nf96+noLE
	model.load_state_dict(load_file("gblsr-scalar-asr-nf96-noLE/model.safetensors"))

	lr = torch.rand(1, 3, 64, 64)
	hr = model.predict_full(lr, H_q=256, W_q=256) # (1, 3, 256, 256), any target size
	```

	Each model's exact build config is in its `config.json`.

	## Training

	All variants trained on a DTD + DIV2K mixture (native) / DIV2K (ASR), three
	seeds, AdamW, pointwise MSE. The ASR models train for 1,000,000 steps. See the
	paper's appendix for the full protocol.

	## Evaluation (summary)

	Under a fixed matched-budget protocol, GB-LSR-Scalar outperforms matched-budget
	amortized LIIF / LTE / WIRE re-implementations on the standardized
	256x256 native-reconstruction benchmark while running at roughly one-quarter of
	the slowest baseline's inference cost. The arbitrary-scale base model runs
	1.44x faster than LIIF-RDN and 3.25x faster than LTE-SwinIR at x4 with
	competitive PSNR-Y; the noLE / nf-variants trade quality for further speed and
	memory. Full tables, baselines, and the scope of the comparison are in the
	paper; these are not state-of-the-art claims over the broader SR literature.

	## License and attribution

	Released under CC BY-NC 4.0 (non-commercial). The weights are derived from
	training on DIV2K ("for academic research purpose only") and DTD;
	downstream use must respect those source-dataset terms. The `gblsr` code is
	BSD-3-Clause; this non-commercial term applies to the trained weights here.

	## Citation

	```bibtex
	@article{shad2026gblsr,
	title = {GB-LSR: A Fast Local Spectral Image Representation with a
	Single Global Bandwidth for Continuous Reconstruction and
	Super-Resolution},
	author = {Shad, Max and Khoshnevis, Naeem},
	journal = {arXiv preprint arXiv:2606.19617},
	year = {2026}
	}
	```