| --- |
| license: mit |
| tags: |
| - image-super-resolution |
| - infrared |
| - remote-sensing |
| - ntire2026 |
| - pytorch |
| datasets: |
| - custom |
| metrics: |
| - psnr |
| - ssim |
| pipeline_tag: image-to-image |
| --- |
| |
| # NTIRE2026 Infrared Super-Resolution Models |
|
|
| Pre-trained models for the **NTIRE2026 Infrared Image Super-Resolution (x4) Challenge**. |
|
|
| ## Competition |
|
|
| | Item | Detail | |
| |------|--------| |
| | **Task** | Single-image super-resolution (x4) for infrared remote sensing | |
| | **Metric** | `Score = PSNR + 20 × SSIM`, computed on the intensity channel after shaving a 4 px border | |
| | **Dataset** | 919 train, 52 val, 222 test infrared images | |
|
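As a rough illustration of the metric row above, the sketch below shaves the border, computes PSNR, and combines it with an SSIM value. The helper names (`shave`, `psnr`, `challenge_score`) and the `[0, 1]` intensity range are assumptions for illustration, not the official evaluation script. SSIM is taken as already computed.

```python
import numpy as np


def shave(img: np.ndarray, border: int = 4) -> np.ndarray:
    """Drop `border` pixels from every side of an H x W image."""
    if border == 0:
        return img
    return img[border:-border, border:-border]


def psnr(sr: np.ndarray, hr: np.ndarray, border: int = 4, peak: float = 1.0) -> float:
    """PSNR in dB between two intensity images scaled to [0, peak] (assumed range)."""
    diff = shave(sr, border).astype(np.float64) - shave(hr, border).astype(np.float64)
    mse = float(np.mean(diff ** 2))
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))


def challenge_score(psnr_db: float, ssim: float) -> float:
    """Combine the two metrics as Score = PSNR + 20 * SSIM."""
    return psnr_db + 20.0 * ssim
```

Because SSIM lies in `[0, 1]`, the factor of 20 puts it on a scale comparable to PSNR in dB, so a 0.01 SSIM gain is worth as much as a 0.2 dB PSNR gain.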
|
| ## Available Models |
|
|
| | Model | File | Score | PSNR | SSIM | Params | Architecture | |
| |-------|------|-------|------|------|--------|-------------| |
| | RFRSR v10 (split) | `rfrsr_v10_split_iter46k.pth` | **51.57** | 33.88 | 0.8822 | 2.05M | Recurrent Feature Refinement | |
| | RFRSR v2 | `rfrsr_v2_iter250.pth` | 27.59 | 15.89 | 0.5806 | 2.05M | Recurrent Feature Refinement | |
| | MambaOutRS v12 | `mambaoutrs_v12_iter500.pth` | 27.52 | 16.03 | 0.5811 | 2.96M | Gated CNN + Fourier Filter Gate | |
| | MambaOutRS v10 | `mambaoutrs_v10_iter500.pth` | 25.64 | 14.81 | 0.5372 | 2.96M | Gated CNN + Fourier Filter Gate | |
| | HSRMamba | `hsrmamba_iter20k.pth` | 25.56 | 14.74 | 0.5365 | 2.4M | Context-SSM + Spectral Reordering | |
|
|
| ## Usage |
|
|
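| ```python |
| import torch |
| |
| # Load the checkpoint on CPU (example with the best RFRSR model) |
| checkpoint = torch.load("rfrsr_v10_split_iter46k.pth", map_location="cpu") |
| |
| # BasicSR usually nests weights under "params_ema" or "params" (assumed here); |
| # otherwise treat the loaded object as the state dict itself. |
| state_dict = checkpoint.get("params_ema", checkpoint.get("params", checkpoint)) |
| print(f"Loaded {len(state_dict)} tensors") |
| |
| # The network class lives in the repository. For the full inference pipeline |
| # with TTA and post-processing, see https://github.com/danghoangnhan/NTIRE2026 |
| ``` |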
|
|
| ### Full Inference with Submission Creator |
|
|
| ```bash |
| git clone https://github.com/danghoangnhan/NTIRE2026.git |
| cd NTIRE2026 |
| pip install -e . |
| |
| python src/create_submission.py \ |
| --input-dir data/test_LR_X4/X4 \ |
| --output-dir output/ \ |
| --weights-path rfrsr_v10_split_iter46k.pth \ |
| --arch-config src/options/train/train_rfr_sr_x4_v10_split.yml \ |
| --tta |
| ``` |
|
|
| ## Architecture Details |
|
|
| ### RFRSR (Best) |
| - 3-iteration recurrent feature refinement loop |
| - 6 residual blocks, 48 channels per stage |
| - PixelShuffle 4x upsampling |
| - Only 2.05M parameters |
|
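The PixelShuffle 4x step listed above can be pictured as a pure rearrangement of channels into space. The NumPy sketch below mirrors the channel layout of PyTorch's `torch.nn.PixelShuffle`. That RFRSR uses exactly that layer is an assumption, and `pixel_shuffle` is an illustrative helper rather than code from the repository.

```python
import numpy as np


def pixel_shuffle(x: np.ndarray, scale: int = 4) -> np.ndarray:
    """Rearrange (C * scale**2, H, W) features into (C, H * scale, W * scale)."""
    c_in, h, w = x.shape
    if c_in % (scale * scale) != 0:
        raise ValueError("channel count must be divisible by scale**2")
    c_out = c_in // (scale * scale)
    # Split the channel axis into (C, scale, scale), then interleave the two
    # scale axes with the spatial axes.
    x = x.reshape(c_out, scale, scale, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # (C, H, scale, W, scale)
    return x.reshape(c_out, h * scale, w * scale)
```

For a single-channel x4 output, the layer before the shuffle must therefore produce 16 channels, each of which fills one position in every 4x4 output tile.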
|
| ### MambaOutRS |
| - Gated CNN with Fourier Filter Gate (no SSM) |
| - 4 stages with [6, 6, 6, 6] blocks, embedding dimension 48 |
| - 2.96M parameters |
|
|
| ### HSRMamba |
| - Context Selective State Space Model |
| - Global Spectral Reordering for token reorganization |
| - Uniquely stable training: quality keeps improving past 20k iterations |
| - 2.4M parameters |
|
|
| ## Key Findings |
|
|
| 1. **Smaller models win**: the 2.05M-parameter RFRSR outperforms the 9.5M-parameter MiM-ISTD |
| 2. **SSIM in the loss is critical**: it prevents catastrophic training collapse |
| 3. **Most models peak early** (250-1000 iterations), then degrade |
| 4. **TTA adds +0.1-0.3 dB** at test time with no retraining |
|
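A common form of test-time augmentation (TTA) in super-resolution is the x8 geometric self-ensemble: run the model on all rotations and flips of the input, undo each transform on the output, and average. The NumPy sketch below assumes that form. Whether the repository's `--tta` flag does exactly this is not confirmed here, and `tta_x8` and `model_fn` are illustrative names.

```python
import numpy as np


def tta_x8(model_fn, lr: np.ndarray) -> np.ndarray:
    """Average predictions over 4 rotations x 2 flips of an H x W input."""
    outputs = []
    for k in range(4):
        for flip in (False, True):
            aug = np.rot90(lr, k)
            if flip:
                aug = np.fliplr(aug)
            sr = model_fn(aug)
            # Undo the augmentation in reverse order on the output.
            if flip:
                sr = np.fliplr(sr)
            outputs.append(np.rot90(sr, -k))
    return np.mean(outputs, axis=0)


# Stand-in "model": nearest-neighbour x4 upscaling (hypothetical, for demonstration).
def nearest_x4(img: np.ndarray) -> np.ndarray:
    return np.repeat(np.repeat(img, 4, axis=0), 4, axis=1)


sr = tta_x8(nearest_x4, np.arange(12, dtype=np.float64).reshape(3, 4))
```

The gain needs no retraining, but inference runs eight forward passes per image instead of one.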
|
| ## Training |
|
|
| All models were trained with: |
| - **Loss**: Charbonnier + SSIM + Gradient + FFT + SWT (IRSRCombinedLoss) |
| - **Optimizer**: Adam |
| - **Framework**: BasicSR |
| - **Input**: Grayscale infrared images (`force_gray: true`) |
|
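To make two of the loss terms listed above concrete, here is a NumPy sketch of the textbook Charbonnier loss and a finite-difference gradient loss. The exact definitions and weights inside `IRSRCombinedLoss` are not given in this card, so the function names, the `eps` value, and the L1 form of the gradient term are assumptions. The training code itself would use differentiable PyTorch tensors.

```python
import numpy as np


def charbonnier(pred: np.ndarray, target: np.ndarray, eps: float = 1e-3) -> float:
    """Smooth L1-like penalty: mean of sqrt(diff^2 + eps^2)."""
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    return float(np.mean(np.sqrt(diff ** 2 + eps ** 2)))


def gradient_l1(pred: np.ndarray, target: np.ndarray) -> float:
    """L1 distance between horizontal and vertical finite differences."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    dx = np.abs(np.diff(pred, axis=1) - np.diff(target, axis=1))
    dy = np.abs(np.diff(pred, axis=0) - np.diff(target, axis=0))
    return float(dx.mean() + dy.mean())
```

The Charbonnier term penalizes pixel errors while staying smooth near zero. The gradient term ignores constant brightness offsets and responds only to differences in edges.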
|
| ## License |
|
|
| MIT |
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{ntire2026_infraredsr, |
| title={NTIRE2026 Infrared Super-Resolution: A Study of Architectures and Training Strategies}, |
| author={Daniel Ho}, |
| year={2026}, |
| url={https://github.com/danghoangnhan/NTIRE2026} |
| } |
| ``` |
|
|