---
license: mit
tags:
- image-super-resolution
- infrared
- remote-sensing
- ntire2026
- pytorch
datasets:
- custom
metrics:
- psnr
- ssim
pipeline_tag: image-to-image
---

# NTIRE2026 Infrared Super-Resolution Models

Pre-trained models for the **NTIRE2026 Infrared Image Super-Resolution (x4) Challenge**.

## Competition

| Item | Detail |
|------|--------|
| **Task** | Single-image super-resolution (x4) for infrared remote sensing |
| **Metric** | `Score = PSNR + 20 × SSIM` (intensity channel, 4px border shave) |
| **Dataset** | 919 train, 52 val, 222 test infrared images |

## Available Models

| Model | File | Score | PSNR | SSIM | Params | Architecture |
|-------|------|-------|------|------|--------|-------------|
| RFRSR v10 (split) | `rfrsr_v10_split_iter46k.pth` | **51.57** | 33.88 | 0.8822 | 2.05M | Recurrent Feature Refinement |
| RFRSR v2 | `rfrsr_v2_iter250.pth` | 27.59 | 15.89 | 0.5806 | 2.05M | Recurrent Feature Refinement |
| MambaOutRS v12 | `mambaoutrs_v12_iter500.pth` | 27.52 | 16.03 | 0.5811 | 2.96M | Gated CNN + Fourier Filter Gate |
| MambaOutRS v10 | `mambaoutrs_v10_iter500.pth` | 25.64 | 14.81 | 0.5372 | 2.96M | Gated CNN + Fourier Filter Gate |
| HSRMamba | `hsrmamba_iter20k.pth` | 25.56 | 14.74 | 0.5365 | 2.4M | Context-SSM + Spectral Reordering |

## Usage

```python
import torch
from PIL import Image
import numpy as np

# Load model (example with RFRSR)
checkpoint = torch.load("rfrsr_v10_split_iter46k.pth", map_location="cpu")

# For full inference pipeline with TTA and post-processing:
# See https://github.com/danghoangnhan/NTIRE2026
```

### Full Inference with Submission Creator

```bash
git clone https://github.com/danghoangnhan/NTIRE2026.git
cd NTIRE2026
pip install -e .

python src/create_submission.py \
    --input-dir data/test_LR_X4/X4 \
    --output-dir output/ \
    --weights-path rfrsr_v10_split_iter46k.pth \
    --arch-config src/options/train/train_rfr_sr_x4_v10_split.yml \
    --tta
```

## Architecture Details

### RFRSR (Best)

- 3-iteration recurrent feature refinement loop
- 6 residual blocks, 48 channels per stage
- PixelShuffle 4x upsampling
- Only 2.05M parameters

### MambaOutRS

- Gated CNN with Fourier Filter Gate (no SSM)
- 4-stage [6,6,6,6] block design, embedding dimension 48
- 2.96M parameters

### HSRMamba

- Context Selective State Space Model
- Global Spectral Reordering for token reorganization
- Uniquely stable: keeps improving past 20k iterations
- 2.4M parameters

## Key Findings

1. **Smaller models win**: 2.05M RFRSR > 9.5M MiM-ISTD
2. **SSIM in the loss is critical**: it prevents catastrophic training collapse
3. **Most models peak early** (250-1000 iterations), then degrade
4. **TTA adds +0.1-0.3 dB** at test time with no retraining

## Training

All models were trained with:

- **Loss**: Charbonnier + SSIM + Gradient + FFT + SWT (IRSRCombinedLoss)
- **Optimizer**: Adam
- **Framework**: BasicSR
- **Input**: Grayscale infrared images (`force_gray: true`)

## License

MIT

## Citation

```bibtex
@misc{ntire2026_infraredsr,
  title={NTIRE2026 Infrared Super-Resolution: A Study of Architectures and Training Strategies},
  author={Daniel Ho},
  year={2026},
  url={https://github.com/danghoangnhan/NTIRE2026}
}
```
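
## Appendix: Scoring Sketch

The Competition table gives the metric as PSNR plus 20 times SSIM, computed on the intensity channel after shaving a 4px border. The sketch below illustrates that combination under stated assumptions; it is not the official evaluation script. The helper names (`shave_border`, `psnr`, `challenge_score`) and the 8-bit `data_range` of 255 are hypothetical choices, and SSIM is taken as an input that would come from a separate implementation (for example scikit-image's `structural_similarity`).

```python
import numpy as np


def shave_border(img: np.ndarray, border: int = 4) -> np.ndarray:
    """Drop `border` pixels from every side of an HxW image."""
    if border == 0:
        return img
    return img[border:-border, border:-border]


def psnr(sr: np.ndarray, hr: np.ndarray, data_range: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB for two same-shaped images.

    data_range=255 assumes 8-bit intensities (an assumption, not from the card).
    """
    diff = sr.astype(np.float64) - hr.astype(np.float64)
    mse = float(np.mean(diff ** 2))
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def challenge_score(psnr_db: float, ssim: float) -> float:
    """Combine the two metrics as Score = PSNR + 20 * SSIM."""
    return psnr_db + 20.0 * ssim


# Illustrative values only, not results from this model card.
hr = np.full((32, 32), 100, dtype=np.uint8)
sr = hr + 1  # every pixel is off by one intensity level
psnr_db = psnr(shave_border(sr), shave_border(hr))
print(challenge_score(psnr_db, ssim=0.9))
```

Because SSIM is weighted by 20, a gain of 0.01 in SSIM moves the score as much as a gain of 0.2 dB in PSNR.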
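
## Appendix: TTA Sketch

Key Findings credits test-time augmentation (TTA) with a small gain, and the submission command exposes a `--tta` flag, but the card does not say which augmentations are used. The sketch below shows one common form, the x8 geometric self-ensemble (4 rotations, each with and without a horizontal flip), as an assumption about what such a flag might do. The function name `tta_x8` is hypothetical, `model` stands for any callable that maps a 2-D array to an upscaled 2-D array, and NumPy is used instead of PyTorch to keep the example self-contained.

```python
import numpy as np


def tta_x8(model, lr: np.ndarray) -> np.ndarray:
    """Average model outputs over the 8 flip/rotation variants of `lr`.

    Each output is mapped back to the original orientation before averaging.
    """
    outputs = []
    for flip in (False, True):
        base = np.fliplr(lr) if flip else lr
        for k in range(4):
            aug = np.rot90(base, k)       # rotate by k * 90 degrees
            out = model(aug)
            out = np.rot90(out, -k)       # undo the rotation
            if flip:
                out = np.fliplr(out)      # undo the flip
            outputs.append(out)
    return np.mean(outputs, axis=0)


def nearest_x4(x: np.ndarray) -> np.ndarray:
    """Stand-in for a real network: 4x nearest-neighbour upscaling."""
    return np.kron(x, np.ones((4, 4)))


lr = np.arange(15, dtype=np.float64).reshape(3, 5)
sr = tta_x8(nearest_x4, lr)
print(sr.shape)
```

The stand-in upscaler is exactly equivariant to flips and rotations, so averaging changes nothing here. A trained network is only approximately equivariant, which is why averaging its eight outputs can help, at the cost of eight forward passes per image.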