---
license: mit
tags:
- image-super-resolution
- infrared
- remote-sensing
- ntire2026
- pytorch
datasets:
- custom
metrics:
- psnr
- ssim
pipeline_tag: image-to-image
---
# NTIRE2026 Infrared Super-Resolution Models
Pre-trained models for the **NTIRE2026 Infrared Image Super-Resolution (x4) Challenge**.
## Competition
| Item | Detail |
|------|--------|
| **Task** | Single-image super-resolution (x4) for infrared remote sensing |
| **Metric** | `Score = PSNR + 20 × SSIM` (intensity channel, 4 px border shaved) |
| **Dataset** | 919 train, 52 val, 222 test infrared images |
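The metric row above can be sketched in a few lines of NumPy. This is only an illustration of the formula, not the official evaluation script: the helper names `shave`, `psnr`, and `challenge_score` are hypothetical, an 8-bit value range (`max_val=255`) is assumed, and SSIM is taken as an already-computed input rather than implemented here.

```python
import numpy as np


def shave(img, border=4):
    """Drop `border` pixels from every side of an (H, W) array."""
    return img[border:-border, border:-border]


def psnr(sr, hr, max_val=255.0):
    """PSNR in dB between two same-sized arrays (assumes an 8-bit range)."""
    mse = np.mean((sr.astype(np.float64) - hr.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)


def challenge_score(psnr_db, ssim):
    """Combine the two metrics as Score = PSNR + 20 * SSIM."""
    return psnr_db + 20.0 * ssim
```

Under this weighting, a 0.01 change in SSIM moves the score as much as a 0.2 dB change in PSNR.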
## Available Models
| Model | File | Score | PSNR | SSIM | Params | Architecture |
|-------|------|-------|------|------|--------|-------------|
| RFRSR v10 (split) | `rfrsr_v10_split_iter46k.pth` | **51.57** | 33.88 | 0.8822 | 2.05M | Recurrent Feature Refinement |
| RFRSR v2 | `rfrsr_v2_iter250.pth` | 27.59 | 15.89 | 0.5806 | 2.05M | Recurrent Feature Refinement |
| MambaOutRS v12 | `mambaoutrs_v12_iter500.pth` | 27.52 | 16.03 | 0.5811 | 2.96M | Gated CNN + Fourier Filter Gate |
| MambaOutRS v10 | `mambaoutrs_v10_iter500.pth` | 25.64 | 14.81 | 0.5372 | 2.96M | Gated CNN + Fourier Filter Gate |
| HSRMamba | `hsrmamba_iter20k.pth` | 25.56 | 14.74 | 0.5365 | 2.4M | Context-SSM + Spectral Reordering |
## Usage
```python
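import torch

# Load the checkpoint on CPU (example with the best RFRSR model)
checkpoint = torch.load("rfrsr_v10_split_iter46k.pth", map_location="cpu")
# BasicSR checkpoints usually nest the weights under "params_ema" or "params";
# fall back to the raw dict if neither key is present.
state_dict = checkpoint.get("params_ema", checkpoint.get("params", checkpoint))

# For the full inference pipeline with TTA and post-processing, see
# https://github.com/danghoangnhan/NTIRE2026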
```
### Full Inference with Submission Creator
```bash
git clone https://github.com/danghoangnhan/NTIRE2026.git
cd NTIRE2026
pip install -e .
python src/create_submission.py \
--input-dir data/test_LR_X4/X4 \
--output-dir output/ \
--weights-path rfrsr_v10_split_iter46k.pth \
--arch-config src/options/train/train_rfr_sr_x4_v10_split.yml \
--tta
```
## Architecture Details
### RFRSR (Best)
- 3-iteration recurrent feature refinement loop
- 6 residual blocks, 48 channels per stage
- PixelShuffle 4x upsampling
- Only 2.05M parameters
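
The bullets above can be read as the following PyTorch sketch. It is an assumption-laden illustration, not the repository's implementation: the class names `ResidualBlock` and `RecurrentRefinementSR`, the choice to share one block stack across iterations, the skip connection back to the shallow features, and the single-channel input are all hypothetical, and the parameter count of this sketch will not match 2.05M.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Plain conv-ReLU-conv block with an identity skip (hypothetical)."""

    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)


class RecurrentRefinementSR(nn.Module):
    """Sketch: the same 6-block stack is applied 3 times, then PixelShuffle x4."""

    def __init__(self, channels=48, num_blocks=6, num_iters=3, scale=4):
        super().__init__()
        self.num_iters = num_iters
        self.head = nn.Conv2d(1, channels, 3, padding=1)  # grayscale input
        self.refine = nn.Sequential(
            *[ResidualBlock(channels) for _ in range(num_blocks)]
        )
        self.tail = nn.Sequential(
            nn.Conv2d(channels, scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),  # (scale^2, H, W) -> (1, scale*H, scale*W)
        )

    def forward(self, x):
        shallow = self.head(x)
        feat = shallow
        # Recurrent refinement: reuse the same weights on each pass.
        for _ in range(self.num_iters):
            feat = self.refine(feat) + shallow
        return self.tail(feat)
```

Sharing the block stack across iterations is one way a recurrent design can add effective depth without adding parameters.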
### MambaOutRS
- Gated CNN with Fourier Filter Gate (no SSM)
- 4-stage [6,6,6,6] block design, 48 embedding dims
- 2.96M parameters
### HSRMamba
- Context Selective State Space Model
- Global Spectral Reordering for token reorganization
- The only model here that stays stable in long training (still improving past 20k iterations)
- 2.4M parameters
## Key Findings
1. **Smaller models win**: the 2.05M-parameter RFRSR outperforms the 9.5M-parameter MiM-ISTD
2. **SSIM in loss is critical**: Prevents catastrophic training collapse
3. **Most models peak early** (250-1000 iterations), then degrade
4. **TTA adds +0.1-0.3 dB** at test time with no retraining
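
For the TTA finding, a common form of test-time augmentation in super-resolution is the x8 geometric self-ensemble: run the model on every rotation and flip of the input, undo each transform on the output, and average. The sketch below assumes that form; the card does not say which TTA variant the repository uses. `self_ensemble` and `model_fn` are hypothetical names, and `model_fn` stands for any callable that maps an (H, W) array to an upscaled array.

```python
import numpy as np


def self_ensemble(model_fn, lr):
    """Average model outputs over the 8 rotation/flip variants of `lr`."""
    outputs = []
    for k in range(4):                # rotations by 0, 90, 180, 270 degrees
        for flip in (False, True):    # without and with a horizontal flip
            aug = np.rot90(lr, k)
            if flip:
                aug = np.fliplr(aug)
            sr = model_fn(aug)
            # Undo the augmentation in reverse order on the output.
            if flip:
                sr = np.fliplr(sr)
            sr = np.rot90(sr, -k)
            outputs.append(sr)
    return np.mean(outputs, axis=0)
```

This costs eight forward passes per image.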
## Training
All models were trained with:
- **Loss**: Charbonnier + SSIM + Gradient + FFT + SWT (IRSRCombinedLoss)
- **Optimizer**: Adam
- **Framework**: BasicSR
- **Input**: Grayscale infrared images (`force_gray: true`)
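
To make the loss line above more concrete, here is a partial sketch of three of the five terms (Charbonnier, gradient, FFT) in PyTorch. It is not the repository's `IRSRCombinedLoss`: the function names, the exact form of each term, and the weights `w_grad` and `w_fft` are assumptions chosen for illustration, and the SSIM and SWT terms are left out.

```python
import torch


def charbonnier(pred, target, eps=1e-3):
    """Smooth L1-like penalty: mean of sqrt(diff^2 + eps^2)."""
    return torch.sqrt((pred - target) ** 2 + eps ** 2).mean()


def gradient_l1(pred, target):
    """L1 distance between horizontal and vertical finite differences."""
    def dx(t):
        return t[..., :, 1:] - t[..., :, :-1]

    def dy(t):
        return t[..., 1:, :] - t[..., :-1, :]

    return ((dx(pred) - dx(target)).abs().mean()
            + (dy(pred) - dy(target)).abs().mean())


def fft_l1(pred, target):
    """Mean magnitude of the difference between the two 2D spectra."""
    diff = (torch.fft.rfft2(pred, norm="ortho")
            - torch.fft.rfft2(target, norm="ortho"))
    return diff.abs().mean()


def partial_combined_loss(pred, target, w_grad=0.1, w_fft=0.05):
    """Weighted sum of three terms; the weights are hypothetical."""
    return (charbonnier(pred, target)
            + w_grad * gradient_l1(pred, target)
            + w_fft * fft_l1(pred, target))
```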
## License
MIT
## Citation
```bibtex
@misc{ntire2026_infraredsr,
title={NTIRE2026 Infrared Super-Resolution: A Study of Architectures and Training Strategies},
author={Daniel Ho},
year={2026},
url={https://github.com/danghoangnhan/NTIRE2026}
}
```