---
license: mit
tags:
  - image-super-resolution
  - infrared
  - remote-sensing
  - ntire2026
  - pytorch
datasets:
  - custom
metrics:
  - psnr
  - ssim
pipeline_tag: image-to-image
---

# NTIRE2026 Infrared Super-Resolution Models

Pre-trained models for the **NTIRE2026 Infrared Image Super-Resolution (x4) Challenge**.

## Competition

| Item | Detail |
|------|--------|
| **Task** | Single-image super-resolution (x4) for infrared remote sensing |
| **Metric** | `Score = PSNR + 20 x SSIM` (intensity channel, 4px border shave) |
| **Dataset** | 919 train, 52 val, 222 test infrared images |
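
The metric in the table can be made concrete with a short NumPy sketch. This is an illustration, not the official scoring script: it assumes 8-bit images already reduced to a single intensity channel and a data range of 255, and it takes SSIM as a precomputed value. The function names `shave`, `psnr`, and `challenge_score` are hypothetical.

```python
import numpy as np


def shave(img: np.ndarray, border: int = 4) -> np.ndarray:
    """Crop `border` pixels from each side of a 2-D intensity image."""
    return img[border:-border, border:-border] if border > 0 else img


def psnr(sr: np.ndarray, hr: np.ndarray, border: int = 4,
         data_range: float = 255.0) -> float:
    """PSNR in dB between two same-sized intensity images after border shaving."""
    diff = shave(sr, border).astype(np.float64) - shave(hr, border).astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images inside the shaved region
    return float(10.0 * np.log10(data_range ** 2 / mse))


def challenge_score(psnr_db: float, ssim: float) -> float:
    """Combine the two metrics as Score = PSNR + 20 x SSIM."""
    return psnr_db + 20.0 * ssim
```

SSIM on the same shaved crops could come from, for example, `skimage.metrics.structural_similarity`; whether the challenge uses identical SSIM settings is not stated here.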

## Available Models

| Model | File | Score | PSNR | SSIM | Params | Architecture |
|-------|------|-------|------|------|--------|-------------|
| RFRSR v10 (split) | `rfrsr_v10_split_iter46k.pth` | **51.57** | 33.88 | 0.8822 | 2.05M | Recurrent Feature Refinement |
| RFRSR v2 | `rfrsr_v2_iter250.pth` | 27.59 | 15.89 | 0.5806 | 2.05M | Recurrent Feature Refinement |
| MambaOutRS v12 | `mambaoutrs_v12_iter500.pth` | 27.52 | 16.03 | 0.5811 | 2.96M | Gated CNN + Fourier Filter Gate |
| MambaOutRS v10 | `mambaoutrs_v10_iter500.pth` | 25.64 | 14.81 | 0.5372 | 2.96M | Gated CNN + Fourier Filter Gate |
| HSRMamba | `hsrmamba_iter20k.pth` | 25.56 | 14.74 | 0.5365 | 2.4M | Context-SSM + Spectral Reordering |

## Usage

```python
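import torch

# Load the checkpoint on CPU (example with RFRSR).
checkpoint = torch.load("rfrsr_v10_split_iter46k.pth", map_location="cpu")

# BasicSR checkpoints usually nest the weights under "params_ema" or "params";
# fall back to the loaded object itself if neither key is present.
state_dict = checkpoint.get("params_ema", checkpoint.get("params", checkpoint))

# For the full inference pipeline with TTA and post-processing:
# See https://github.com/danghoangnhan/NTIRE2026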
```

### Full Inference with Submission Creator

```bash
git clone https://github.com/danghoangnhan/NTIRE2026.git
cd NTIRE2026
pip install -e .

python src/create_submission.py \
  --input-dir data/test_LR_X4/X4 \
  --output-dir output/ \
  --weights-path rfrsr_v10_split_iter46k.pth \
  --arch-config src/options/train/train_rfr_sr_x4_v10_split.yml \
  --tta
```

## Architecture Details

### RFRSR (Best)
- 3-iteration recurrent feature refinement loop
- 6 residual blocks, 48 channels per stage
- PixelShuffle 4x upsampling
- Only 2.05M parameters
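
A rough PyTorch sketch of how the components listed above could fit together. It is not the repository's implementation: the class names, the weight sharing across iterations, the re-injection of shallow features, the single-channel input, and the one-step `PixelShuffle(4)` head are all assumptions, and the parameter count will not match 2.05M.

```python
import torch
import torch.nn as nn


class ResBlock(nn.Module):
    """Plain residual block: conv -> ReLU -> conv, plus an identity skip."""

    def __init__(self, channels: int) -> None:
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)


class RecurrentRefinementSR(nn.Module):
    """Hypothetical recurrent-refinement SR network (illustration only)."""

    def __init__(self, channels: int = 48, num_blocks: int = 6,
                 num_iters: int = 3, scale: int = 4) -> None:
        super().__init__()
        self.num_iters = num_iters
        self.head = nn.Conv2d(1, channels, 3, padding=1)
        # The same blocks are reused on every iteration (assumed weight sharing).
        self.refine = nn.Sequential(*[ResBlock(channels) for _ in range(num_blocks)])
        self.upsample = nn.Sequential(
            nn.Conv2d(channels, scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),  # (B, scale^2, H, W) -> (B, 1, scale*H, scale*W)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shallow = self.head(x)
        feat = shallow
        for _ in range(self.num_iters):
            # Refine the current features, then re-inject the shallow features.
            feat = self.refine(feat) + shallow
        return self.upsample(feat)
```

With these defaults, a `(1, 1, 32, 32)` input tensor yields a `(1, 1, 128, 128)` output. To run the released weights, use the architecture config referenced in the inference command rather than this sketch.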

### MambaOutRS
- Gated CNN with Fourier Filter Gate (no SSM)
- 4-stage [6,6,6,6] block design, 48 embedding dims
- 2.96M parameters

### HSRMamba
- Context Selective State Space Model
- Global Spectral Reordering for token reorganization
- Uniquely stable (improves past 20k iterations)
- 2.4M parameters

## Key Findings

1. **Smaller models win**: the 2.05M-parameter RFRSR outscores the 9.5M-parameter MiM-ISTD
2. **SSIM in loss is critical**: Prevents catastrophic training collapse
3. **Models typically peak early** (250-1000 iterations), then degrade; HSRMamba is the stated exception
4. **TTA adds +0.1-0.3 dB** for free at test time
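
Finding 4 refers to test-time augmentation. A minimal sketch, assuming TTA here means the common x8 geometric self-ensemble (flips and 90-degree rotations, averaged after undoing each transform); the repository's `--tta` option may work differently. `model_fn` is a hypothetical callable mapping a 2-D low-resolution array to a 2-D super-resolved array.

```python
from typing import Callable

import numpy as np


def self_ensemble(model_fn: Callable[[np.ndarray], np.ndarray],
                  lr: np.ndarray) -> np.ndarray:
    """Average predictions over the 8 flip/rotation variants of a 2-D image."""
    outputs = []
    for flip in (False, True):
        for k in range(4):
            # Forward transform: optional horizontal flip, then k quarter turns.
            aug = np.fliplr(lr) if flip else lr
            aug = np.rot90(aug, k)
            sr = model_fn(aug)
            # Inverse transform in reverse order: undo the rotation, then the flip.
            sr = np.rot90(sr, -k)
            sr = np.fliplr(sr) if flip else sr
            outputs.append(sr)
    return np.mean(outputs, axis=0)
```

The gain is "free" only in terms of training: inference cost grows roughly eightfold because the model runs once per variant.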

## Training

All models were trained with:
- **Loss**: Charbonnier + SSIM + Gradient + FFT + SWT (IRSRCombinedLoss)
- **Optimizer**: Adam
- **Framework**: BasicSR
- **Input**: Grayscale infrared images (`force_gray: true`)
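
To make the loss terms concrete, here is a hedged PyTorch sketch of three of the five components (Charbonnier, gradient, FFT). It is not the repository's `IRSRCombinedLoss`: the SSIM and SWT terms are omitted for brevity, the function names are hypothetical, and the weights `w_grad` and `w_fft` are placeholders rather than the values used in training.

```python
import torch


def charbonnier(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """Smooth L1-like penalty: mean of sqrt(diff^2 + eps^2)."""
    return torch.sqrt((pred - target) ** 2 + eps ** 2).mean()


def _dx(t: torch.Tensor) -> torch.Tensor:
    """Horizontal finite difference along the last (width) axis."""
    return t[..., :, 1:] - t[..., :, :-1]


def _dy(t: torch.Tensor) -> torch.Tensor:
    """Vertical finite difference along the second-to-last (height) axis."""
    return t[..., 1:, :] - t[..., :-1, :]


def gradient_l1(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """L1 distance between the image gradients of prediction and target."""
    return ((_dx(pred) - _dx(target)).abs().mean()
            + (_dy(pred) - _dy(target)).abs().mean())


def fft_l1(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Mean magnitude of the difference between the two 2-D Fourier spectra."""
    return (torch.fft.rfft2(pred) - torch.fft.rfft2(target)).abs().mean()


def combined_loss(pred: torch.Tensor, target: torch.Tensor,
                  w_grad: float = 0.1, w_fft: float = 0.05) -> torch.Tensor:
    """Weighted sum of the three terms; the weights are placeholder values."""
    return (charbonnier(pred, target)
            + w_grad * gradient_l1(pred, target)
            + w_fft * fft_l1(pred, target))
```

For identical inputs the gradient and FFT terms vanish and the Charbonnier term reduces to `eps`, which is a quick sanity check when wiring such a loss into a training loop.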

## License

MIT

## Citation

```bibtex
@misc{ntire2026_infraredsr,
  title={NTIRE2026 Infrared Super-Resolution: A Study of Architectures and Training Strategies},
  author={Daniel Ho},
  year={2026},
  url={https://github.com/danghoangnhan/NTIRE2026}
}
```