Fourier Inception Gated Super Resolution
The main idea of the model is to integrate the FourierUnit into the GatedCNN pipeline in order to strengthen the model's global perception with minimal computational overhead.
The FourierUnit adds feature processing in the frequency domain, expanding the effective receptive field, while the GatedCNN provides efficient local modeling and control of information flow through a gating mechanism. Their combination allows merging global context and computational efficiency within a compact SISR architecture.
TODO:
- Fix TensorRT (trt) inference
Showcase:
Model structure:
[Figure: figsr]
[Figures: GDB, FU]
Main blocks and their changes relative to the originals:
GatedCNN: borrowed from the MambaOut repository with the following changes:
- `Linear` replaced with `Conv` to avoid unnecessary `permute` operations;
- one of the linear layers replaced with a `Conv 3×3`, which improves quality without a significant increase in computational cost;
- `LayerNorm` replaced with `RMSNorm` for speed and greater stability;
- `DConv` replaced with `InceptionConv`.
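The changes listed above can be pictured with a minimal PyTorch sketch. It is not the code from `figsr_arch.py`: the class names, the expansion and split ratios, the choice of which projection becomes the 3×3 convolution, and the plain 7×7 convolution standing in for `InceptionConv` are all assumptions made for illustration.

```python
import torch
from torch import nn


class RMSNorm2d(nn.Module):
    """RMS normalization over the channel axis of an NCHW tensor (illustrative helper)."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(1, dim, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        inv_rms = x.pow(2).mean(dim=1, keepdim=True).add(self.eps).rsqrt()
        return x * inv_rms * self.weight


class GatedConvBlock(nn.Module):
    """Gated block that stays in NCHW layout, so no permute is needed."""

    def __init__(self, dim: int, expansion: float = 1.5, conv_ratio: float = 0.5):
        super().__init__()
        hidden = int(dim * expansion)          # assumed expansion ratio
        conv_ch = int(hidden * conv_ratio)     # channels sent through the spatial mixer
        self.split = [hidden, hidden - conv_ch, conv_ch]
        self.norm = RMSNorm2d(dim)
        # Assumption: the input projection is the layer that became a 3x3 conv.
        self.fc1 = nn.Conv2d(dim, hidden * 2, kernel_size=3, padding=1)
        # Placeholder spatial mixer; the real model uses InceptionConv here.
        self.mixer = nn.Conv2d(conv_ch, conv_ch, kernel_size=7, padding=3)
        self.act = nn.GELU()
        self.fc2 = nn.Conv2d(hidden, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shortcut = x
        gate, identity, conv_part = torch.split(self.fc1(self.norm(x)), self.split, dim=1)
        conv_part = self.mixer(conv_part)
        # The activated gate controls how much of each feature passes through.
        x = self.fc2(self.act(gate) * torch.cat((identity, conv_part), dim=1))
        return x + shortcut
```

The block preserves the input shape, so it can be stacked freely: `GatedConvBlock(32)(torch.randn(1, 32, 16, 16))` returns a tensor of the same size.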
InceptionConv: a modified version of the block from InceptionNeXt:
- `DConv` replaced with standard convolutions;
- kernel sizes increased following the findings of PLKSR;
- the shortcut replaced with `FourierUnit`, which improves convergence, since a residual connection is already present inside `GatedCNN`.
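As a rough illustration of the list above, the sketch below splits the channels into four groups, in the spirit of InceptionNeXt: one group goes through a pluggable global branch (where the real model places `FourierUnit`), and the other three go through a square and two band-shaped standard convolutions. The class name, kernel sizes (13 and 17), branch ratio, and the `global_branch` argument are assumptions, not values taken from the repository.

```python
import torch
from torch import nn


class InceptionConvSketch(nn.Module):
    """Channel-split token mixer with standard (non-depthwise) convolutions."""

    def __init__(self, dim, square_k=13, band_k=17, branch_ratio=0.25, global_branch=None):
        super().__init__()
        gc = int(dim * branch_ratio)  # channels per convolutional branch
        self.split = [dim - 3 * gc, gc, gc, gc]
        # Stand-in for the FourierUnit branch; defaults to a plain identity here.
        self.global_branch = nn.Identity() if global_branch is None else global_branch
        self.square = nn.Conv2d(gc, gc, square_k, padding=square_k // 2)
        self.band_w = nn.Conv2d(gc, gc, (1, band_k), padding=(0, band_k // 2))
        self.band_h = nn.Conv2d(gc, gc, (band_k, 1), padding=(band_k // 2, 0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_glob, x_sq, x_w, x_h = torch.split(x, self.split, dim=1)
        return torch.cat(
            (self.global_branch(x_glob), self.square(x_sq), self.band_w(x_w), self.band_h(x_h)),
            dim=1,
        )
```

Because every branch keeps its channel count and spatial size, the concatenated output has the same shape as the input.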
FourierUnit: a modified version of the block from SFHformer:
- `BatchNorm` replaced with `RMSNorm`, which works better with the small batch sizes typical for SISR;
- structural changes made for correct export to ONNX;
- post-normalization added, since without it training instability and `NaN` values were observed in the context of `GatedCNN`.
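To make the frequency-domain idea concrete, here is a minimal sketch of a Fourier unit with `RMSNorm` and post-normalization. It is an assumption-laden illustration rather than the repository's block: the class names and layer layout are hypothetical, and it does not reproduce the ONNX-specific structural changes mentioned above.

```python
import torch
from torch import nn


class RMSNorm2d(nn.Module):
    """RMS normalization over the channel axis of an NCHW tensor (illustrative helper)."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(1, dim, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        inv_rms = x.pow(2).mean(dim=1, keepdim=True).add(self.eps).rsqrt()
        return x * inv_rms * self.weight


class FourierUnitSketch(nn.Module):
    """Mixes features in the frequency domain, then normalizes the result."""

    def __init__(self, dim: int):
        super().__init__()
        # Real and imaginary parts are stacked on the channel axis, hence 2 * dim.
        self.freq_conv = nn.Conv2d(dim * 2, dim * 2, kernel_size=1)
        self.freq_norm = RMSNorm2d(dim * 2)
        self.act = nn.GELU()
        self.post_norm = RMSNorm2d(dim)  # post-normalization after the inverse FFT

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        spec = torch.fft.rfft2(x.float(), norm="ortho")
        feats = torch.cat((spec.real, spec.imag), dim=1)
        # A 1x1 conv on the spectrum touches every spatial position at once.
        feats = self.act(self.freq_norm(self.freq_conv(feats)))
        real, imag = torch.chunk(feats, 2, dim=1)
        out = torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
        return self.post_norm(out)
```

Each frequency coefficient depends on all pixels, so even a 1×1 convolution applied to the spectrum gives every output position access to the whole image, which is where the enlarged effective receptive field comes from.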
Metrics:
- Metrics were computed using PyIQA, except for those prefixed with `bs_`, which were calculated using BasicSR.
ESRGAN DF2K
| Dataset | SSIM-Y | PSNR-Y | TOPIQ | bs_ssim_y | bs_psnr_y |
|---|---|---|---|---|---|
| BHI100 | 0.7150 | 22.84 | 0.5694 | 0.7279 | 24.1636 |
| psisrd_val125 | 0.7881 | 27.01 | 0.6043 | 0.8034 | 28.3273 |
| set14 | 0.7730 | 27.67 | 0.6905 | 0.7915 | 28.9969 |
| urban100 | 0.8025 | 25.71 | 0.6701 | 0.8152 | 27.0282 |
FIGSR BHI
| Dataset | SSIM-Y | PSNR-Y | TOPIQ | bs_ssim_y | bs_psnr_y |
|---|---|---|---|---|---|
| BHI100 | 0.7196 | 22.83 | 0.5723 | 0.7327 | 24.1549 |
| psisrd_val125 | 0.7911 | 26.97 | 0.6095 | 0.8065 | 28.2946 |
| set14 | 0.7769 | 27.70 | 0.7036 | 0.7952 | 29.0221 |
| urban100 | 0.8056 | 25.80 | 0.6725 | 0.8185 | 27.1170 |
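For readers unfamiliar with the "-Y" suffix, the sketch below shows how PSNR on the luma channel is commonly computed, using the BT.601 conversion and an optional border crop. It is a plain-Python illustration, not the PyIQA or BasicSR implementation; the function names and the `crop_border` parameter are assumptions, and the source does not state which settings cause the two toolkits to report different numbers.

```python
import math


def rgb_to_y(r: float, g: float, b: float) -> float:
    """BT.601 luma for 8-bit RGB values, giving Y in the range 16..235."""
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0


def psnr_y(img_a, img_b, crop_border: int = 0) -> float:
    """PSNR on the Y channel for images given as rows of (r, g, b) tuples in 0..255."""
    height, width = len(img_a), len(img_a[0])
    rows = range(crop_border, height - crop_border)
    cols = range(crop_border, width - crop_border)
    sq_err = [
        (rgb_to_y(*img_a[y][x]) - rgb_to_y(*img_b[y][x])) ** 2
        for y in rows
        for x in cols
    ]
    mse = sum(sq_err) / len(sq_err)
    return math.inf if mse == 0 else 10.0 * math.log10(255.0 ** 2 / mse)
```

Settings such as the border crop can shift the result noticeably, so values from different evaluation scripts should only be compared within the same column.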
Performance (RTX 3060 12 GB):
| Model | input_size | params ↓ | avg_inference ↓ | fps ↑ | memory_use ↓ |
|---|---|---|---|---|---|
| ESRGAN | 1024x1024 | ~16.6M | ~2.8 s | 0.348 | 8.29 GB |
| FIGSR | 1024x1024 | ~4.4M | ~1.64 s | 0.608 | 2.26 GB |
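The relative gains implied by the table can be worked out with a few lines of arithmetic. The snippet below uses the table's figures with fps rounded to three decimals; the dictionary layout and variable names are only for illustration.

```python
# Values copied from the performance table (fps rounded to three decimals).
esrgan = {"params_m": 16.6, "fps": 0.348, "memory_gb": 8.29}
figsr = {"params_m": 4.4, "fps": 0.608, "memory_gb": 2.26}

speedup = figsr["fps"] / esrgan["fps"]                   # throughput gain
param_ratio = esrgan["params_m"] / figsr["params_m"]     # how many times fewer parameters
memory_ratio = esrgan["memory_gb"] / figsr["memory_gb"]  # how many times less memory

# Latency is the reciprocal of fps, which should agree with avg_inference.
esrgan_latency_s = 1.0 / esrgan["fps"]
figsr_latency_s = 1.0 / figsr["fps"]

print(f"speedup: {speedup:.2f}x")
print(f"parameters: {param_ratio:.2f}x fewer")
print(f"memory: {memory_ratio:.2f}x less")
```

On these numbers FIGSR runs about 1.75 times faster than ESRGAN at 1024x1024 while using roughly 3.8 times fewer parameters and 3.7 times less memory.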
Training
To train, choose one of the frameworks and place the model file in the archs folder:
- NeoSR: `figsr_arch.py` → `neosr/archs/figsr_arch.py`. Config
- traiNNer-redux: `figsr_arch.py` → `traiNNer/archs/figsr_arch.py`. Config
- BasicSR: `figsr_arch.py` → `basicsr/archs/figsr_arch.py`. Config
Inference:
Install resselt:
uv venv --python=3.12
source .venv/bin/activate
uv pip install "resselt==1.3.1" "pepeline==1.2.3"
Run main.py:
python main.py --input_dir urban/x4 --output_dir urban/x4_scale --weights 4x_FIGSR.safetensors