---
license: mit
datasets:
- Phips/BHI
pipeline_tag: image-to-image
tags:
- SISR
- single-image-super-resolution
- super-resolution
- sota
- fourier-transform
- restoration
- sota-model
- figsr
---
# Fourier Inception Gated Super Resolution

The main idea of the model is to integrate the [FourierUnit](https://github.com/deng-ai-lab/SFHformer/blob/1f7994112b9ced9153edc7187e320e0383a9dfd3/models/SFHformer.py#L143) into the [GatedCNN](https://github.com/yuweihao/MambaOut/blob/main/models/mambaout.py#L119) pipeline in order to strengthen the model’s global perception with minimal computational overhead.

The FourierUnit processes features in the frequency domain, which expands the effective receptive field, while the GatedCNN provides efficient local modeling and controls information flow through a gating mechanism. Together they combine global context with computational efficiency in a compact single-image super-resolution (SISR) architecture.
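
To illustrate why frequency-domain processing widens the receptive field, here is a minimal NumPy sketch. It is not the model's `FourierUnit`: the function name `spectral_mix` and the random `gain` array are assumptions standing in for learned layers. It only shows that a pointwise operation on the spectrum lets a single input pixel influence almost the whole output.

```python
import numpy as np

def spectral_mix(image, gain):
    """Scale the frequency bins of a 2-D array, then transform back."""
    spectrum = np.fft.rfft2(image)            # complex, shape (H, W // 2 + 1)
    return np.fft.irfft2(spectrum * gain, s=image.shape)

rng = np.random.default_rng(0)
h, w = 16, 16
impulse = np.zeros((h, w))
impulse[h // 2, w // 2] = 1.0                 # a single bright pixel

# Hypothetical per-frequency gains standing in for learned weights.
gain = rng.uniform(0.5, 1.5, size=(h, w // 2 + 1))

out = spectral_mix(impulse, gain)
touched = np.count_nonzero(np.abs(out) > 1e-9)
print(f"pixels influenced by one input pixel: {touched} of {h * w}")
```

A 3×3 convolution would spread the same impulse over at most 9 pixels, so matching this reach with local convolutions alone would take many stacked layers.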

---
# TODO:
+ [ ] Fix TensorRT inference
---
## Showcase:
[show pics](https://slow.pics/s/fPvcS3P0?image-fit=contain) 

[gdrive](https://drive.google.com/drive/u/1/folders/1ofJo5CCgrOtLdVm9psmlJv15Z3aP4Aiz)

---
## Model structure:

### figsr

<img src="figs/FIDSR.png" width="600"/>

### GDB FU

<img src="figs/gdb_and_FU.png" width="600"/>

---

### Main blocks and their changes relative to the originals:

* [GatedCNN](https://github.com/yuweihao/MambaOut/blob/main/models/mambaout.py#L119), borrowed from the [MambaOut](https://github.com/yuweihao/MambaOut/blob/main/models/mambaout.py#L119) repository with the following changes:

  * `Linear` replaced with `Conv` to avoid unnecessary `permute` operations;
  * one of the linear layers replaced with a `Conv 3×3`, which improves quality without a significant increase in computational cost;
  * `LayerNorm` replaced with `RMSNorm` for speed and greater stability;
  * `DConv` replaced with `InceptionConv`.

* [InceptionConv](https://huggingface.co/enhancr-dev/figsr/blob/main/figsr_arch.py#L627), a modified version of the block from [InceptionNeXt](https://github.com/sail-sg/inceptionnext/blob/main/models/inceptionnext.py#L19):

  * `DConv` replaced with standard convolutions;
  * kernel sizes increased following the findings of [PLKSR](https://github.com/dslisleedh/PLKSR);
  * the shortcut replaced with `FourierUnit`, which improves convergence; the plain shortcut is not needed because a residual connection is already present inside `GatedCNN`.

* [FourierUnit](https://huggingface.co/enhancr-dev/figsr/blob/main/figsr_arch.py#L585), a modified version of the block from [SFHformer](https://github.com/deng-ai-lab/SFHformer/blob/1f7994112b9ced9153edc7187e320e0383a9dfd3/models/SFHformer.py#L143):

  * `BatchNorm` replaced with `RMSNorm`, which works better with the small batch sizes typical for SISR;
  * structural changes made for correct export to ONNX;
  * post-normalization added, since without it training was unstable and produced `NaN` values when the unit was used inside `GatedCNN`.
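
The point about small batch sizes can be made concrete with a short NumPy sketch of RMS normalization. It assumes normalization over the channel axis of an `(N, C, H, W)` array; the helper `rms_norm` is illustrative and is not taken from `figsr_arch.py`. Because no statistic is computed across the batch, a sample is normalized the same way whether it is processed alone or together with others.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    """RMS-normalize an (N, C, H, W) array over the channel axis (assumed layout)."""
    rms = np.sqrt(np.mean(x * x, axis=1, keepdims=True) + eps)
    return x / rms * weight.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 8, 5, 5))
weight = np.ones(8)                      # per-channel scale, learned in practice

full = rms_norm(batch, weight)           # batch of 4
single = rms_norm(batch[:1], weight)     # the first sample on its own

# The first sample gets the same result regardless of batch size.
print(np.allclose(full[:1], single))
```

Batch normalization, by contrast, estimates its mean and variance across the batch during training, so those estimates get noisier as the batch shrinks.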

---

## Metrics:
* Metrics were computed with [PyIQA](https://github.com/chaofengc/IQA-PyTorch/tree/main), except for the columns prefixed with `bs_`, which were computed with BasicSR.
### [ESRGAN DF2K](https://drive.google.com/file/d/1mSJ6Z40weL-dnPvi390xDd3uZBCFMeqr/view?usp=sharing):
| Dataset       | SSIM-Y | PSNR-Y | TOPIQ  | bs_ssim_y | bs_psnr_y |
| ------------- | ------ | ------ | ------ | --------- | --------- |
| BHI100        | 0.7150 | 22.84  | 0.5694 | 0.7279    | 24.1636   |
| psisrd_val125 | 0.7881 | 27.01  | 0.6043 | 0.8034    | 28.3273   |
| set14         | 0.7730 | 27.67  | 0.6905 | 0.7915    | 28.9969   |
| urban100      | 0.8025 | 25.71  | 0.6701 | 0.8152    | 27.0282   |
### [FIGSR BHI](https://huggingface.co/enhancr-dev/figsr/blob/main/weight/v1.0.0):
| Dataset       | SSIM-Y | PSNR-Y | TOPIQ  | bs_ssim_y | bs_psnr_y |
| ------------- | ------ | ------ | ------ | --------- | --------- |
| BHI100        | 0.7196 | 22.83  | 0.5723 | 0.7327    | 24.1549   |
| psisrd_val125 | 0.7911 | 26.97  | 0.6095 | 0.8065    | 28.2946   |
| set14         | 0.7769 | 27.70  | 0.7036 | 0.7952    | 29.0221   |
| urban100      | 0.8056 | 25.80  | 0.6725 | 0.8185    | 27.1170   |
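
For readers unfamiliar with the `-Y` suffix, the sketch below shows one common way to compute PSNR on the luma channel, using the BT.601 conversion. It is an illustration only: the exact settings behind the tables (border cropping, rounding, data range) are not stated here, so the function is not expected to reproduce either the PyIQA or the BasicSR numbers. The names `rgb_to_y` and `psnr_y` are assumptions.

```python
import numpy as np

def rgb_to_y(img):
    """BT.601 luma (range 16..235) from an RGB float image in [0, 1], shape (H, W, 3)."""
    return img @ np.array([65.481, 128.553, 24.966]) + 16.0

def psnr_y(sr, hr):
    """PSNR in dB between the luma channels of two RGB images in [0, 1]."""
    mse = np.mean((rgb_to_y(sr) - rgb_to_y(hr)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(255.0 ** 2 / mse)

rng = np.random.default_rng(0)
hr = rng.uniform(size=(32, 32, 3))                            # stand-in ground truth
sr = np.clip(hr + rng.normal(scale=0.02, size=hr.shape), 0, 1)  # stand-in model output
print(f"PSNR-Y: {psnr_y(sr, hr):.2f} dB")
```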

---

## Performance (RTX 3060 12 GB):
| Model  | input_size | params ↓ | avg_inference ↓ | fps ↑ | memory_use ↓ |
|--------| ---------- | -------- |-----------------| ----- | ------------ |
| ESRGAN | 1024x1024  | ~16.6M   | ~2.8 s          | 0.348 | 8.29 GB      |
| FIGSR  | 1024x1024  | ~4.4M    | ~1.64 s         | 0.608 | 2.26 GB      |
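
The relative gains implied by the table can be worked out directly. The snippet below is plain arithmetic on the table's values, with fps rounded to three decimals; the dictionary keys are illustrative.

```python
# Values copied from the performance table (fps rounded to three decimals).
esrgan = {"params_m": 16.6, "fps": 0.348, "memory_gb": 8.29}
figsr = {"params_m": 4.4, "fps": 0.608, "memory_gb": 2.26}

speedup = figsr["fps"] / esrgan["fps"]
param_ratio = esrgan["params_m"] / figsr["params_m"]
mem_ratio = esrgan["memory_gb"] / figsr["memory_gb"]

print(f"throughput: {speedup:.2f}x higher")
print(f"parameters: {param_ratio:.2f}x fewer")
print(f"memory:     {mem_ratio:.2f}x lower")
print(f"seconds per image: ESRGAN {1 / esrgan['fps']:.2f}, FIGSR {1 / figsr['fps']:.2f}")
```

On this hardware and input size, FIGSR reaches roughly 1.75x the throughput of ESRGAN with about 3.8x fewer parameters and about 3.7x less memory.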

## Training

To train, choose one of the frameworks and place the model file in the `archs` folder:

* **[NeoSR](https://github.com/neosr-project/neosr)**: `figsr_arch.py` → `neosr/archs/figsr_arch.py`. [Config](configs/neosr.toml)

  * Uncomment lines [14–17](https://huggingface.co/enhancr-dev/figsr/blob/main/figsr_arch.py#L14-L17), [694](https://huggingface.co/enhancr-dev/figsr/blob/main/figsr_arch.py#L694) and [705](https://huggingface.co/enhancr-dev/figsr/blob/main/figsr_arch.py#L705).
  * Comment out line [703](https://huggingface.co/enhancr-dev/figsr/blob/main/figsr_arch.py#L703).

* **[traiNNer-redux](https://github.com/the-database/traiNNer-redux)**: `figsr_arch.py` → `traiNNer/archs/figsr_arch.py`. [Config](configs/trainner-redux.yml)

  * Uncomment lines [11](https://huggingface.co/enhancr-dev/figsr/blob/main/figsr_arch.py#L11) and [694](https://huggingface.co/enhancr-dev/figsr/blob/main/figsr_arch.py#L694).

* **[BasicSR](https://github.com/XPixelGroup/BasicSR/tree/master/basicsr/archs)**: `figsr_arch.py` → `basicsr/archs/figsr_arch.py`. [Config](configs/basicsr.yml)

  * Uncomment lines [19](https://huggingface.co/enhancr-dev/figsr/blob/main/figsr_arch.py#L19) and [694](https://huggingface.co/enhancr-dev/figsr/blob/main/figsr_arch.py#L694).

---

## Inference:
### Resselt install
```shell
uv venv --python=3.12
source .venv/bin/activate
uv pip install "resselt==1.3.1" "pepeline==1.2.3"
```
### main.py
```shell
python main.py --input_dir urban/x4 --output_dir urban/x4_scale --weights 4x_FIGSR.safetensors
```
---
## Contacts:
[discord](https://discord.gg/xwZfWWMwBq)