---
license: apache-2.0
datasets:
  - eugenesiow/Div2k
language:
  - en
tags:
  - RyzenAI
  - Super Resolution
  - SISR
  - SESR
  - ONNX
---

# πŸš€ SESR-S on AMD AI PC NPU

Bhardwaj et al. (2022) introduced the Super-Efficient Super Resolution (SESR) model to solve a classic computer vision problem: taking a low-resolution input image and producing a high-resolution output. SESR is based on a "linear overparameterization of CNNs and creates an efficient model architecture for [Single Image Super Resolution (SISR)]." The official code is available at the accompanying GitHub repository: https://github.com/ARM-software/sesr. One of the main design goals of the model was computational efficiency.

This version of the model is SESR-S (Small). It has been converted from PyTorch to ONNX and quantized to INT8 to run on an AMD AI PC NPU with Ryzen AI software. The model natively accepts a 256Γ—256 RGB image and outputs a 512Γ—512 RGB image; alternate versions of the model could accept 1920Γ—1080 input and upscale to 3840Γ—2160 (4K) or 7680Γ—4320 (8K).
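The 256Γ—256 RGB input must be laid out as an NCHW float32 tensor before it reaches the ONNX session (mirroring the preprocessing in the inference script further below). A minimal sketch; the helper name is illustrative, and the ONNX graph's actual input name should be checked with `onnxruntime`'s `get_inputs()` before a real run:

```python
import numpy as np

# Sketch: convert an HxWx3 uint8 image (as cv2.imread returns) into the
# 1x3x256x256 float32 NCHW tensor layout this model consumes.
def to_nchw(image: np.ndarray) -> np.ndarray:
    assert image.shape == (256, 256, 3), "model natively expects 256x256 RGB"
    return image[np.newaxis, :, :, :].transpose((0, 3, 1, 2)).astype(np.float32)

dummy = np.zeros((256, 256, 3), dtype=np.uint8)
print(to_nchw(dummy).shape)  # (1, 3, 256, 256)
```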

| Model Details | Description |
|---|---|
| Person or organization developing model | Tong Shen (AMD), Benjamin Consolvo (AMD) |
| Model date | January 9, 2026 |
| Model version | 1 |
| Model type | Super-Resolution (Image-to-Image) |
| Information about training algorithms, parameters, fairness constraints or other applied approaches, and features | The Γ—2 SESR was trained for "300 epochs using ADAM optimizer with a constant learning rate of 5Γ—10⁻⁴ and a batch size of 32 on DIV2K training set." The Γ—4 SESR model starts with the pretrained Γ—2 SESR model and replaces "the final layer of 5Γ—5Γ—fΓ—4 with a 5Γ—5Γ—fΓ—16 and then perform[s] the depth-to-space operation twice" (Bhardwaj et al., 2022). For more training details, refer to the paper. |
| Paper or other resource for more information | Bhardwaj, K., Milosavljevic, M., O'Neil, L., Gope, D., Matas, R., Chalfin, A., ... & Loh, D. (2022). Collapsible linear blocks for super-efficient super resolution. *Proceedings of Machine Learning and Systems*, 4, 529-547. |
| License | Apache 2.0 |
| Where to send questions or comments about the model | Community Tab and AMD Developer Community Discord |
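The depth-to-space operation used for the Γ—4 variant (often called pixel shuffle) rearranges channels into spatial blocks; applying it twice turns 16 extra channels into a Γ—4 upscale. A minimal numpy sketch, with the function name and block size handling illustrative rather than taken from the SESR codebase:

```python
import numpy as np

# Sketch of depth-to-space (pixel shuffle): groups of block*block channels
# are rearranged into block x block spatial neighbourhoods. Applying it
# twice with block=2 yields a x4 spatial upscale, as in the x4 SESR model.
def depth_to_space(x: np.ndarray, block: int = 2) -> np.ndarray:
    n, c, h, w = x.shape
    assert c % (block * block) == 0
    x = x.reshape(n, c // (block * block), block, block, h, w)
    x = x.transpose(0, 1, 4, 2, 5, 3)  # interleave blocks into spatial dims
    return x.reshape(n, c // (block * block), h * block, w * block)

x = np.zeros((1, 16, 2, 2), dtype=np.float32)
y = depth_to_space(depth_to_space(x))  # applied twice: 2x2 -> 8x8
print(y.shape)  # (1, 1, 8, 8)
```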

## ⚑ Intended Use

| Intended Use | Description |
|---|---|
| Primary intended uses | The model can be used to create high-resolution images from low-resolution images. The model has been converted to ONNX format and quantized for optimized performance on AMD AI PC NPUs. |
| Primary intended users | Anyone using or evaluating super-resolution models on AMD AI PCs. |
| Out-of-scope uses | This model is not intended for generating misinformation or disinformation, impersonating others, facilitating or inciting harassment or violence, or any use that could lead to the violation of human rights. |

## How to Use

πŸ“ Hardware Prerequisites

Before getting started, make sure you meet the minimum hardware and OS requirements:

| Series | Codename | Abbreviation | Launch Year | Windows 11 | Linux |
|---|---|---|---|---|---|
| Ryzen AI Max PRO 300 Series | Strix Halo | STX | 2025 | β˜‘οΈ | |
| Ryzen AI PRO 300 Series | Strix Point / Krackan Point | STX/KRK | 2025 | β˜‘οΈ | |
| Ryzen AI Max 300 Series | Strix Halo | STX | 2025 | β˜‘οΈ | |
| Ryzen AI 300 Series | Strix Point | STX | 2025 | β˜‘οΈ | |
| Ryzen Pro 200 Series | Hawk Point | HPT | 2025 | β˜‘οΈ | |
| Ryzen 200 Series | Hawk Point | HPT | 2025 | β˜‘οΈ | |
| Ryzen PRO 8000 Series | Hawk Point | HPT | 2024 | β˜‘οΈ | |
| Ryzen 8000 Series | Hawk Point | HPT | 2024 | β˜‘οΈ | |
| Ryzen Pro 7000 Series | Phoenix | PHX | 2023 | β˜‘οΈ | |
| Ryzen 7000 Series | Phoenix | PHX | 2023 | β˜‘οΈ | |

### Getting Started

1. Follow the Ryzen AI SW Installation Instructions to download the necessary NPU drivers and Ryzen AI software. Please allow around 30 minutes to install all of the necessary components. The tested working version as of writing is Ryzen AI 1.7.0.

2. Activate the conda environment previously installed by Ryzen AI (RAI) SW, and set the RAI environment variable to your installation path:

   ```shell
   conda activate ryzen-ai-1.7.0
   $Env:RYZEN_AI_INSTALLATION_PATH = 'C:/Program Files/RyzenAI/1.7.0/'
   ```

3. Clone the Hugging Face model repository:

   ```shell
   git clone https://huggingface.co/amd/sesr
   ```

4. Install the prerequisites:

   ```shell
   pip install -r requirements.txt
   ```

## Quantitative Analyses

| Regime | Model | Parameters | MACs | Set5 | Set14 | BSD100 | Urban100 | Manga109 | DIV2K |
|---|---|---|---|---|---|---|---|---|---|
| Small | Bicubic | - | - | 33.68/0.9307 | 30.24/0.8693 | 29.56/0.8439 | 26.88/0.8408 | 30.82/0.9349 | 32.45/0.9043 |
| | FSRCNN (our setup) | 12.46K | 6.00G | 36.85/0.9561 | 32.47/0.9076 | 31.37/0.8891 | 29.43/0.8963 | 35.81/0.9689 | 34.73/0.9349 |
| | FSRCNN (Dong et al., 2016) | 12.46K | 6.00G | 36.98/0.9556 | 32.62/0.9087 | 31.50/0.8904 | 29.85/0.9009 | 36.62/0.9710 | 34.74/0.9340 |
| | MOREMNAS-C (Chu et al., 2020) | 25K | 5.5G | 37.06/0.9561 | 32.75/0.9094 | 31.50/0.8904 | 29.92/0.9023 | -/- | -/- |
| | SESR-M3 (f=16, m=3) | 8.91K | 2.05G | 37.21/0.9577 | 32.70/0.9100 | 31.56/0.8920 | 29.92/0.9034 | 36.47/0.9717 | 35.03/0.9373 |
| | SESR-M5 (f=16, m=5) | 13.52K | 3.11G | 37.39/0.9585 | 32.84/0.9115 | 31.70/0.8938 | 30.33/0.9087 | 37.07/0.9734 | 35.24/0.9389 |
| | SESR-M7 (f=16, m=7) | 18.12K | 4.17G | 37.47/0.9588 | 32.91/0.9118 | 31.77/0.8946 | 30.49/0.9105 | 37.14/0.9738 | 35.32/0.9395 |
| Medium | TPSR-NoGAN (Lee et al., 2020) | 60K | 14.0G | 37.38/0.9583 | 33.00/0.9123 | 31.75/0.8942 | 30.61/0.9119 | -/- | -/- |
| | SESR-M11 (f=16, m=11) | 27.34K | 6.30G | 37.58/0.9593 | 33.03/0.9128 | 31.85/0.8956 | 30.72/0.9136 | 37.40/0.9746 | 35.45/0.9404 |
| Large | VDSR (Kim et al., 2016) | 665K | 612.6G | 37.53/0.9587 | 33.05/0.9127 | 31.90/0.8960 | 30.77/0.9141 | 37.16/0.9740 | 35.43/0.9410 |
| | LapSRN (Lai et al., 2017) | 813K | 29.9G | 37.52/0.9590 | 33.08/0.9130 | 31.80/0.8950 | 30.41/0.9100 | 37.53/0.9740 | 35.31/0.9400 |
| | BTSRN (Fan et al., 2017) | 410K | 207.7G | 37.75/- | 33.20/- | 32.05/- | 31.63/- | -/- | -/- |
| | CARN-M (Ahn et al., 2018) | 412K | 91.2G | 37.53/0.9583 | 33.26/0.9141 | 31.92/0.8960 | 31.23/0.9193 | -/- | -/- |
| | MOREMNAS-B (Chu et al., 2020) | 1118K | 256.9G | 37.58/0.9584 | 33.22/0.9135 | 31.91/0.8959 | 31.14/0.9175 | -/- | -/- |
| | SESR-XL (f=32, m=11) | 105.37K | 24.27G | 37.77/0.9601 | 33.24/0.9145 | 31.99/0.8976 | 31.16/0.9184 | 38.01/0.9759 | 35.67/0.9420 |

Table 1: "PSNR/SSIM results on Γ—2 Super Resolution on several benchmark datasets. MACs are reported as the number of multiply-adds needed to convert an image to 720p (1280Γ—720) resolution via Γ—2 SISR." Highlights indicate the best score within each regime. Table from Bhardwaj et al. (2022).
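The PSNR half of each PSNR/SSIM pair above can be computed with a few lines of numpy. A minimal sketch for 8-bit images; note that published SISR benchmarks typically evaluate PSNR on the Y channel with border cropping, which this simplified version omits:

```python
import numpy as np

# Sketch: peak signal-to-noise ratio for 8-bit images,
# PSNR = 10 * log10(peak^2 / MSE), with peak = 255.
def psnr(ref: np.ndarray, test: np.ndarray, peak: float = 255.0) -> float:
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

a = np.zeros((8, 8), dtype=np.uint8)
b = np.full((8, 8), 10, dtype=np.uint8)  # MSE = 100
print(round(psnr(a, b), 2))  # 28.13
```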

| Regime | Model | Parameters | MACs | Set5 | Set14 | BSD100 | Urban100 | Manga109 | DIV2K |
|---|---|---|---|---|---|---|---|---|---|
| Small | Bicubic | - | - | 28.43/0.8113 | 26.00/0.7025 | 25.96/0.6682 | 23.14/0.6577 | 24.90/0.7855 | 28.10/0.7745 |
| | FSRCNN (our setup) | 12.46K | 4.63G | 30.45/0.8648 | 27.44/0.7528 | 26.89/0.7124 | 24.39/0.7212 | 27.40/0.8539 | 29.37/0.8117 |
| | FSRCNN (Dong et al., 2016) | 12.46K | 4.63G | 30.70/0.8657 | 27.59/0.7535 | 26.96/0.7128 | 24.60/0.7258 | 27.89/0.8590 | 29.36/0.8110 |
| | SESR-M3 (f=16, m=3) | 13.71K | 0.79G | 30.75/0.8714 | 27.62/0.7579 | 27.00/0.7166 | 24.61/0.7304 | 27.90/0.8644 | 29.52/0.8155 |
| | SESR-M5 (f=16, m=5) | 18.32K | 1.05G | 30.99/0.8764 | 27.81/0.7624 | 27.11/0.7199 | 24.80/0.7389 | 28.29/0.8734 | 29.65/0.8189 |
| | SESR-M7 (f=16, m=7) | 22.92K | 1.32G | 31.14/0.8787 | 27.88/0.7641 | 27.13/0.7209 | 24.90/0.7436 | 28.53/0.8778 | 29.72/0.8204 |
| Medium | TPSR-NoGAN (Lee et al., 2020) | 61K | 3.6G | 31.10/0.8779 | 27.95/0.7663 | 27.15/0.7214 | 24.97/0.7456 | -/- | -/- |
| | SESR-M11 (f=16, m=11) | 32.14K | 1.85G | 31.27/0.8810 | 27.94/0.7660 | 27.20/0.7225 | 25.00/0.7466 | 28.73/0.8815 | 29.81/0.8221 |
| Large | VDSR (Kim et al., 2016) | 665K | 612.6G | 31.35/0.8838 | 28.02/0.7678 | 27.29/0.7252 | 25.18/0.7525 | 28.82/0.8860 | 29.82/0.8240 |
| | LapSRN (Lai et al., 2017) | 813K | 149.4G | 31.54/0.8850 | 28.19/0.7720 | 27.32/0.7280 | 25.21/0.7560 | 29.09/0.8900 | 29.88/0.8250 |
| | BTSRN (Fan et al., 2017) | 410K | 165.2G | 31.85/- | 28.20/- | 27.47/- | 25.74/- | -/- | -/- |
| | CARN-M (Ahn et al., 2018) | 412K | 32.5G | 31.92/0.8903 | 28.42/0.7762 | 27.44/0.7304 | 25.62/0.7694 | -/- | -/- |
| | SESR-XL (f=32, m=11) | 114.97K | 6.62G | 31.54/0.8866 | 28.12/0.7712 | 27.31/0.7277 | 25.31/0.7604 | 29.04/0.8901 | 29.94/0.8266 |

Table 2: "PSNR/SSIM results on Γ—4 Super Resolution on several benchmark datasets. MACs are reported as the number of multiply-adds needed to convert an image to 720p (1280Γ—720) resolution via Γ—4 SISR." Highlights indicate the best score within each regime. Table from Bhardwaj et al. (2022).

## Model description

SESR is based on linear overparameterization of CNNs and creates an efficient model architecture for SISR. It was introduced in the paper Collapsible Linear Blocks for Super-Efficient Super Resolution. The official code for this work is available at https://github.com/ARM-software/sesr.

We developed a modified version that is supported by AMD Ryzen AI.

## Intended uses & limitations

You can use the raw model for super resolution. See the model hub for all available models.

## How to use

### Installation

Follow the Ryzen AI Installation guide to prepare the environment for Ryzen AI. Then run the following command to install the prerequisites for this model:

```shell
pip install -r requirements.txt
```

### Data Preparation (optional: for accuracy evaluation)

1. Download the [benchmark dataset](https://cv.snu.ac.kr/research/EDSR/benchmark.tar).
2. Organize the dataset directory as follows:

```
└── dataset
     └── benchmark
          β”œβ”€β”€ Set5
          β”‚    β”œβ”€β”€ HR
          β”‚    β”‚    β”œβ”€β”€ baby.png
          β”‚    β”‚    └── ...
          β”‚    └── LR_bicubic
          β”‚         └── X2
          β”‚              β”œβ”€β”€ babyx2.png
          β”‚              └── ...
          β”œβ”€β”€ Set14
          └── ...
```
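The layout above pairs each HR image with its bicubic LR counterpart through a fixed naming scheme. A small sketch of that mapping (the helper name is hypothetical; the path convention follows the tree shown above):

```python
from pathlib import Path

# Sketch: map an HR image in the benchmark layout to its bicubic LR
# counterpart, e.g. Set5/HR/baby.png -> Set5/LR_bicubic/X2/babyx2.png.
def lr_path(hr_path: Path, scale: int = 2) -> Path:
    dataset_dir = hr_path.parent.parent          # .../Set5
    stem, suffix = hr_path.stem, hr_path.suffix  # 'baby', '.png'
    return dataset_dir / "LR_bicubic" / f"X{scale}" / f"{stem}x{scale}{suffix}"

print(lr_path(Path("dataset/benchmark/Set5/HR/baby.png")))
```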

### Test & Evaluation

```python
import argparse

import cv2
import numpy as np
import onnxruntime

# tiling_inference is provided by the scripts in this repository

parser = argparse.ArgumentParser(description='SESR inference')
parser.add_argument('--onnx_path', type=str, default='SESR_int8.onnx',
                    help='path of the ONNX model')
parser.add_argument('--image_path', default='test_data/test.png',
                    help='path of the input image')
parser.add_argument('--output_path', default='test_data/sr.png',
                    help='path of the output image')
parser.add_argument('--ipu', action='store_true',
                    help='use the NPU (IPU)')
parser.add_argument('--provider_config', type=str, default=None,
                    help='provider config path')
args = parser.parse_args()

if args.ipu:
    providers = ["VitisAIExecutionProvider"]
    provider_options = [{"config_file": args.provider_config}]
else:
    providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
    provider_options = None

onnx_file_name = args.onnx_path
image_path = args.image_path
output_path = args.output_path

ort_session = onnxruntime.InferenceSession(onnx_file_name, providers=providers,
                                           provider_options=provider_options)
# HWC uint8 -> NCHW float32
lr = cv2.imread(image_path)[np.newaxis, :, :, :].transpose((0, 3, 1, 2)).astype(np.float32)
sr = tiling_inference(ort_session, lr, 8, (56, 56))
sr = np.clip(sr, 0, 255)
sr = sr.squeeze().transpose((1, 2, 0)).astype(np.uint8)
cv2.imwrite(output_path, sr)
```
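The script delegates the actual upscaling to a `tiling_inference` helper from this repository: the input is split into tiles, each tile is upscaled independently, and the results are stitched back together. A simplified pure-numpy sketch of the idea, using a nearest-neighbour Γ—2 upsampler as a stand-in for the ONNX session and omitting the overlap handling the real helper performs:

```python
import numpy as np

# Stand-in for the ONNX session: nearest-neighbour x2 upsampling of an
# NCHW tensor. The real pipeline would run the SESR model on each tile.
def upscale_x2(tile: np.ndarray) -> np.ndarray:
    return tile.repeat(2, axis=2).repeat(2, axis=3)

# Simplified tiled x2 inference without overlap: split, upscale, stitch.
def tiled_upscale(lr: np.ndarray, tile: int = 56) -> np.ndarray:
    n, c, h, w = lr.shape
    out = np.zeros((n, c, h * 2, w * 2), dtype=lr.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = lr[:, :, y:y + tile, x:x + tile]
            out[:, :, 2 * y:2 * y + 2 * patch.shape[2],
                2 * x:2 * x + 2 * patch.shape[3]] = upscale_x2(patch)
    return out

lr = np.random.rand(1, 3, 112, 112).astype(np.float32)
print(tiled_upscale(lr).shape)  # (1, 3, 224, 224)
```

Since nearest-neighbour upsampling is purely local, the tiled result here matches a whole-image pass exactly; with a learned model, overlapping tiles are needed to avoid seams at tile borders.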
  • Run inference for a single image
python one_image_inference.py --onnx_path SESR_int8.onnx --image_path /Path/To/Your/Image --ipu --provider_config Path/To/vaip_config.json

Note: vaip_config.json is located at the setup package of Ryzen AI (refer to Installation)

  • Test accuracy of the quantized model
python test.py --onnx_path SESR_int8.onnx --data_test Set5 --ipu --provider_config Path/To/vaip_config.json 

## Performance

| Method | Scale | FLOPs | Set5 (PSNR) |
|---|---|---|---|
| SESR-S (float) | Γ—2 | 10.22G | 37.21 |
| SESR-S (INT8) | Γ—2 | 10.22G | 36.81 |

Note: FLOPs are calculated with an input resolution of 256Γ—256.
```bibtex
@misc{bhardwaj2022collapsible,
      title={Collapsible Linear Blocks for Super-Efficient Super Resolution},
      author={Kartikeya Bhardwaj and Milos Milosavljevic and Liam O'Neil and Dibakar Gope and Ramon Matas and Alex Chalfin and Naveen Suda and Lingchuan Meng and Danny Loh},
      year={2022},
      eprint={2103.09404},
      archivePrefix={arXiv},
      primaryClass={eess.IV}
}
```