
GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution

Qiaosi Yi1,2 | Shuai Li1 | Rongyuan Wu1,2 | Lingchen Sun1,2 | Zhengqiang Zhang1,2 | Lei Zhang1,2

1The Hong Kong Polytechnic University, 2OPPO Research Institute


⏰ Update

  • 2026: The paper is released on arXiv.
  • 2026.3.12: The training code and testing code are released.
  • 2026.3.10: The repo is released.

⚙ Dependencies and Installation

```shell
# git clone this repository
git clone https://github.com/Joyies/GDPO.git
cd GDPO

# create an environment
conda create -n GDPO python=3.10
conda activate GDPO
pip install --upgrade pip
pip install -r requirements.txt
```

πŸ‚ Quick Inference

Step 1: Download the pretrained models

  • Download the model weights and put them in the ckp/ directory.

Step 2: Prepare testing data and run testing command

You can modify input_path and output_path before running the testing command: input_path is the path to the test images, and output_path is the directory where the output images are saved.

```shell
CUDA_VISIBLE_DEVICES=0 python GDPOSR/inferences/test.py \
--input_path test_LR \
--output_path experiment/GDPOSR \
--pretrained_path ckp/GDPOSR \
--pretrained_model_name_or_path stable-diffusion-2-1-base \
--ram_ft_path ckp/DAPE.pth \
--negprompt 'dotted, noise, blur, lowres, smooth' \
--prompt 'clean, high-resolution, 8k' \
--upscale 1 \
--time_step=100 \
--time_step_noise=250
```

or

```shell
bash scripts/test/test.sh
```
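Since GDPO-SR builds on Stable Diffusion, the --prompt and --negprompt flags above follow the usual positive/negative prompt pattern. As an illustration only (this is not the repository's actual code, and the function name and values are hypothetical), classifier-free guidance combines the two prompt branches roughly like this:

```python
import numpy as np

def cfg_combine(eps_neg, eps_pos, guidance_scale):
    # Push the noise prediction away from the negative-prompt branch
    # and toward the positive-prompt branch by the guidance scale.
    return eps_neg + guidance_scale * (eps_pos - eps_neg)

eps_neg = np.array([0.0, 1.0])  # prediction under the negative prompt
eps_pos = np.array([1.0, 1.0])  # prediction under the positive prompt
combined = cfg_combine(eps_neg, eps_pos, 7.5)
print(combined)
```

A larger guidance scale pushes the output further from the artifacts named in the negative prompt ("dotted, noise, blur, lowres, smooth") and toward the positive description.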

🚄 Training Phase

Step 1: Prepare training data

Download the OpenImage dataset and the LSDIR dataset. For each image in the LSDIR dataset, crop multiple 512×512 image patches using a sliding window with a stride of 64 pixels.
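The sliding-window cropping described above can be sketched as follows; the function name and the use of NumPy are my own for illustration, not taken from the repository:

```python
import numpy as np

def crop_patches(img, patch=512, stride=64):
    """Crop every patch x patch window from an H x W x C image array,
    sliding with the given stride along both axes."""
    h, w = img.shape[:2]
    patches = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            patches.append(img[top:top + patch, left:left + patch])
    return patches

# Example: a 640x640 image gives (640 - 512) // 64 + 1 = 3 window
# positions per axis, i.e. a 3x3 grid of 512x512 patches.
demo = np.zeros((640, 640, 3), dtype=np.uint8)
patches = crop_patches(demo)
print(len(patches))
```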

Step 2: Train NAOSD

```shell
bash scripts/train/train_NAOSD.sh
```

The hyperparameters in train_NAOSD.sh can be modified to suit different experimental settings. After training NAOSD, you can use GDPOSR/mergelora.py to merge the LoRA weights into the UNet and VAE, producing the base model for the subsequent reinforcement learning training and inference.
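Conceptually, merging a LoRA adapter folds the low-rank update into the base weight so the merged layer alone reproduces the adapted layer's output. The sketch below uses the standard LoRA merge formula, W' = W + (alpha / rank) * B @ A, with toy NumPy matrices; it is an assumption about what mergelora.py does, not its actual code:

```python
import numpy as np

def merge_lora(weight, lora_A, lora_B, alpha, rank):
    """Fold a LoRA update into a base weight matrix:
    W' = W + (alpha / rank) * B @ A."""
    return weight + (alpha / rank) * (lora_B @ lora_A)

# Toy check: the merged weight alone matches base-plus-adapter outputs.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 6, 2, 4.0
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in))      # down-projection (rank x d_in)
B = rng.standard_normal((d_out, r))     # up-projection (d_out x rank)
x = rng.standard_normal(d_in)

merged = merge_lora(W, A, B, alpha, r)
adapted_out = W @ x + (alpha / r) * (B @ (A @ x))
print(np.allclose(merged @ x, adapted_out))
```

After the merge, the LoRA matrices can be discarded, which is why the merged model can then be used directly for inference.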

Step 3: Train GDPO-SR

```shell
bash scripts/train/train_GDPOSR.sh
```

The hyperparameters in train_GDPOSR.sh can be modified to suit different experimental settings. After training GDPO-SR, you can use GDPOSR/mergelora.py to merge the LoRA weights into the UNet for subsequent inference.

🔗 Citations

```bibtex
@inproceedings{yi2026gdpo,
  title={GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution},
  author={Yi, Qiaosi and Li, Shuai and Wu, Rongyuan and Sun, Lingchen and Zhang, Zhengqiang and Zhang, Lei},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2026}
}
```

©️ License

This project is released under the Apache 2.0 license.

📧 Contact

If you have any questions, please contact: qiaosiyijoyies@gmail.com
