| ## Enhanced Semantic Extraction and Guidance for UGC Image Super Resolution | |
| Yiwen Wang<sup>1</sup> | Ying Liang<sup>1</sup> | Yuxuan Zhang<sup>1</sup> | Xinning Chai<sup>1</sup> | Zhengxue Cheng<sup>1</sup> | Yingsheng Qin<sup>2</sup> | |
| | Yucai Yang<sup>2</sup> | Rong Xie<sup>1</sup> | Li Song<sup>1</sup> | |
| <sup>1</sup>Shanghai Jiao Tong University, China, <sup>2</sup>Transsion, China | |
| [paper address](https://huggingface.co/papers/2504.09887) | |
| All codes are released on [Github](https://github.com/Moonsofang/NTIRE-2025-SRlab) | |
| #### 🚩Accepted by CVPR2024 | |
| ## ⚙️ Dependencies and Installation | |
| ``` | |
| ## git clone this repository | |
| git clone https://huggingface.co/NGain/Medialab | |
| cd Medialab | |
| # create an environment with python >= 3.8 | |
| conda create -n medialab python=3.8 | |
| conda activate medialab | |
| pip install -r requirements.txt | |
| # or you can directly install the environment by following instruct | |
| conda env create -f medialab.yml | |
| conda activate medialab | |
| ``` | |
| ## 🚀 Quick Inference | |
| #### Step 1: Download the pretrained models | |
| - Download the pretrained SD-2-base models from [HuggingFace](https://huggingface.co/stabilityai/stable-diffusion-2-base) | |
| - Download the checkpoint, sam2.1_hiera_tiny, ram_swin_large and DAPE models from [GoogleDrive](https://drive.google.com/drive/folders/1Ce0D8R99t-fDQfACLc8SGvf3gzdMnTwT?usp=sharing). | |
| - or you can directly download these files in the repository. | |
| You can put the models into `preset/models`. | |
| #### Step 2: Prepare testing data | |
| You can put the testing images in the `preset/datasets/test_datasets`. | |
| #### Step 3: Running testing command | |
| ``` | |
| # for wild dataset | |
| python ./test_seesr_sam.py \ | |
| --pretrained_model_path ./preset/models/stable-diffusion-2-base \ | |
| --prompt '' \ | |
| --seesr_model_path ./preset/models/checkpoint-90000 \ | |
| --ram_ft_path ./preset/models/DAPE.pth \ | |
| --image_path ./preset/datasets/test_datasets/wild \ | |
| --output_dir your_output_dir_path/wild \ | |
| --start_point noise \ | |
| --num_inference_steps 50 \ | |
| --guidance_scale 14 \ | |
| --added_prompt "clean, high-resolution, 8k, ultra-detailed, ultra-realistic" \ | |
| --upscale 1 \ | |
| --process_size 512 | |
| # for synthetic dataset | |
| python ./test_seesr_sam.py \ | |
| --pretrained_model_path ./preset/models/stable-diffusion-2-base \ | |
| --prompt '' \ | |
| --seesr_model_path ./preset/models/checkpoint-90000 \ | |
| --ram_ft_path ./preset/models/DAPE.pth \ | |
| --image_path ./preset/datasets/test_datasets/synthetic \ | |
| --output_dir your_output_dir_path/synthetic \ | |
| --start_point noise \ | |
| --num_inference_steps 50 \ | |
| --guidance_scale 0.9 \ | |
| --upscale 4 \ | |
| --process_size 512 | |
| ``` | |
| More details are [here](asserts/hyp.md) | |
| ## 🌈 Train | |
| Will release soon. | |
| ## ❤️ Acknowledgments | |
| This project is based on [diffusers](https://github.com/huggingface/diffusers) and [SeeSR](https://github.com/cswry/SeeSR). Some codes are brought from [PASD](https://github.com/yangxy/PASD), [RAM](https://github.com/xinyu1205/recognize-anything) and [SAM2](https://github.com/facebookresearch/sam2)). Thanks for their awesome works. We also pay tribute to the pioneering work of [StableSR](https://github.com/IceClear/StableSR). | |
| ## 📧 Contact | |
| If you have any questions, please feel free to contact: `forest726@sjtu.edu.cn` | |
| ## 🎫 License | |
| This project and related weights are released under the [Apache 2.0 license](LICENSE). | |
| <details> | |
| <summary>statistics</summary> | |
|  | |
| </details> | |