## Enhanced Semantic Extraction and Guidance for UGC Image Super Resolution

Yiwen Wang<sup>1</sup> | Ying Liang<sup>1</sup> | Yuxuan Zhang<sup>1</sup> | Xinning Chai<sup>1</sup> | Zhengxue Cheng<sup>1</sup> | Yingsheng Qin<sup>2</sup> | Yucai Yang<sup>2</sup> | Rong Xie<sup>1</sup> | Li Song<sup>1</sup>

<sup>1</sup>Shanghai Jiao Tong University, China, <sup>2</sup>Transsion, China

[Paper](https://huggingface.co/papers/2504.09887)

All code is released on [GitHub](https://github.com/Moonsofang/NTIRE-2025-SRlab).

#### 🚩Accepted by CVPR2024

## ⚙️ Dependencies and Installation

```
# git clone this repository
git clone https://huggingface.co/NGain/Medialab
cd Medialab

# create an environment with python >= 3.8
conda create -n medialab python=3.8
conda activate medialab
pip install -r requirements.txt

# alternatively, create the environment directly from medialab.yml
conda env create -f medialab.yml
conda activate medialab
```

## 🚀 Quick Inference

#### Step 1: Download the pretrained models

- Download the pretrained SD-2-base models from [HuggingFace](https://huggingface.co/stabilityai/stable-diffusion-2-base).
- Download the checkpoint, sam2.1_hiera_tiny, ram_swin_large and DAPE models from [GoogleDrive](https://drive.google.com/drive/folders/1Ce0D8R99t-fDQfACLc8SGvf3gzdMnTwT?usp=sharing).
- Alternatively, download these files directly from this repository.

Put the models into `preset/models`.

#### Step 2: Prepare testing data

Put the test images in `preset/datasets/test_datasets`.
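
Before running the test command, it can help to confirm that the models and images are where the script will look for them. The following is a minimal sketch, not part of the repository: the helper names `find_missing` and `list_test_images` are hypothetical, the model entry names are taken from the paths used in the Step 3 command, and the list of image extensions is an assumption.

```python
from pathlib import Path

# Entries under preset/models referenced by the Step 3 command.
# Adjust this list if your checkpoint names differ.
EXPECTED_MODEL_ENTRIES = [
    "stable-diffusion-2-base",
    "checkpoint-90000",
    "DAPE.pth",
]


def find_missing(models_dir, expected=EXPECTED_MODEL_ENTRIES):
    """Return the expected entries that do not exist under models_dir."""
    root = Path(models_dir)
    return [name for name in expected if not (root / name).exists()]


def list_test_images(dataset_dir, exts=(".png", ".jpg", ".jpeg")):
    """Return sorted image files in dataset_dir (extensions are an assumption)."""
    root = Path(dataset_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.suffix.lower() in exts
    )


if __name__ == "__main__":
    missing = find_missing("preset/models")
    if missing:
        print("Missing under preset/models:", ", ".join(missing))
    for subset in ("wild", "synthetic"):
        images = list_test_images(Path("preset/datasets/test_datasets") / subset)
        print(f"{subset}: {len(images)} image(s) found")
```

Run it from the repository root; an empty "missing" list and a non-zero image count suggest the layout matches what the testing command expects.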

#### Step 3: Run the testing command

```
# for wild dataset
python ./test_seesr_sam.py \
--pretrained_model_path ./preset/models/stable-diffusion-2-base \
--prompt '' \
--seesr_model_path ./preset/models/checkpoint-90000 \
--ram_ft_path ./preset/models/DAPE.pth \
--image_path ./preset/datasets/test_datasets/wild \
--output_dir your_output_dir_path/wild \
--start_point noise \
--num_inference_steps 50 \
--guidance_scale 14 \
--added_prompt "clean, high-resolution, 8k, ultra-detailed, ultra-realistic" \
--upscale 1 \
--process_size 512

# for synthetic dataset
python ./test_seesr_sam.py \
--pretrained_model_path ./preset/models/stable-diffusion-2-base \
--prompt '' \
--seesr_model_path ./preset/models/checkpoint-90000 \
--ram_ft_path ./preset/models/DAPE.pth \
--image_path ./preset/datasets/test_datasets/synthetic \
--output_dir your_output_dir_path/synthetic \
--start_point noise \
--num_inference_steps 50 \
--guidance_scale 0.9 \
--upscale 4 \
--process_size 512
```

More details are [here](asserts/hyp.md).

## 🌈 Train

Training code will be released soon.

## ❤️ Acknowledgments

This project is based on [diffusers](https://github.com/huggingface/diffusers) and [SeeSR](https://github.com/cswry/SeeSR). Some code is borrowed from [PASD](https://github.com/yangxy/PASD), [RAM](https://github.com/xinyu1205/recognize-anything) and [SAM2](https://github.com/facebookresearch/sam2). We thank the authors for their awesome work. We also pay tribute to the pioneering work of [StableSR](https://github.com/IceClear/StableSR).

## 📧 Contact

If you have any questions, please feel free to contact: `forest726@sjtu.edu.cn`

## 🎫 License

This project and related weights are released under the [Apache 2.0 license](LICENSE).