Upload README.md
README.md
ADDED
@@ -0,0 +1,95 @@
---
license: other
license_name: polyform-noncommercial-1.0.0
license_link: https://polyformproject.org/licenses/noncommercial/1.0.0
pipeline_tag: image-to-image
base_model: Manojb/stable-diffusion-2-1-base
tags:
- super-resolution
- image-super-resolution
- real-world-super-resolution
- rectified-flow
- consistency-models
- diffusion
- stable-diffusion
language:
- en
---

# FlowSR: Fast Image Super-Resolution via Consistency Rectified Flow

This repository hosts the reproduced **model checkpoint** for **FlowSR**, a single-step real-world image super-resolution model based on the ICCV 2025 paper *"Fast Image Super-Resolution via Consistency Rectified Flow."*

FlowSR reformulates super-resolution as a **rectified flow** that bridges low-resolution (LR) and high-resolution (HR) images, and uses **HR-regularized consistency learning** with a **fast–slow time scheduling** strategy to deliver high-quality results in **as few as one inference step**.

![FlowSR framework](https://huggingface.co/chunjie-spring/FlowSR/resolve/main/assets/framework.png)

- 📄 **Paper (ICCV 2025):** [openaccess.thecvf.com](https://openaccess.thecvf.com/content/ICCV2025/html/Xu_Fast_Image_Super-Resolution_via_Consistency_Rectified_Flow_ICCV_2025_paper.html)
- 📚 **arXiv:** [arxiv.org/abs/2605.12377](https://arxiv.org/abs/2605.12377)
- 💻 **Inference code:** [github.com/springXIACJ/FlowSR](https://github.com/springXIACJ/FlowSR) *(unofficial third-party implementation)*

## Files

- **`flowsr.safetensors`**: the model checkpoint. It stores LoRA adapter weights (rank 32) for the UNet on top of a *Stable Diffusion 2.1-base* backbone, together with the FlowSR-specific metadata needed to rebuild the adapters at load time.
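To see what the checkpoint contains before loading it, the header can be read directly. The following is a minimal sketch that relies only on the public safetensors file layout (an 8-byte little-endian header length followed by a JSON header); the function names are illustrative, and the exact metadata keys and tensor names stored in `flowsr.safetensors` are not documented here, so inspect the output rather than assuming them.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensors."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian unsigned integer
        # giving the byte length of the JSON header that follows.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_checkpoint(path):
    """Split the header into free-form metadata and a tensor-name -> shape map."""
    header = read_safetensors_header(path)
    # String metadata, if present, lives under the reserved "__metadata__" key.
    metadata = header.pop("__metadata__", {})
    # Every remaining entry describes one tensor (dtype, shape, data offsets).
    shapes = {name: info["shape"] for name, info in header.items()}
    return metadata, shapes
```

Calling `summarize_checkpoint("checkpoints/flowsr.safetensors")` after downloading the file returns the metadata dictionary and the shape of every stored tensor, which is a quick way to check the adapter rank.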

## How to use

The checkpoint is consumed by the FlowSR inference package. Download the weights into a local `checkpoints/` directory:

```bash
pip install -U huggingface_hub
hf download chunjie-spring/FlowSR flowsr.safetensors --local-dir checkpoints
```

Then run single-image or folder inference (see the [inference repository](https://github.com/springXIACJ/FlowSR) for full setup):

```bash
python -m flowsr.infer \
  --input path/to/lr.png \
  --output outputs \
  --checkpoint checkpoints/flowsr.safetensors
```
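If inference needs to be driven from a Python script, the same command can be wrapped with `subprocess`. This is a small sketch, not part of the FlowSR package: the helper names `build_infer_command` and `run_inference` are made up for illustration, and only the three flags shown above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_infer_command(input_path, output_dir="outputs",
                        checkpoint="checkpoints/flowsr.safetensors"):
    """Assemble the `python -m flowsr.infer` call using only the flags shown above."""
    return [
        sys.executable, "-m", "flowsr.infer",
        "--input", str(input_path),
        "--output", str(output_dir),
        "--checkpoint", str(checkpoint),
    ]


def run_inference(input_path, **kwargs):
    """Run inference, failing early if the checkpoint file is missing."""
    cmd = build_infer_command(input_path, **kwargs)
    checkpoint = Path(cmd[cmd.index("--checkpoint") + 1])
    if not checkpoint.is_file():
        raise FileNotFoundError(f"checkpoint not found: {checkpoint}")
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(cmd, check=True)
```

For example, `run_inference("path/to/lr.png")` mirrors the shell command above and assumes the `flowsr` package is importable in the current environment.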

> **Hardware:** the model targets a CUDA GPU. On a modern GPU, a single image takes roughly **0.14 s** for 4× upscaling to a 512×512 output.

## Model details

- **Backbone:** Stable Diffusion 2.1-base (`Manojb/stable-diffusion-2-1-base`, a re-upload of the original `stabilityai/stable-diffusion-2-1-base` weights, which were removed from the Hub).
- **Scheduler:** `FlowMatchEulerDiscreteScheduler` (rectified flow).
- **Adapters:** PEFT LoRA, rank 32, injected into the UNet.
- **Default inference:** 1 step, scale ×4, `guidance_scale = 1.0`, wavelet color correction.
- **Training data:** LSDIR + the first 10K FFHQ face images, with LR–HR pairs synthesized via the Real-ESRGAN degradation pipeline; image-quality captions generated with Qwen2-VL.
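To illustrate why a rectified flow can be sampled in a single step, here is a toy Euler integrator in NumPy. It is a sketch under stated assumptions, not the FlowSR sampler: the time convention (t = 0 at the LR latent, t = 1 at the HR latent) is assumed, `euler_flow_sample` is a hypothetical helper, and the constant `straight_velocity` stands in for the LoRA-adapted UNet that predicts the velocity in the real model.

```python
import numpy as np


def euler_flow_sample(velocity_fn, x_start, num_steps=1):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with uniform Euler steps."""
    x = np.asarray(x_start, dtype=np.float64)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)
    return x


# Toy check: along a straight LR-to-HR path the velocity is constant,
# so one Euler step already lands on the endpoint.
lr_latent = np.zeros((4, 8, 8))
hr_latent = np.ones((4, 8, 8))


def straight_velocity(x, t):
    # Stand-in for the learned velocity network (assumption for illustration).
    return hr_latent - lr_latent


one_step = euler_flow_sample(straight_velocity, lr_latent, num_steps=1)
```

The straighter the learned path between LR and HR, the smaller the error of a one-step Euler update, which is the property single-step inference depends on.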

## Evaluation

Quantitative results on **RealSR** and **DRealSR** (the StableSR real-world test sets), with FlowSR running in a single step:

| Dataset | Steps | PSNR ↑ | SSIM ↑ | LPIPS ↓ | DISTS ↓ | FID ↓ | NIQE ↓ | MUSIQ ↑ | MANIQA ↑ | CLIPIQA ↑ |
| --- | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| RealSR | 1 | 25.54 | 0.7434 | 0.2728 | 0.2013 | 112.60 | 5.28 | 69.22 | 0.6486 | 0.6701 |
| DRealSR | 1 | 28.50 | 0.7859 | 0.2975 | 0.2115 | 130.30 | 6.13 | 65.46 | 0.6172 | 0.7074 |

Metrics follow common SR conventions (PSNR/SSIM are computed on the Y channel in YCbCr). Evaluation test sets: [`Iceclear/StableSR-TestSets`](https://huggingface.co/datasets/Iceclear/StableSR-TestSets).
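For reference, Y-channel PSNR can be sketched as follows. This assumes the BT.601 luma conversion that SR benchmarks commonly use and 8-bit RGB inputs in [0, 255]; the exact evaluation script behind the table (border cropping, rounding, and so on) is not specified here, so small numerical differences are possible. The function names are illustrative.

```python
import numpy as np


def rgb_to_y(img):
    """Convert an (H, W, 3) RGB image in [0, 255] to BT.601 luma in [16, 235]."""
    img = np.asarray(img, dtype=np.float64)
    coeffs = np.array([65.481, 128.553, 24.966]) / 255.0
    return img @ coeffs + 16.0


def psnr_y(sr, hr, max_val=255.0):
    """PSNR in dB between the Y channels of two RGB images in [0, 255]."""
    mse = np.mean((rgb_to_y(sr) - rgb_to_y(hr)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Both images must have the same shape; `psnr_y(sr, hr)` returns infinity for identical inputs.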

## Limitations

- Trained for **4× real-world super-resolution**; other scales/degradations are out of distribution.
- Requires a GPU; CPU inference is not a supported path.

## License

This checkpoint is released under the **PolyForm Noncommercial License 1.0.0** for non-commercial research use. For commercial use, please contact the authors.

## Citation

If you find FlowSR useful, please cite the paper:

```bibtex
@inproceedings{xu2025fast,
  title={Fast Image Super-Resolution via Consistency Rectified Flow},
  author={Xu, Jiaqi and Li, Wenbo and Sun, Haoze and Li, Fan and Wang, Zhixin and Peng, Long and Ren, Jingjing and Yang, Haoran and Hu, Xiaowei and Pei, Renjing and Heng, Pheng-Ann},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages={11755--11765},
  year={2025}
}
```