---
license: apache-2.0
language:
- en
tags:
- image-restoration
- all-in-one
- diffusion
- flow-matching
- mllm
- flux
- qwen2.5-vl
- siglip2
- low-level-vision
pipeline_tag: image-to-image
---
<p align="center">
  <img src="https://raw.githubusercontent.com/Programmergg/FAPE-IR/main/figs/logo.png" width="120">
</p>

# FAPEIR_Uniworld: Initial Weights for FAPE-IR

Initial weights for **FAPE-IR: Frequency-Aware Planning and Execution Framework for All-in-One Image Restoration**.

> [Paper (arXiv 2511.14099)](https://arxiv.org/abs/2511.14099)
> [Code](https://github.com/Programmergg/FAPE-IR)
> [Trainset](https://huggingface.co/datasets/David0219/FAPE-IR-Training)
> [Testset](https://huggingface.co/datasets/David0219/FAPE-IR-Testing)

---

## What This Repo Is

This repository releases the **initial weights** required to *start training* FAPE-IR, i.e., all pretrained components consumed by the YAML config

```
scripts/denoiser/flux_qwen2p5vl_7b_vlm_512.yaml
```

in the FAPE-IR codebase. Concretely, it bundles:

* the **UniWorld-V1** initialization (Qwen2.5-VL-7B-Instruct + FLUX.1-dev re-organized weights),
* the **SigLIP-v2** encoder used by the executor,
* a small set of **projection / connector** weights (`mlp2`, `mlp3`, SigLIP→FLUX redux),
* a **VGG** checkpoint used by the LPIPS loss.

> ⚠️ This is **NOT** the post-training checkpoint reported in the paper.

---

## File Layout

After downloading, place the repository contents under `FAPE-IR/weights/` exactly as shown:

```text
weights/
├── flux/                          # FLUX.1-dev backbone (re-organized)
├── siglip/                        # SigLIP-v2 encoder
├── uniworld/                      # UniWorld-V1 (Qwen2.5-VL-7B-Instruct + denoiser projection)
├── denoise_projector_params.bin   # planner-token → denoiser projector (mlp2)
├── flux-redux-siglipv2-512.bin    # SigLIP-v2 → FLUX redux projector
├── vae_projector_only.bin         # VAE high/low-frequency projector (mlp3)
└── vgg.pth                        # VGG weights for LPIPS loss
```

These names match one-to-one with the fields of the YAML config:

```yaml
model_config:
  pretrained_lvlm_name_or_path: weights/uniworld
  pretrained_denoiser_name_or_path: weights/flux
  pretrained_siglip_name_or_path: weights/siglip
  pretrained_mlp2_path: weights/denoise_projector_params.bin
  pretrained_mlp3_path: weights/vae_projector_only.bin
  pretrained_siglip_mlp_path: weights/flux-redux-siglipv2-512.bin
training_config:
  lpips_weights_path: weights/vgg.pth
```

If you change the layout, remember to update the YAML accordingly.
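
Before launching training, it can help to confirm that every path referenced by the config actually exists. The following is a minimal sketch and not part of the FAPE-IR codebase: `EXPECTED` and `missing_weights` are hypothetical names, the entries are copied from the layout above, and the default `root` assumes the script runs from the FAPE-IR project root.

```python
from pathlib import Path

# Entries copied from the YAML fields above; a trailing "/" marks a directory.
EXPECTED = [
    "uniworld/",
    "flux/",
    "siglip/",
    "denoise_projector_params.bin",
    "vae_projector_only.bin",
    "flux-redux-siglipv2-512.bin",
    "vgg.pth",
]


def missing_weights(root="weights"):
    """Return the expected entries that are absent under `root`."""
    root = Path(root)
    missing = []
    for entry in EXPECTED:
        path = root / entry
        # Directories must exist as directories, files as regular files.
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    missing = missing_weights()
    print("all weights present" if not missing else f"missing: {missing}")
```

If you relocate any component, edit `EXPECTED` alongside the YAML so the check stays in sync with the config.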

---

## Download

```bash
# inside the FAPE-IR project root
mkdir -p weights
huggingface-cli download David0219/FAPEIR_Uniworld --local-dir ./weights
```

Or in Python:

```python
from huggingface_hub import snapshot_download

# Run from the FAPE-IR project root so the files land in ./weights.
# `local_dir_use_symlinks` is omitted: it is deprecated in recent
# huggingface_hub releases, which write real files into `local_dir`.
snapshot_download(
    repo_id="David0219/FAPEIR_Uniworld",
    local_dir="./weights",
)
```
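
If only the standalone projector and VGG files are needed, for example to inspect them without pulling the large backbones, `snapshot_download` accepts an `allow_patterns` filter. This is a sketch under the assumption that these four files sit at the repository root, as the file layout suggests; `STANDALONE_FILES`, `preview`, and `download_standalone_files` are hypothetical helper names.

```python
from fnmatch import fnmatch

# Top-level files named in the File Layout section of this card.
STANDALONE_FILES = [
    "denoise_projector_params.bin",
    "flux-redux-siglipv2-512.bin",
    "vae_projector_only.bin",
    "vgg.pth",
]


def preview(paths, patterns=STANDALONE_FILES):
    """Return the paths that at least one shell-style pattern selects."""
    return [p for p in paths if any(fnmatch(p, pat) for pat in patterns)]


def download_standalone_files(local_dir="./weights"):
    """Download only the files matching STANDALONE_FILES into `local_dir`."""
    # Imported lazily so preview() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="David0219/FAPEIR_Uniworld",
        local_dir=local_dir,
        allow_patterns=STANDALONE_FILES,
    )
```

Exact file names are used rather than a wildcard such as `*.bin`, because shell-style wildcards also match across directory separators and could pull in files from the `flux/`, `siglip/`, or `uniworld/` subfolders. Training still requires the full download.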

---

## Intended Use & Limitations

**Intended use.** Research on All-in-One image restoration with an *MLLM-as-planner + diffusion-as-executor* paradigm; reproducing or extending FAPE-IR; ablating individual components (LoRA-MoE routing, frequency regularization, adversarial training).

**Limitations.**

* Training requires substantial GPU memory because the executor is FLUX.1-dev (12B-class) and the planner is Qwen2.5-VL-7B-Instruct.
* These are **initial weights only**; running inference with them directly will *not* reproduce FAPE-IR's reported quality. Train first.
* The base models (FLUX.1-dev, Qwen2.5-VL, SigLIP-v2) keep their original licenses; in particular, FLUX.1-dev is non-commercial. Users must comply with each license individually.
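
To make the memory limitation concrete, here is a back-of-the-envelope estimate of the space taken by the two largest components' weights alone. It is a rough sketch under stated assumptions: the parameter counts are the nominal "12B" and "7B" figures from the model names, and storage in bfloat16 (2 bytes per parameter) is assumed rather than taken from the FAPE-IR config. `APPROX_PARAMS` and `weights_gib` are hypothetical names.

```python
# Nominal parameter counts implied by the model names on this card.
APPROX_PARAMS = {
    "FLUX.1-dev (executor)": 12e9,
    "Qwen2.5-VL-7B-Instruct (planner)": 7e9,
}
BYTES_PER_PARAM_BF16 = 2  # assumption: weights held in bfloat16
GIB = 2**30


def weights_gib(num_params, bytes_per_param=BYTES_PER_PARAM_BF16):
    """Memory needed to hold `num_params` parameters, in GiB."""
    return num_params * bytes_per_param / GIB


total_gib = sum(weights_gib(n) for n in APPROX_PARAMS.values())

for name, n in APPROX_PARAMS.items():
    print(f"{name}: {weights_gib(n):.1f} GiB")
print(f"total (weights only): {total_gib:.1f} GiB")
```

Under these assumptions the weights alone come to roughly 35 GiB. The figure excludes SigLIP-v2, the projectors, activations, gradients, and optimizer state, so actual training memory will be higher.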

---

## Citation

```bibtex
@article{liu2025fape,
  title = {FAPE-IR: Frequency-Aware Planning and Execution Framework for All-in-One Image Restoration},
  author = {Liu, Jingren and Xu, Shuning and Yang, Qirui and Wang, Yun and Chen, Xiangyu and Ji, Zhong},
  journal = {arXiv preprint arXiv:2511.14099},
  year = {2025}
}
```

---

## License & Acknowledgement

The connector and projector weights released here are licensed under Apache-2.0. The bundled **UniWorld-V1**, **FLUX.1-dev**, **Qwen2.5-VL-7B-Instruct**, **SigLIP-v2** and **VGG** weights retain their **original licenses**, which users must respect.

We thank the teams behind [UniWorld](https://github.com/PKU-YuanGroup/UniWorld), [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev), [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL), and [SigLIP-v2](https://huggingface.co/google/siglip2-so400m-patch14-384) for open-sourcing their work.