---
license: apache-2.0
language:
- en
tags:
- image-restoration
- all-in-one
- diffusion
- flow-matching
- mllm
- flux
- qwen2.5-vl
- siglip2
- low-level-vision
pipeline_tag: image-to-image
---

<p align="center">
  <img src="https://raw.githubusercontent.com/Programmergg/FAPE-IR/main/figs/logo.png" width="120">
</p>

# FAPEIR_Uniworld — Initial Weights for FAPE-IR

Initial weights for **FAPE-IR: Frequency-Aware Planning and Execution Framework for All-in-One Image Restoration**.

> [📄 Paper (arXiv 2511.14099)](https://arxiv.org/abs/2511.14099) &emsp;
> [💻 Code](https://github.com/Programmergg/FAPE-IR) &emsp;
> [🏋️ Trainset](https://huggingface.co/datasets/David0219/FAPE-IR-Training) &emsp;
> [🧪 Testset](https://huggingface.co/datasets/David0219/FAPE-IR-Testing)

---

## 💡 What This Repo Is

This repository releases the **initial weights** required to *start training* FAPE-IR, i.e., all the pretrained components consumed by the YAML config

```text
scripts/denoiser/flux_qwen2p5vl_7b_vlm_512.yaml
```

in the FAPE-IR codebase. Concretely, it bundles:

* the **UniWorld-V1** initialization (Qwen2.5-VL-7B-Instruct + FLUX.1-dev re-organized weights),
* the **SigLIP-v2** encoder used by the executor,
* a small set of **projection / connector** weights (`mlp2`, `mlp3`, SigLIP→FLUX redux),
* a **VGG** checkpoint used by the LPIPS loss.

> ⚠️ This is **NOT** the post-training checkpoint reported in the paper.

---

## 📂 File Layout

After downloading, place the repository contents under `FAPE-IR/weights/` exactly as shown below:

```text
weights/
├── flux/                                  # FLUX.1-dev backbone (re-organized)
├── siglip/                                # SigLIP-v2 encoder
├── uniworld/                              # UniWorld-V1 (Qwen2.5-VL-7B-Instruct + denoiser projection)
├── denoise_projector_params.bin           # planner-token → denoiser projector  (mlp2)
├── flux-redux-siglipv2-512.bin            # SigLIP-v2 → FLUX redux projector
├── vae_projector_only.bin                 # VAE high/low-frequency projector    (mlp3)
└── vgg.pth                                # VGG weights for LPIPS loss
```

These paths map one-to-one onto the fields of the YAML config:

```yaml
model_config:
  pretrained_lvlm_name_or_path:     weights/uniworld
  pretrained_denoiser_name_or_path: weights/flux
  pretrained_siglip_name_or_path:   weights/siglip
  pretrained_mlp2_path:             weights/denoise_projector_params.bin
  pretrained_mlp3_path:             weights/vae_projector_only.bin
  pretrained_siglip_mlp_path:       weights/flux-redux-siglipv2-512.bin

training_config:
  lpips_weights_path:              weights/vgg.pth
```

If you change the layout, remember to update the YAML accordingly.
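
Before launching training, it can help to confirm that the download produced the layout above. The following is a minimal sketch using only the standard library; the helper name `missing_weights` and the `EXPECTED` list are illustrative and not part of the FAPE-IR codebase. It only checks that each top-level entry exists, not that its contents are complete.

```python
from pathlib import Path

# Top-level entries copied from the file layout above: (name, is_directory).
EXPECTED = [
    ("flux", True),
    ("siglip", True),
    ("uniworld", True),
    ("denoise_projector_params.bin", False),
    ("flux-redux-siglipv2-512.bin", False),
    ("vae_projector_only.bin", False),
    ("vgg.pth", False),
]


def missing_weights(root="weights"):
    """Return the expected entries that are absent (or of the wrong kind) under `root`."""
    root = Path(root)
    missing = []
    for name, is_dir in EXPECTED:
        path = root / name
        present = path.is_dir() if is_dir else path.is_file()
        if not present:
            missing.append(name)
    return missing
```

Calling `missing_weights("weights")` from the FAPE-IR project root returns an empty list when every entry is in place; otherwise it lists the names to re-download or to re-point in the YAML.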

---

## ⬇️ Download

```bash
# inside the FAPE-IR project root
mkdir -p weights
huggingface-cli download David0219/FAPEIR_Uniworld --local-dir ./weights
```

Or in Python:

```python
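from huggingface_hub import snapshot_download

# Download every file in the repo into ./weights.
# (`local_dir_use_symlinks` is deprecated in recent huggingface_hub releases, so it is omitted.)
snapshot_download(
    repo_id="David0219/FAPEIR_Uniworld",
    local_dir="./weights",
)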
```
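
Training needs the full set of files, but re-fetching a single entry (for example after an interrupted download) does not require pulling everything again. The sketch below uses the `allow_patterns` argument of `snapshot_download`; the helpers `allow_patterns_for` and `download_subset` are hypothetical names introduced here, and the directory names are taken from the file layout in this card.

```python
# Top-level directories in this repo, per the file layout in this card.
DIRECTORIES = {"flux", "siglip", "uniworld"}


def allow_patterns_for(entries):
    """Turn top-level entry names into glob patterns (directories get a trailing /*)."""
    return [f"{name}/*" if name in DIRECTORIES else name for name in entries]


def download_subset(entries, local_dir="./weights"):
    """Download only the given top-level entries into `local_dir`."""
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="David0219/FAPEIR_Uniworld",
        local_dir=local_dir,
        allow_patterns=allow_patterns_for(entries),
    )
```

For instance, `download_subset(["siglip", "vgg.pth"])` would fetch only the SigLIP-v2 folder and the VGG checkpoint, leaving the other entries untouched.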

---

## 📝 Intended Use & Limitations

**Intended use.** Research on All-in-One image restoration with an *MLLM-as-planner + diffusion-as-executor* paradigm; reproducing or extending FAPE-IR; ablating individual components (LoRA-MoE routing, frequency regularization, adversarial training).

**Limitations.**

* Training requires substantial GPU memory because the executor is FLUX.1-dev (12B-class) and the planner is Qwen2.5-VL-7B-Instruct.
* These are **initial weights only** — running inference with them directly will *not* reproduce FAPE-IR's reported quality. Train first.
* The base models (FLUX.1-dev, Qwen2.5-VL, SigLIP-v2) keep their original licenses; in particular, FLUX.1-dev is distributed under a non-commercial license. Users must comply with each license individually.
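
To give a rough sense of the memory point in the first bullet, here is a back-of-the-envelope estimate of the space needed just to hold the two backbones. It assumes 2 bytes per parameter (bf16/fp16) and uses the nominal parameter counts implied by the model names (12B and 7B); the real counts differ somewhat, and activations, gradients, optimizer state, and the smaller components are not included.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Memory needed just to hold the weights, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Nominal parameter counts taken from the model names (assumption).
executor_gb = weight_memory_gb(12e9)  # FLUX.1-dev, 12B-class
planner_gb = weight_memory_gb(7e9)    # Qwen2.5-VL-7B-Instruct
total_gb = executor_gb + planner_gb

print(f"executor {executor_gb:.0f} GB + planner {planner_gb:.0f} GB = {total_gb:.0f} GB")
# prints "executor 24 GB + planner 14 GB = 38 GB"
```

Under these assumptions the weights alone take roughly 38 GB, before any training overhead is counted.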

---

## 🔖 Citation

```bibtex
@article{liu2025fape,
  title   = {FAPE-IR: Frequency-Aware Planning and Execution Framework for All-in-One Image Restoration},
  author  = {Liu, Jingren and Xu, Shuning and Yang, Qirui and Wang, Yun and Chen, Xiangyu and Ji, Zhong},
  journal = {arXiv preprint arXiv:2511.14099},
  year    = {2025}
}
```

---

## 📜 License & Acknowledgement

The connector / projector weights released here are licensed under Apache-2.0. The bundled **UniWorld-V1**, **FLUX.1-dev**, **Qwen2.5-VL-7B-Instruct**, **SigLIP-v2** and **VGG** weights retain their **original licenses**, which users must respect.

We thank the teams behind [UniWorld](https://github.com/PKU-YuanGroup/UniWorld), [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev), [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL), and [SigLIP-v2](https://huggingface.co/google/siglip2-so400m-patch14-384) for open-sourcing their work.