---
language:
- en
license: mit
library_name: diffusers
tags:
- text-to-image
- stable-diffusion-3
- flow-matching
- inference-time-alignment
- preference-optimization
- pg-map
- ug-fm
- neurips-2026
pipeline_tag: text-to-image
---
# PG-MAP / UG-FM for Stable Diffusion 3.5-medium
Custom diffusers pipeline for **UG-FM**, the flow-matching reduction of PG-MAP, on SD3.5-medium. It defaults to the paper's headline configuration (data-side gate, $K_{UG}=4$, $\eta_z=0.1$, full backprop through the velocity prediction), which delivers **91.9% PickScore / 75.7% HPS win rates** against the static rectified-flow baseline on PartiPrompts ($n=1632$, seed 123).
NeurIPS 2026. See [github.com/sophialanlan/PG-MAP](https://github.com/sophialanlan/PG-MAP) for the paper, full configs, and reproduction scripts.
## Install
```bash
pip install pg-map
# or
pip install git+https://github.com/sophialanlan/PG-MAP
```
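Before the first load, you also need to accept the Stability AI Community License for the SD3.5 weights on [huggingface.co/stabilityai/stable-diffusion-3.5-medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) and be authenticated with your Hugging Face account on the machine that downloads them (for example via `huggingface-cli login`).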
## Usage
```python
import torch
from diffusers import DiffusionPipeline
from pgmap import FrozenRewardModel

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-medium",
    custom_pipeline="sophialan/pg-map-sd3",
    torch_dtype=torch.float16,
).to("cuda")
reward = FrozenRewardModel("pickscore", device="cuda")

# UG-FM (default): 91.9% PickScore configuration
image = pipe(
    "a phoenix rising from ashes, vivid orange and red feathers",
    reward_model=reward,
).images[0]
```
For the full PG-MAP-FM objective (joint optimization over the conditioning $c$ and the latent $z_t$, with flow-consistency, Gaussian-prior, and reward terms), pass a `pg_map_config` with `optimize_c=True`:
```python
from dataclasses import replace

from pgmap import sdxl_defaults

# Reuses `pipe` from the previous snippet.
cfg = sdxl_defaults()  # starting point
cfg = replace(cfg, optimize_c=True, optimize_z=True)  # enable joint c + z_t updates
image = pipe("a phoenix rising from ashes", pg_map_config=cfg).images[0]
```
## Why UG-FM is the right default for flow matching
Per paper §3.2, on SD3.5 the optimal active set collapses to $\{z_t\}$ alone at data-side steps for two transport-specific reasons:
1. **Conditioning capacity.** SD3.5's concatenated CLIP-L / CLIP-G / T5-XXL representation has ~1.4 M optimisable parameters, so a unit-normalised c-gradient is spread too thinly to move any single direction.
2. **Local Euler amplification.** A noise-side perturbation traverses ~25 factors of $I + \Delta t_j\,\partial_z v_\theta$ and grows 5–50×, while a data-side perturbation has only 1–3 remaining factors and stays bounded (sub-pixel mean RMSE $0.61/255$).
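The amplification argument in point 2 can be illustrated with a toy scalar model. This is only a sketch under stated assumptions: the Jacobian $\partial_z v_\theta$ is replaced by a hypothetical constant `lam`, the step size is uniform, and `amplification` is an illustrative helper that is not part of `pgmap`. The numbers are chosen for illustration and are not measurements from SD3.5.

```python
def amplification(num_factors: int, dt: float, lam: float) -> float:
    """Growth of a small perturbation after `num_factors` Euler steps.

    Each step multiplies the perturbation by (1 + dt * lam), the scalar
    analogue of the matrix factor I + dt * d_z v_theta.
    """
    growth = 1.0
    for _ in range(num_factors):
        growth *= 1.0 + dt * lam
    return growth


# Hypothetical values: uniform step size and a constant local expansion rate.
dt = 1.0 / 28
lam = 3.0

noise_side = amplification(25, dt, lam)  # perturbation injected early: ~25 factors remain
data_side = amplification(2, dt, lam)    # perturbation injected late: only 2 factors remain

print(f"noise-side growth: {noise_side:.2f}x")
print(f"data-side growth:  {data_side:.2f}x")
```

Under these assumed values the early perturbation compounds through many factors while the late one stays close to its original size, which is the qualitative behaviour the data-side gate relies on.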
## Paper headline (SD3.5-medium, PartiPrompts $n=1632$, seed 123)
| Method | PickScore | HPS | Aesthetic | CLIP |
|---|---|---|---|---|
| Static baseline | 50.0% | 50.0% | 50.0% | 50.0% |
| FlowChef (always-on, K=1) | 82.4% | 68.1% | 49.7% | 53.9% |
| FlowChef (gating-matched) | 75.0% | 62.5% | 46.9% | 52.9% |
| **UG-FM (Ours)** | **91.9%** | **75.7%** | **51.7%** | **54.2%** |
Win rates are measured against the same-seed static baseline. The 16.9 pp PickScore gap between UG-FM and gating-matched FlowChef isolates the effect of **full backprop through $v_\theta$**: FlowChef's gradient skipping (`with torch.no_grad(): v = v_theta(...)`) discards the Jacobian factor $I - (1-t)\,\partial_z v_\theta$, which is load-bearing.
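To make the role of the Jacobian factor concrete, here is a minimal pure-Python sketch. It assumes a data estimate of the form $\hat{x}(z) = z - (1-t)\,v(z)$, chosen only because it reproduces the factor $I - (1-t)\,\partial_z v_\theta$ quoted above; the actual pipeline may define its estimate differently. The linear velocity field `A`, the linear reward weights `w`, and all helper names are hypothetical stand-ins, not `pgmap` or FlowChef code.

```python
def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


A = [[0.8, -0.3], [0.5, 0.4]]  # hypothetical constant Jacobian d_z v
w = [1.0, -2.0]                # gradient of a toy linear reward R(x) = w . x
t = 0.25


def x_hat(z):
    """Assumed data estimate: z - (1 - t) * v(z), with v(z) = A z."""
    v = matvec(A, z)
    return [z_i - (1.0 - t) * v_i for z_i, v_i in zip(z, v)]


def reward(x):
    return sum(w_i * x_i for w_i, x_i in zip(w, x))


# Full backprop: dR/dz = (I - (1 - t) A)^T w
At_w = matvec(transpose(A), w)
full_grad = [w_i - (1.0 - t) * a_i for w_i, a_i in zip(w, At_w)]

# Gradient skipping: v is treated as a constant, so d x_hat / d z = I
skipped_grad = list(w)


def numeric_grad(z, eps=1e-6):
    """Central finite differences of reward(x_hat(z)) with respect to z."""
    grads = []
    for i in range(len(z)):
        z_plus, z_minus = list(z), list(z)
        z_plus[i] += eps
        z_minus[i] -= eps
        diff = reward(x_hat(z_plus)) - reward(x_hat(z_minus))
        grads.append(diff / (2.0 * eps))
    return grads


z0 = [0.3, -0.7]
print("full backprop grad:", full_grad)
print("skipped grad:      ", skipped_grad)
print("finite-diff grad:  ", numeric_grad(z0))
```

In this toy setting the finite-difference gradient agrees with the full-backprop expression and differs from the skipped one in both magnitude and direction, which is the kind of discrepancy the comparison with gating-matched FlowChef is meant to isolate.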
## Citation
```bibtex
@inproceedings{sun2026pgmap,
title={{PG-MAP}: Joint {MAP} Optimization for Inference-Time Alignment of Diffusion and Flow-Matching Models},
author={Sun, Ruolan and Polak, Pawel},
booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
year={2026}
}
```
## License
MIT (see [LICENSE](https://github.com/sophialanlan/PG-MAP/blob/main/LICENSE)). SD3.5 weights are under the Stability AI Community License.