PG-MAP NeurIPS 2026 — v1.0 custom-pipeline release


	---
	language:
	- en
	license: mit
	library_name: diffusers
	tags:
	- text-to-image
	- stable-diffusion-3
	- flow-matching
	- inference-time-alignment
	- preference-optimization
	- pg-map
	- ug-fm
	- neurips-2026
	pipeline_tag: text-to-image
	---

	# PG-MAP / UG-FM for Stable Diffusion 3.5-medium

	Custom diffusers pipeline for UG-FM, the flow-matching reduction of PG-MAP on SD3.5-medium. It defaults to the paper's headline configuration (data-side gate, $K_{UG}=4$, $\eta_z=0.1$, full backprop through the velocity prediction), which delivers win rates of 91.9% (PickScore) and 75.7% (HPS) against the static rectified-flow baseline on PartiPrompts ($n=1632$, seed 123).

	NeurIPS 2026. See [github.com/sophialanlan/PG-MAP](https://github.com/sophialanlan/PG-MAP) for the paper, full configs, and reproduction scripts.

	## Install

	```bash
	pip install pg-map
	# or
	pip install git+https://github.com/sophialanlan/PG-MAP
	```

	You also need to accept the Stability AI Community License for the SD3.5 weights on [huggingface.co/stabilityai/stable-diffusion-3.5-medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) before the first load.

	## Usage

	```python
	from diffusers import DiffusionPipeline
	from pgmap import FrozenRewardModel
	import torch

	pipe = DiffusionPipeline.from_pretrained(
	    "stabilityai/stable-diffusion-3.5-medium",
	    custom_pipeline="sophialan/pg-map-sd3",
	    torch_dtype=torch.float16,
	).to("cuda")

	reward = FrozenRewardModel("pickscore", device="cuda")

	# UG-FM (default): 91.9% PickScore configuration
	image = pipe(
	    "a phoenix rising from ashes, vivid orange and red feathers",
	    reward_model=reward,
	).images[0]
	```

	For the full PG-MAP-FM objective (joint optimisation over $c$ and $z_t$, combining flow-consistency, Gaussian-prior, and reward terms), pass `pg_map_config` with `optimize_c=True`:

	```python
	from pgmap import sdxl_defaults
	from dataclasses import replace

	cfg = sdxl_defaults()  # starting point
	cfg = replace(cfg, optimize_c=True, optimize_z=True)
	image = pipe("a phoenix rising from ashes", pg_map_config=cfg).images[0]
	```
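The `replace` call above suggests the config is a dataclass, in which case it returns a modified copy and leaves the defaults untouched. The sketch below illustrates that behaviour with a hypothetical stand-in class; `ToyConfig`, its default values, and the field names `k_ug` and `eta_z` are assumptions for illustration, not the real `pgmap` config.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyConfig:
    """Hypothetical stand-in for the pgmap config object (not the real class)."""

    optimize_c: bool = False  # flag named in the text above; default is assumed
    optimize_z: bool = True   # flag named in the text above; default is assumed
    k_ug: int = 4             # hypothetical field name for K_UG
    eta_z: float = 0.1        # hypothetical field name for eta_z


base = ToyConfig()

# replace() builds a new instance with the given fields overridden.
full = replace(base, optimize_c=True, optimize_z=True)

print("base.optimize_c:", base.optimize_c)
print("full.optimize_c:", full.optimize_c)
```

Because the copy is independent, the same starting config can be reused for several variants without one run leaking settings into the next.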

	## Why UG-FM is the right default for flow matching

	Per paper §3.2, on SD3.5 the optimal active set collapses to $\{z_t\}$ alone at data-side steps for two transport-specific reasons:

	1. Conditioning capacity. SD3.5's concatenated CLIP-L / CLIP-G / T5-XXL representation has ~1.4 M optimisable parameters, so a unit-normalised c-gradient is spread too thinly to move any single direction.
	2. Local Euler amplification. A noise-side perturbation traverses ~25 factors of $I + \Delta t_j\,\partial_z v_\theta$ and grows 5–50×, while a data-side perturbation has only 1–3 remaining factors and stays bounded (sub-pixel mean RMSE $0.61/255$).
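Both reasons can be illustrated with back-of-the-envelope arithmetic. The sketch below is a toy model, not the paper's measurement: it assumes the unit-norm c-gradient is spread evenly over all parameters, and it replaces $\partial_z v_\theta$ with a single constant eigenvalue. The step size `DT` and eigenvalue `LAM` are hypothetical values chosen only for illustration.

```python
import math


def unit_gradient_rms(num_params):
    """Per-coordinate RMS of a unit-norm vector spread evenly over num_params."""
    return 1.0 / math.sqrt(num_params)


def euler_amplification(num_remaining_steps, dt, jac_eigenvalue):
    """Growth of a perturbation along one eigen-direction.

    Multiplies the scalar analogue of (I + dt * dv/dz) once per remaining step.
    """
    growth = 1.0
    for _ in range(num_remaining_steps):
        growth *= 1.0 + dt * jac_eigenvalue
    return growth


# Reason 1: ~1.4 M conditioning parameters (figure from the text above).
per_coord = unit_gradient_rms(1_400_000)

# Reason 2: hypothetical uniform step size and Jacobian eigenvalue.
DT = 1.0 / 28
LAM = 3.0
noise_side = euler_amplification(25, DT, LAM)  # ~25 remaining factors
data_side = euler_amplification(2, DT, LAM)    # 1-3 remaining factors

print(f"per-coordinate c-gradient RMS: {per_coord:.2e}")
print(f"noise-side growth: {noise_side:.1f}x, data-side growth: {data_side:.2f}x")
```

With these toy values the noise-side perturbation grows by roughly an order of magnitude, while the data-side perturbation stays close to its original size.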

	## Paper headline (SD3.5-medium, PartiPrompts $n=1632$, seed 123)

	| Method | PickScore | HPS | Aesthetic | CLIP |
	|---|---|---|---|---|
	| Static baseline | 50.0% | 50.0% | 50.0% | 50.0% |
	| FlowChef (always-on, K=1) | 82.4% | 68.1% | 49.7% | 53.9% |
	| FlowChef (gating-matched) | 75.0% | 62.5% | 46.9% | 52.9% |
	| UG-FM (Ours) | 91.9% | 75.7% | 51.7% | 54.2% |

	All entries are win rates against the same-seed static baseline. The 16.9 pp PickScore gap between UG-FM (91.9%) and gating-matched FlowChef (75.0%) isolates the effect of full backpropagation through $v_\theta$: FlowChef's gradient skipping (`with torch.no_grad(): v = v_theta(...)`) discards the Jacobian factor $I - (1-t)\,\partial_z v_\theta$, which is load-bearing.
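To see where that factor enters, suppose the reward $R$ is evaluated on a one-step data estimate $\hat z(z_t) = z_t - (1-t)\,v_\theta(z_t, t, c)$. This form, and the symbols $\hat z$ and $R$, are assumptions chosen to be consistent with the Jacobian factor quoted above; they are not notation taken from the paper. Under that assumption the chain rule gives:

$$
\nabla_{z_t} R(\hat z) = \bigl(I - (1-t)\,\partial_z v_\theta\bigr)^{\top}\,\nabla_{\hat z} R(\hat z) \quad \text{(full backprop)}
$$

$$
\nabla_{z_t} R(\hat z) \approx \nabla_{\hat z} R(\hat z) \quad \text{(gradient skipping: } \partial_z v_\theta \text{ treated as } 0\text{)}
$$

The two update directions coincide only where $(1-t)\,\partial_z v_\theta$ is negligible, so skipping the gradient changes the direction of the update and not just its scale.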

	## Citation

	```bibtex
	@inproceedings{sun2026pgmap,
	title={{PG-MAP}: Joint {MAP} Optimization for Inference-Time Alignment of Diffusion and Flow-Matching Models},
	author={Sun, Ruolan and Polak, Pawel},
	booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
	year={2026}
	}
	```

	## License

	MIT (see [LICENSE](https://github.com/sophialanlan/PG-MAP/blob/main/LICENSE)). SD3.5 weights are under the Stability AI Community License.