---
language:
- en
license: mit
library_name: diffusers
tags:
- text-to-image
- stable-diffusion-3
- flow-matching
- inference-time-alignment
- preference-optimization
- pg-map
- ug-fm
- neurips-2026
pipeline_tag: text-to-image
---

# PG-MAP / UG-FM for Stable Diffusion 3.5-medium

Custom diffusers pipeline for **UG-FM**, the flow-matching reduction of PG-MAP on SD3.5-medium. It defaults to the paper's headline configuration (data-side gate, $K_{UG}=4$, $\eta_z=0.1$, full backprop through the velocity prediction), which achieves win-rates of **91.9% on PickScore and 75.7% on HPS** against the static rectified-flow baseline on PartiPrompts ($n=1632$, seed 123).

NeurIPS 2026. See [github.com/sophialanlan/PG-MAP](https://github.com/sophialanlan/PG-MAP) for the paper, full configs, and reproduction scripts.

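
The win-rates quoted above compare each method with the static baseline prompt by prompt. The following is a minimal sketch of how such a figure can be computed from paired reward scores. The function name `win_rate`, the sample scores, and the convention that a tie counts as half a win are assumptions for illustration (the tie convention is chosen so that the baseline scored against itself gives 50.0%); the reproduction scripts in the repository are the authoritative reference.

```python
def win_rate(method_scores, baseline_scores):
    """Percentage of prompts on which the method beats the baseline.

    Scores are paired prompt by prompt (same prompt, same seed).
    Ties count as half a win, which is an assumed convention.
    """
    if len(method_scores) != len(baseline_scores):
        raise ValueError("score lists must be paired prompt by prompt")
    if not method_scores:
        raise ValueError("need at least one prompt")
    pairs = list(zip(method_scores, baseline_scores))
    wins = sum(m > b for m, b in pairs)
    ties = sum(m == b for m, b in pairs)
    return 100.0 * (wins + 0.5 * ties) / len(pairs)


# Hypothetical reward scores for four prompts (illustrative numbers only).
baseline = [21.3, 20.8, 22.1, 19.9]
aligned = [22.0, 20.8, 22.5, 19.5]

print(win_rate(aligned, baseline))   # two wins, one tie, one loss
print(win_rate(baseline, baseline))  # every prompt is a tie
```
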
## Install

```bash
pip install pg-map
# or
pip install git+https://github.com/sophialanlan/PG-MAP
```

You also need to accept the Stability AI Community License for the SD3.5 weights at [huggingface.co/stabilityai/stable-diffusion-3.5-medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) before the first load.

## Usage

```python
import torch
from diffusers import DiffusionPipeline
from pgmap import FrozenRewardModel

# Load the SD3.5-medium weights and attach the UG-FM custom pipeline.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-medium",
    custom_pipeline="sophialan/pg-map-sd3",
    torch_dtype=torch.float16,
).to("cuda")

# Frozen PickScore reward model used for guidance.
reward = FrozenRewardModel("pickscore", device="cuda")

# UG-FM (default): the 91.9% PickScore configuration
image = pipe(
    "a phoenix rising from ashes, vivid orange and red feathers",
    reward_model=reward,
).images[0]
```

For the full PG-MAP-FM (joint optimization over $c$ and $z_t$ with flow-consistency, Gaussian-prior, and reward terms), pass a `pg_map_config` with `optimize_c=True`:

```python
from dataclasses import replace

from pgmap import sdxl_defaults

# Reuses `pipe` from the Usage snippet.
cfg = sdxl_defaults()  # starting point
cfg = replace(cfg, optimize_c=True, optimize_z=True)
image = pipe("a phoenix rising from ashes", pg_map_config=cfg).images[0]
```

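
Because `dataclasses.replace` is applied to the config, the config object is a dataclass instance, and `replace` returns a modified copy rather than mutating the defaults. The sketch below shows that behaviour on a stand-in class. `ToyConfig` and the fields `k_ug` and `eta_z` are hypothetical names chosen to mirror $K_{UG}$ and $\eta_z$; they are not the library's actual fields.

```python
from dataclasses import dataclass, replace


@dataclass
class ToyConfig:
    """Hypothetical stand-in for the real config object."""
    optimize_c: bool = False
    optimize_z: bool = True
    k_ug: int = 4       # assumed field name, mirrors K_UG = 4
    eta_z: float = 0.1  # assumed field name, mirrors eta_z = 0.1


base = ToyConfig()
# replace() builds a new instance; fields not named keep their values.
full = replace(base, optimize_c=True, optimize_z=True)

print(base.optimize_c, full.optimize_c)
print(full.k_ug, full.eta_z)
```

Keeping the defaults object untouched makes it straightforward to run the default and the joint configuration side by side on the same prompt.
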
## Why UG-FM is the right default for flow matching

Per §3.2 of the paper, on SD3.5 the optimal active set collapses to $\{z_t\}$ alone at data-side steps, for two transport-specific reasons:

1. **Conditioning capacity.** SD3.5's concatenated CLIP-L / CLIP-G / T5-XXL representation has ~1.4 M optimisable parameters, so a unit-normalised $c$-gradient is spread too thinly to move any single direction.
2. **Local Euler amplification.** A noise-side perturbation traverses ~25 factors of $I + \Delta t_j\,\partial_z v_\theta$ and grows 5–50×, while a data-side perturbation has only 1–3 remaining factors and stays bounded (sub-pixel mean RMSE $0.61/255$).

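
The amplification argument in the second point can be illustrated with a scalar toy model. This is a sketch under stated assumptions, not a measurement on SD3.5: the velocity field is linearised along a single direction with a constant positive gain, the Euler steps are uniform, and the schedule has 28 steps in total. The function name `amplification` and the gain values are hypothetical.

```python
def amplification(gain, remaining_steps, total_steps=28):
    """Growth of a perturbation along one direction of a linearised velocity field.

    Each remaining Euler step multiplies the perturbation by (1 + dt * gain),
    the scalar analogue of the factor I + dt_j * d_z v_theta.
    """
    dt = 1.0 / total_steps  # uniform step size (an assumption)
    factor = 1.0
    for _ in range(remaining_steps):
        factor *= 1.0 + dt * gain
    return factor


for gain in (2.0, 4.5):  # illustrative gains, not measured on SD3.5
    noise_side = amplification(gain, remaining_steps=25)
    data_side = amplification(gain, remaining_steps=3)
    print(f"gain={gain}: noise-side x{noise_side:.1f}, data-side x{data_side:.2f}")
```

The same per-step factor compounds over 25 steps on the noise side but only over a handful on the data side, which is why a late perturbation stays small in this toy setting.
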
## Paper headline (SD3.5-medium, PartiPrompts $n=1632$, seed 123)

| Method | PickScore | HPS | Aesthetic | CLIP |
|---|---|---|---|---|
| Static baseline | 50.0% | 50.0% | 50.0% | 50.0% |
| FlowChef (always-on, K=1) | 82.4% | 68.1% | 49.7% | 53.9% |
| FlowChef (gating-matched) | 75.0% | 62.5% | 46.9% | 52.9% |
| **UG-FM (Ours)** | **91.9%** | **75.7%** | **51.7%** | **54.2%** |

Win-rates are measured against the same-seed static baseline. The 16.9 pp PickScore gap between UG-FM and gating-matched FlowChef (91.9% vs. 75.0%) isolates the effect of **full backprop through $v_\theta$**: FlowChef's gradient skipping (`with torch.no_grad(): v = v_theta(...)`) discards the Jacobian factor $I - (1-t)\,\partial_z v_\theta$, which is load-bearing.

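
To make the role of the Jacobian factor concrete, the sketch below compares the two gradients on a two-dimensional toy problem in plain Python. It assumes a one-step clean estimate of the form $\hat{x} = z_t - (1-t)\,v_\theta(z_t)$, which is the form that yields the factor $I - (1-t)\,\partial_z v_\theta$; the linear velocity field `A`, the quadratic reward, and all function names are hypothetical stand-ins, not the pipeline's code.

```python
def velocity(z, A):
    """Toy linear velocity field v(z) = A z, standing in for v_theta."""
    return [A[0][0] * z[0] + A[0][1] * z[1],
            A[1][0] * z[0] + A[1][1] * z[1]]


def predicted_clean(z, A, t):
    """One-step clean estimate x_hat = z - (1 - t) * v(z) (assumed form)."""
    v = velocity(z, A)
    return [z[0] - (1 - t) * v[0], z[1] - (1 - t) * v[1]]


def reward(x, target):
    """Toy reward: negative half squared distance to a target."""
    return -0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def grad_skipped(z, A, t, target):
    """Gradient w.r.t. z with v treated as a constant (Jacobian factor dropped)."""
    x = predicted_clean(z, A, t)
    return [target[0] - x[0], target[1] - x[1]]


def grad_full(z, A, t, target):
    """Gradient w.r.t. z with the factor (I - (1 - t) A)^T applied."""
    g = grad_skipped(z, A, t, target)
    s = 1 - t
    J = [[1 - s * A[0][0], -s * A[0][1]],
         [-s * A[1][0], 1 - s * A[1][1]]]
    # Chain rule: multiply the reward gradient by the transpose of J.
    return [J[0][0] * g[0] + J[1][0] * g[1],
            J[0][1] * g[0] + J[1][1] * g[1]]


def grad_numeric(z, A, t, target, eps=1e-6):
    """Central finite differences of the reward w.r.t. z, used as ground truth."""
    grads = []
    for i in range(2):
        zp, zm = list(z), list(z)
        zp[i] += eps
        zm[i] -= eps
        diff = (reward(predicted_clean(zp, A, t), target)
                - reward(predicted_clean(zm, A, t), target))
        grads.append(diff / (2 * eps))
    return grads


A = [[0.5, 1.0], [0.0, 0.5]]  # illustrative Jacobian of the toy velocity
z, t, target = [1.0, -0.5], 0.8, [0.0, 0.0]

print("skipped:", grad_skipped(z, A, t, target))
print("full:   ", grad_full(z, A, t, target))
print("numeric:", grad_numeric(z, A, t, target))
```

In this toy setting the full gradient agrees with the finite-difference reference, while the skipped gradient points in a different direction, so the discrepancy survives even after unit normalisation.
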
## Citation

```bibtex
@inproceedings{sun2026pgmap,
  title={{PG-MAP}: Joint {MAP} Optimization for Inference-Time Alignment of Diffusion and Flow-Matching Models},
  author={Sun, Ruolan and Polak, Pawel},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2026}
}
```

## License

MIT (see [LICENSE](https://github.com/sophialanlan/PG-MAP/blob/main/LICENSE)). SD3.5 weights are under the Stability AI Community License.