---
language: "en"
license: "apache-2.0"
tags:
- text-to-image
- stable-diffusion
- diffusion
- lora
datasets:
- custom
library_name: "diffusers"
pipeline_tag: "text-to-image"
---

# GradSPO: A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models

This repository provides **public LoRA checkpoints trained with GradSPO** for **Stable Diffusion v1.5** and **SDXL**.

**GradSPO** reframes **stepwise preference optimization (SPO)** as learning from **noisy reward signals** and explicitly reduces that noise through **gradient guidance**. The result is **stronger reward signals** and **improved preference alignment**.

All released checkpoints are **LoRA weights only** and must be loaded on top of their corresponding base models.

The official training code is available at:
https://github.com/JoshuaTTJ/GradSPO

---

## Usage

### SDXL (LoRA)

```python
from diffusers import StableDiffusionXLPipeline
import torch

# Load the SDXL base model in half precision.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
)

# Attach the GradSPO SDXL LoRA weights from the local folder.
pipe.load_lora_weights("./sdxl")

pipe = pipe.to("cuda")

prompt = "A cat holding a sign that says hello world"

# Fix the seed so the output is reproducible.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    prompt=prompt,
    guidance_scale=5.0,
    num_inference_steps=20,
    generator=generator,
    output_type="pil",
).images[0]

image.save("img_sdxl.png")
```

---

### Stable Diffusion v1.5 (LoRA)

```python
from diffusers import StableDiffusionPipeline
import torch

# Load the Stable Diffusion v1.5 base model in half precision.
pipe = StableDiffusionPipeline.from_pretrained(
    "sd-legacy/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)

# Attach the GradSPO SD v1.5 LoRA weights from the local folder.
pipe.load_lora_weights("./sd1_5")

pipe = pipe.to("cuda")

prompt = "a photo of a cat"

# Fix the seed so the output is reproducible.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    prompt=prompt,
    guidance_scale=5.0,
    num_inference_steps=20,
    generator=generator,
    output_type="pil",
).images[0]

image.save("img_sd15.png")
```
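
Because each LoRA must be loaded on top of its own base model, it can help to keep that pairing in one place. The following is a minimal sketch of one way to do so, not part of the official code: `CheckpointSpec`, `CHECKPOINTS`, and `load_gradspo_pipeline` are hypothetical names introduced here for illustration, and the local folders `./sdxl` and `./sd1_5` are assumed to be where the weights were downloaded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointSpec:
    """Pairs a LoRA folder with the base model it must be loaded on."""
    base_model: str      # Hugging Face ID of the base model
    pipeline_class: str  # name of the diffusers pipeline class
    lora_path: str       # assumed local folder holding the LoRA weights


# Hypothetical lookup table; adjust lora_path to your local layout.
CHECKPOINTS = {
    "sdxl": CheckpointSpec(
        base_model="stabilityai/stable-diffusion-xl-base-1.0",
        pipeline_class="StableDiffusionXLPipeline",
        lora_path="./sdxl",
    ),
    "sd1_5": CheckpointSpec(
        base_model="sd-legacy/stable-diffusion-v1-5",
        pipeline_class="StableDiffusionPipeline",
        lora_path="./sd1_5",
    ),
}


def load_gradspo_pipeline(variant, device="cuda"):
    """Build the base pipeline for `variant` and attach its LoRA weights."""
    if variant not in CHECKPOINTS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(CHECKPOINTS)}"
        )
    spec = CHECKPOINTS[variant]

    # Imported lazily so the lookup table can be inspected without a GPU stack.
    import diffusers
    import torch

    pipeline_cls = getattr(diffusers, spec.pipeline_class)
    pipe = pipeline_cls.from_pretrained(spec.base_model, torch_dtype=torch.float16)
    pipe.load_lora_weights(spec.lora_path)
    return pipe.to(device)


# Example (requires diffusers, torch, and a CUDA device):
#   pipe = load_gradspo_pipeline("sdxl")
```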

---

## Citation

If you find GradSPO useful in your research, please consider citing our work:

```bibtex
@inproceedings{
tee2025a,
title={A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models},
author={Joshua Tian Jin Tee and Hee Suk Yoon and Abu Hanif Muhammad Syarubany and Eunseop Yoon and Chang D. Yoo},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://openreview.net/forum?id=d6lIOnvOX2}
}
```