---
language: "en"
license: "apache-2.0"
tags:
  - text-to-image
  - stable-diffusion
  - diffusion
  - lora
datasets:
  - custom
library_name: "diffusers"
pipeline_tag: "text-to-image"
---


# GradSPO: A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models


This repository provides **public LoRA checkpoints trained with GradSPO** for **Stable Diffusion v1.5** and **SDXL**.

**GradSPO** reframes **stepwise preference optimization (SPO)** as learning from **noisy reward signals** and explicitly reduces that noise through **gradient guidance**. This yields **stronger reward signals** and **improved preference alignment**.

All released checkpoints are **LoRA weights only** and must be loaded on top of their corresponding base models.

The official training code is available at:  
https://github.com/JoshuaTTJ/GradSPO

---

## Usage

### SDXL (LoRA)

```python
from diffusers import StableDiffusionXLPipeline
import torch

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
)

# Load the SDXL GradSPO LoRA on top of the SDXL base model
pipe.load_lora_weights("./sdxl")

pipe = pipe.to("cuda")

prompt = "A cat holding a sign that says hello world"

generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    prompt=prompt,
    guidance_scale=5.0,
    num_inference_steps=20,
    generator=generator,
    output_type="pil",
).images[0]

image.save("img_sdxl.png")
```

---

### Stable Diffusion v1.5 (LoRA)

```python
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "sd-legacy/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)

# Load the SD v1.5 GradSPO LoRA on top of the SD v1.5 base model
pipe.load_lora_weights("./sd1_5")

pipe = pipe.to("cuda")

prompt = "a photo of a cat"

generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    prompt=prompt,
    guidance_scale=5.0,
    num_inference_steps=20,
    generator=generator,
    output_type="pil",
).images[0]

image.save("img_sd15.png")
```
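
Because each LoRA only works with its own base model, a small lookup can help avoid pairing the wrong folder with the wrong pipeline. The sketch below is illustrative only: `LORA_SETUPS` and `get_lora_setup` are hypothetical names, not part of this repository or of `diffusers`, and the folder names `./sd1_5` and `./sdxl` are assumed to hold the weights for the variant they are named after.

```python
# Hypothetical helper (not part of GradSPO or diffusers): pairs each LoRA
# folder with the base model and pipeline class it is meant for.
LORA_SETUPS = {
    "sd15": {
        "pipeline": "StableDiffusionPipeline",
        "base_model": "sd-legacy/stable-diffusion-v1-5",
        "lora_path": "./sd1_5",  # assumed local folder name
    },
    "sdxl": {
        "pipeline": "StableDiffusionXLPipeline",
        "base_model": "stabilityai/stable-diffusion-xl-base-1.0",
        "lora_path": "./sdxl",  # assumed local folder name
    },
}


def get_lora_setup(variant: str) -> dict:
    """Return the pipeline name, base model id, and LoRA path for a variant."""
    key = variant.strip().lower()
    if key not in LORA_SETUPS:
        known = ", ".join(sorted(LORA_SETUPS))
        raise ValueError(f"Unknown variant {variant!r}; expected one of: {known}")
    return LORA_SETUPS[key]
```

The returned `pipeline` string names a class exported by `diffusers`, so it can be resolved with `getattr(diffusers, setup["pipeline"])` before calling `from_pretrained(setup["base_model"], ...)` and `load_lora_weights(setup["lora_path"])`.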

---

## Citation

If you find GradSPO useful in your research, please consider citing our work:

```bibtex
@inproceedings{
tee2025a,
title={A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models},
author={Joshua Tian Jin Tee and Hee Suk Yoon and Abu Hanif Muhammad Syarubany and Eunseop Yoon and Chang D. Yoo},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://openreview.net/forum?id=d6lIOnvOX2}
}
```