---
library_name: diffusers
pipeline_tag: text-to-image
tags:
- stable-diffusion-xl
- stable-diffusion
- text-to-image-models
- model-compression
- pruning
license: openrail++
---



# OBS-Diff Structured Pruning for Stable Diffusion XL Base 1.0

<div style="
    display: flex; 
    flex-wrap: wrap; 
    align-items: flex-start; 
    gap: 20px; 
    border: 1px solid #e0e0e0; 
    padding: 20px; 
    border-radius: 10px; 
    margin-bottom: 20px; 
    background-color: #fff;
">
  
  <div style="flex: 1; min-width: 280px; max-width: 100%;">
    <img src="teaser.jpg" alt="OBS-Diff" style="width: 100%; height: auto; border-radius: 5px;" />
  </div>

  <div style="flex: 2; min-width: 300px;">
    <h4 style="margin-top: 0;">✂️ <a href="https://alrightlone.github.io/OBS-Diff-Webpage/">OBS-Diff: Accurate Pruning for Diffusion Models in One-Shot</a></h4>
    <p>
      <em><b>Junhan Zhu</b>, Hesong Wang, Mingluo Su, Zefang Wang, Huan Wang*</em>
      <br>
      <a href="https://arxiv.org/abs/2510.06751"><img src="https://img.shields.io/badge/Preprint-arXiv-b31b1b.svg?style=flat-square"></a>
      <a href="https://github.com/Alrightlone/OBS-Diff"><img src="https://img.shields.io/github/stars/Alrightlone/OBS-Diff?style=flat-square&logo=github"></a>
    </p>
    <p>
      The <b>first training-free, one-shot pruning framework</b> for diffusion models, supporting diverse architectures and pruning granularities. It uses Optimal Brain Surgeon (OBS) to achieve <b>state-of-the-art</b> compression with high generative quality.
    </p>
  </div>

</div>

OBS-Diff-SDXL provides a collection of structured-pruned checkpoints for the Stable Diffusion XL (SDXL) base model, compressed with the OBS-Diff framework. Its one-shot, training-free pruning algorithm reduces the UNet from 2.57B parameters to between 2.35B and 1.91B while maintaining high-fidelity image generation. The provided variants cover sparsity levels from 10% to 30%, so you can trade model size against generation quality.

![](sdxl1.png)
![](sdxl2.png)
![](sdxl3.png)

### Pruned UNet Variants

| Sparsity (%) | 0 (Dense) | 10 | 15 | 20 | 25 | 30 |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| **Params (B)** | 2.57 | 2.35 | 2.24 | 2.13 | 2.02 | 1.91 |
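
To make the trade-off in the table concrete, here is a small sketch that derives the relative parameter reduction and an approximate fp16 weight size for each variant. The parameter counts are copied from the table; the function names and the assumption of 2 bytes per parameter (fp16, with 1 GB = 1e9 bytes) are illustrative and not part of the released code.

```python
# Parameter counts in billions, copied from the table above.
UNET_PARAMS_B = {0: 2.57, 10: 2.35, 15: 2.24, 20: 2.13, 25: 2.02, 30: 1.91}


def reduction_pct(sparsity: int) -> float:
    """Percentage of UNet parameters removed relative to the dense model."""
    dense = UNET_PARAMS_B[0]
    return 100.0 * (dense - UNET_PARAMS_B[sparsity]) / dense


def fp16_size_gb(sparsity: int) -> float:
    """Approximate weight size in GB, assuming 2 bytes per parameter (fp16)."""
    return UNET_PARAMS_B[sparsity] * 2.0


if __name__ == "__main__":
    for s in sorted(UNET_PARAMS_B):
        print(
            f"sparsity {s:>2}%: {UNET_PARAMS_B[s]:.2f} B params, "
            f"{reduction_pct(s):.1f}% fewer than dense, "
            f"~{fp16_size_gb(s):.2f} GB in fp16"
        )
```

By this arithmetic the 30% variant has roughly 25.7% fewer UNet parameters than the dense model. The whole-UNet reduction is smaller than the nominal sparsity, which suggests the sparsity ratio is applied to a prunable subset of the layers rather than to every parameter.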

### How to use the pruned model

1. Download the base model (SDXL) from [Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) or ModelScope.

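2. Download the pruned weights (`.pth` file) for the sparsity level you want. Each file stores the full pruned UNet module, so load it with `torch.load(..., weights_only=False)` and assign the result to `pipe.unet` in place of the original UNet.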

3. Run inference using the code below.

```python
import torch
from diffusers import DiffusionPipeline


# 1. Load the base SDXL model
pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16)

# 2. Swap the original UNet with the pruned UNet checkpoint
# Note: Ensure the path points to your downloaded .pth file
pruned_unet_path = "/path/to/sparsity_30/unet_pruned.pth"
pipe.unet = torch.load(pruned_unet_path, weights_only=False)
pipe = pipe.to("cuda")

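# 3. (Optional) Report the pruned UNet's parameter count
total_params = sum(p.numel() for p in pipe.unet.parameters())
print(f"Total UNet parameters: {total_params / 1e6:.2f} M")

# 4. Generate an image with a fixed seed for reproducibility
image = pipe(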
    prompt="A ship sailing through a sea of clouds, golden hour, impasto oil painting, brush strokes visible, dreamlike atmosphere.",
    negative_prompt=None,
    height=1024,
    width=1024,
    num_inference_steps=30,
    guidance_scale=7.0,
    generator=torch.Generator("cuda").manual_seed(42)
).images[0]

image.save("output_pruned.png")
```
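
If you switch between several sparsity levels, a small helper can build the checkpoint path and reject levels that are not in the table. This is only a sketch: the `sparsity_30/unet_pruned.pth` layout comes from the example above, but the assumption that the other levels follow the same `sparsity_<level>/` pattern, and the names `AVAILABLE_SPARSITIES` and `pruned_unet_path`, are hypothetical.

```python
from pathlib import Path

# Sparsity levels listed in the "Pruned UNet Variants" table.
AVAILABLE_SPARSITIES = (10, 15, 20, 25, 30)


def pruned_unet_path(root: str, sparsity: int) -> Path:
    """Return the expected checkpoint path for a given sparsity level.

    Assumes (hypothetically) that every level is stored as
    <root>/sparsity_<level>/unet_pruned.pth, mirroring the 30% example.
    """
    if sparsity not in AVAILABLE_SPARSITIES:
        raise ValueError(
            f"Unsupported sparsity {sparsity}; choose one of {AVAILABLE_SPARSITIES}"
        )
    return Path(root) / f"sparsity_{sparsity}" / "unet_pruned.pth"


if __name__ == "__main__":
    # "/path/to" is a placeholder for your local download directory.
    print(pruned_unet_path("/path/to", 20))
```

The returned path can be passed to `torch.load(path, weights_only=False)` exactly as in the inference script; check the downloaded directory names first, since the layout here is an assumption.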

### Citation
If you find this work useful, please consider citing:

```bibtex
@article{zhu2025obs,
  title={OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot},
  author={Zhu, Junhan and Wang, Hesong and Su, Mingluo and Wang, Zefang and Wang, Huan},
  journal={arXiv preprint arXiv:2510.06751},
  year={2025}
}
```