---
license: cc-by-4.0
base_model: stabilityai/stable-diffusion-3.5-medium
library_name: diffusers
tags:
- diffusion
- stable-diffusion
- text-to-image
- lora
- policy-distillation
- reinforcement-learning
- academic-research
pipeline_tag: text-to-image
---
<h1 align="center">DiffusionOPD</h1>
<h3 align="center">A Unified Perspective of On-Policy Distillation in Diffusion Models</h3>
<div align="center">
<a href="https://arxiv.org/abs/2605.15055"><img src="https://img.shields.io/badge/Paper%20(arXiv)-2605.15055-red?logo=arxiv"></a>
<a href="https://quanhaol.github.io/DiffusionOPD-site/"><img src="https://img.shields.io/badge/Website-green?logo=homepage&logoColor=white"></a>
<a href="https://github.com/ali-vilab/DiffusionOPD"><img src="https://img.shields.io/badge/Code-9E95B7?logo=github"></a>
</div>
## Model Card
This Hugging Face repository hosts the released LoRA checkpoints for **DiffusionOPD**, an on-policy distillation framework for multi-task diffusion alignment.
DiffusionOPD first trains task-specialized teacher models and then distills their capabilities into one unified student along the student's own rollout trajectories. The released checkpoints are intended to support reproducible academic research on diffusion alignment, reward optimization, and on-policy distillation.
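The idea of distilling along the student's own rollouts can be illustrated with a toy sketch. This is not the DiffusionOPD algorithm or its training code: the one-dimensional state, the two hand-written teachers, the linear student, the squared-error loss, and every name below are assumptions made for illustration. The only point it shows is that the teacher targets are queried at states the student itself visits, rather than at states from a fixed dataset.

```python
import random


def teacher_velocity(x: float, task: int) -> float:
    """Hypothetical frozen teacher for one task: pulls x toward the task target (+1 or -1)."""
    return float(task) - x


class Student:
    """One shared (unified) linear velocity model: v(x, task) = w*x + u*task + b."""

    def __init__(self) -> None:
        self.w, self.u, self.b = 0.0, 0.0, 0.0

    def velocity(self, x: float, task: int) -> float:
        return self.w * x + self.u * task + self.b


def student_rollout(student: Student, x0: float, task: int,
                    steps: int = 10, dt: float = 0.1) -> list:
    """Euler-integrate the student's own velocity field; return the visited states."""
    states, x = [], x0
    for _ in range(steps):
        states.append(x)
        x = x + dt * student.velocity(x, task)
    return states


def distill(iterations: int = 500, batch: int = 32,
            lr: float = 0.1, seed: int = 0) -> Student:
    """Fit the student to the task-matched teacher on student-visited states."""
    rng = random.Random(seed)
    student = Student()
    for _ in range(iterations):
        gw = gu = gb = 0.0
        n = 0
        for i in range(batch):
            task = 1 if i % 2 == 0 else -1  # alternate between the two toy tasks
            x0 = rng.gauss(0.0, 1.0)        # start every rollout from noise
            # On-policy step: states come from the student's current rollout
            # and are treated as fixed inputs (no gradient through the rollout).
            for x in student_rollout(student, x0, task):
                err = student.velocity(x, task) - teacher_velocity(x, task)
                gw += 2.0 * err * x
                gu += 2.0 * err * task
                gb += 2.0 * err
                n += 1
        # Plain gradient descent on the mean squared error to the teachers.
        student.w -= lr * gw / n
        student.u -= lr * gu / n
        student.b -= lr * gb / n
    return student


if __name__ == "__main__":
    s = distill()
    print(f"w={s.w:.3f} u={s.u:.3f} b={s.b:.3f}")
```

In this toy setting both teachers are exactly representable by the student, so training should drive the parameters toward `w = -1`, `u = 1`, `b = 0`, at which point the single student reproduces each teacher on its own task.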
## Released Checkpoints
The repository contains LoRA checkpoints trained from `stabilityai/stable-diffusion-3.5-medium`:
| Directory | Description |
| --- | --- |
| `Student/lora` | Unified DiffusionOPD student distilled from multiple task-specialized teachers. |
| `AesTeacher/lora` | Teacher optimized for aesthetic preference alignment. |
| `GenEvalTeacher/lora` | Teacher optimized for GenEval-style compositional generation. |
| `OCRTeacher/lora` | Teacher optimized for OCR/text rendering capability. |
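To fetch a single checkpoint directory instead of the whole repository, a minimal sketch along the following lines may help. It assumes the Hub repository id is `quanhaol/DiffusionOPD` and that each directory in the table holds the adapter files directly; the short keys (`student`, `aesthetic`, ...) and the helper names are hypothetical. The file layout inside each directory is not specified here, so loading the weights is still best done with the official codebase.

```python
import os
from typing import List

REPO_ID = "quanhaol/DiffusionOPD"  # assumed Hub repository id for this model card

# Directory names copied from the table above; the short keys are hypothetical.
CHECKPOINT_DIRS = {
    "student": "Student/lora",
    "aesthetic": "AesTeacher/lora",
    "geneval": "GenEvalTeacher/lora",
    "ocr": "OCRTeacher/lora",
}


def allow_patterns(name: str) -> List[str]:
    """Return a glob pattern matching only the files of one checkpoint directory."""
    try:
        directory = CHECKPOINT_DIRS[name]
    except KeyError:
        choices = ", ".join(sorted(CHECKPOINT_DIRS))
        raise ValueError(f"unknown checkpoint {name!r}; choose from: {choices}") from None
    return [f"{directory}/*"]


def download_checkpoint(name: str, local_dir: str = "checkpoints") -> str:
    """Download one checkpoint directory and return its local path."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=allow_patterns(name),  # skip the other checkpoints
        local_dir=local_dir,
    )
    return os.path.join(root, CHECKPOINT_DIRS[name])


if __name__ == "__main__":
    print(download_checkpoint("student"))
```

Restricting the download with `allow_patterns` avoids pulling the three teacher adapters when only the unified student is needed.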
## Base Model
These checkpoints are LoRA adapters for:
`stabilityai/stable-diffusion-3.5-medium`
Powered by Stability AI.
## Usage
Please use the official DiffusionOPD codebase for loading the LoRA checkpoints and running evaluation:
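```bash
# Get the official codebase
git clone https://github.com/ali-vilab/DiffusionOPD.git
cd DiffusionOPD
# Create and activate an isolated Python 3.10 environment
conda create -n DiffusionOPD python=3.10.16
conda activate DiffusionOPD
# Install PyTorch built for CUDA 12.6, then the project in editable mode
pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu126
pip install -e .
# Run the evaluation script
bash scripts/single_node/eval.sh
```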
The evaluation script supports datasets including `geneval`, `ocr`, `pickscore`, and `drawbench`.
## Intended Use
This project is released for academic research only. It is intended for studying diffusion model alignment, reward-guided optimization, on-policy distillation, and evaluation of text-to-image generation systems.
Users are responsible for ensuring that their use of these checkpoints complies with all applicable laws, research ethics requirements, the Creative Commons Attribution license for this release, and the license terms of the underlying Stability AI model.
## License
The DiffusionOPD released checkpoints and model card are provided under the **Creative Commons Attribution 4.0 International (CC BY 4.0)** license.
Because these checkpoints are based on `stabilityai/stable-diffusion-3.5-medium`, use of the base model is also governed by the Stability AI Community License:
https://huggingface.co/stabilityai/stable-diffusion-3.5-medium/blob/main/LICENSE.md
Required Stability AI attribution:
> "This Stability AI Model is licensed under the Stability AI Community License, Copyright © Stability AI Ltd. All Rights Reserved"
> Powered by Stability AI
## Citation
If you find this project useful, please cite:
```bibtex
@article{li2026diffusionopd,
title={DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models},
author={Li, Quanhao and Yu, Junqiu and Jiang, Kaixun and Wei, Yujie and Xing, Zhen and Li, Pandeng and Chu, Ruihang and Zhang, Shiwei and Liu, Yu and Wu, Zuxuan},
journal={arXiv preprint arXiv:2605.15055},
year={2026}
}
```
## Acknowledgements
We thank the [Flow-GRPO](https://github.com/yifan123/flow_grpo), [DiffusionNFT](https://github.com/NVlabs/DiffusionNFT), and [Stable Diffusion 3.5 Medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) projects.