jacklishufan/diffusion-kto

jacklishufan committed on Apr 22, 2024

Commit 7ee83b5 (verified) · 1 Parent(s): 2c46cc7

Create README.md

Files changed (1)

README.md (added) +58 -0

	@@ -0,0 +1,58 @@

+---
+datasets:
+- yuvalkirstain/pickapic_v2
+library_name: diffusers
+---
+# Diffusion-KTO: Aligning Diffusion Models by Optimizing Human Utility
+<p align="center">
+    <img src="https://github.com/jacklishufan/diffusion-kto/blob/main/assets/teaser.png?raw=true" width="60%"> <br>
+</p>
+This model is fine-tuned from stable-diffusion-v1-5 on the offline human preference dataset pickapic_v2 using KTO.
+### Usage
+```
+import torch
+from diffusers import AutoencoderKL, UNet2DConditionModel, DiffusionPipeline
+
+# Base model that supplies the VAE, text encoder, tokenizer, and scheduler.
+vae_path = model_name = "runwayml/stable-diffusion-v1-5"
+device = "cuda"
+weight_dtype = torch.float16
+
+vae = AutoencoderKL.from_pretrained(
+    vae_path,
+    subfolder="vae",
+)
+# Swap in the Diffusion-KTO fine-tuned UNet in place of the base UNet.
+unet = UNet2DConditionModel.from_pretrained(
+    "jacklishufan/diffusion-kto", subfolder="unet",
+)
+# `device` is not a from_pretrained argument; move the pipeline with .to() instead.
+pipeline = DiffusionPipeline.from_pretrained(
+    model_name,
+    vae=vae,
+    unet=unet,
+).to(device, weight_dtype)
+
+result = pipeline(
+    prompt="Self-portrait oil painting, a beautiful cyborg with golden hair, 8k",
+    num_inference_steps=50,
+    guidance_scale=7.0,
+)
+img = result.images[0]  # first generated image (PIL.Image)
+```
+### Code
+The code is available [here](https://github.com/jacklishufan/diffusion-kto).
+### Citation
+```
+@misc{li2024aligning,
+      title={Aligning Diffusion Models by Optimizing Human Utility},
+      author={Shufan Li and Konstantinos Kallidromitis and Akash Gokul and Yusuke Kato and Kazuki Kozuka},
+      year={2024},
+      eprint={2404.04465},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV}
+}
+```