RealAlign: SD-1.5 LoRA Checkpoints
LoRA weights for Stable Diffusion v1.5, aligned with RealAlign from the paper "When Preference Labels Fall Short: Aligning Diffusion Models from Real Data" (ICML 2026).
| Resource | Link |
|---|---|
| Paper | arXiv:2605.19839 |
| Project page | cwyxx.github.io/RealAlign |
| Code | github.com/Cwyxx/RealAlign |
| Dataset | RealAlign-Dataset |
Summary
RealAlign aligns text-to-image diffusion models using real data as the preference signal: instead of human-annotated preference pairs, it treats a high-quality reference image as the preferred ("win") sample and a perturbed/inpainted version as the non-preferred ("lose") sample. These LoRA adapters are the result of fine-tuning SD-1.5 with RealAlign's two-stage procedure:
- Stage 1: Diffusion-DRO (inverse RL / distributionally-robust objective), LoRA + LoRA-init.
- Stage 2: Diffusion-DPO with LoRA-init, warm-started from the Stage 1 LoRA.
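To make the Stage 2 objective concrete, here is a minimal sketch of the commonly used Diffusion-DPO loss for a single (win, lose) pair, written on scalar denoising errors. It is an illustration under assumptions, not the authors' training code: the function names are hypothetical, `beta` is assumed to absorb the timestep weighting, and the error values are made up. Stage 1 (Diffusion-DRO) is not sketched here.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def diffusion_dpo_loss(
    err_win_model: float,
    err_win_ref: float,
    err_lose_model: float,
    err_lose_ref: float,
    beta: float = 1.0,
) -> float:
    """Diffusion-DPO loss for one pair (hypothetical helper).

    Each `err_*` is a noise-prediction MSE at one sampled timestep:
    `*_model` from the network being trained, `*_ref` from the frozen
    reference network. In RealAlign's setup the "win" sample is the real
    reference image and the "lose" sample is its perturbed version.
    """
    win_gap = err_win_model - err_win_ref      # < 0 if the model improves on the win image
    lose_gap = err_lose_model - err_lose_ref   # > 0 if the model gets worse on the lose image
    return -log_sigmoid(-beta * (win_gap - lose_gap))


# Model identical to the reference: both gaps are zero, so the loss is log(2).
baseline = diffusion_dpo_loss(0.10, 0.10, 0.20, 0.20)

# Model denoises the real image better and the perturbed image worse:
# the loss drops below the baseline.
improved = diffusion_dpo_loss(0.08, 0.10, 0.25, 0.20)

print(baseline, improved)
```

The loss only depends on how the trained model's errors move relative to the reference model, which is why a warm start from the Stage 1 LoRA changes the starting point but not the form of the objective.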
Checkpoints
Each file is a separate LoRA, trained on preference pairs from a different curation source:
| File | Training source |
|---|---|
| `HPDv3.safetensors` | HPDv3 (real-photo references) |
| `Civitai-top.safetensors` | Civitai top SFW images |
| `Pick-a-pic-v2.safetensors` | Pick-a-Pic v2 (top subset) |
- Format: diffusers-style UNet LoRA (`unet.*.lora_A/lora_B.weight`), fp32.
- LoRA rank: 4, applied to the UNet self- and cross-attention projections (`to_q`, `to_k`, `to_v`, `to_out`).
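As a rough illustration of what rank 4 means for adapter size, the sketch below counts the parameters in one `lora_A`/`lora_B` pair and compares them with the full projection matrix. The helper names are hypothetical, and the layer widths (320 hidden channels, 768-dimensional text embeddings) are assumptions based on the usual SD-1.5 UNet configuration rather than values read from these files.

```python
def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in one LoRA pair.

    lora_A.weight has shape (rank, in_features) and
    lora_B.weight has shape (out_features, rank).
    """
    return rank * in_features + out_features * rank


def full_param_count(in_features: int, out_features: int) -> int:
    """Parameters in the full (bias-free) projection weight."""
    return in_features * out_features


RANK = 4  # from the model card

# Assumed example: a self-attention to_q projection, 320 -> 320.
self_attn_lora = lora_param_count(320, 320, RANK)
self_attn_full = full_param_count(320, 320)

# Assumed example: a cross-attention to_k projection, 768 -> 320.
cross_attn_lora = lora_param_count(768, 320, RANK)

print(self_attn_lora, self_attn_full, cross_attn_lora)
print(f"LoRA / full = {self_attn_lora / self_attn_full:.3%}")
```

Under these assumed shapes, a single adapted projection stores a few thousand parameters instead of roughly a hundred thousand, which is why each checkpoint file stays small even in fp32.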
Note: The Pick-a-Pic v2 LoRA is included here, but the corresponding image dataset is not released on the Hub because the source data may contain NSFW content (see the dataset card).
Usage
```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# Pick the LoRA trained on the source you want
pipe.load_lora_weights(
    "Xixixixihahahaha/RealAlign-SD-1.5",
    weight_name="HPDv3.safetensors",
)

image = pipe("a photo of an astronaut riding a horse on the moon").images[0]
image.save("out.png")
```
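To compare the three checkpoints on the same prompt, one option is to loop over the weight files with a fixed seed and unload each LoRA before loading the next. The sketch below is one way to do that, assuming a CUDA device and a recent diffusers release; `compare_checkpoints` and `output_name` are hypothetical helpers, not part of this repository.

```python
from pathlib import Path

REPO_ID = "Xixixixihahahaha/RealAlign-SD-1.5"
BASE_MODEL = "stable-diffusion-v1-5/stable-diffusion-v1-5"
CHECKPOINTS = [
    "HPDv3.safetensors",
    "Civitai-top.safetensors",
    "Pick-a-pic-v2.safetensors",
]


def output_name(weight_name: str) -> str:
    """Map a checkpoint file name to the image file it will produce."""
    return Path(weight_name).stem + ".png"


def compare_checkpoints(prompt: str, seed: int = 0, device: str = "cuda") -> None:
    """Generate one image per LoRA with the same prompt and seed."""
    # Heavy imports are kept local so the helpers above work without a GPU stack.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        BASE_MODEL, torch_dtype=torch.float16
    ).to(device)

    for weight_name in CHECKPOINTS:
        pipe.load_lora_weights(REPO_ID, weight_name=weight_name)
        # Re-create the generator each time so every LoRA starts from the same noise.
        generator = torch.Generator(device=device).manual_seed(seed)
        image = pipe(prompt, generator=generator).images[0]
        image.save(output_name(weight_name))
        # Remove the current LoRA so the adapters do not stack.
        pipe.unload_lora_weights()


# Example call (downloads the base model and the LoRA files):
# compare_checkpoints("a photo of an astronaut riding a horse on the moon")
```

Fixing the seed keeps the initial latent noise identical across runs, so visible differences between the output images come from the LoRA weights rather than from sampling randomness.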
Citation
@article{chen2026preference,
title={When Preference Labels Fall Short: Aligning Diffusion Models from Real Data},
author={Chen, Weiyan and Deng, Weijian and Xiao, Yao and Tu, Weijie and Dong, ZiYi and Radwan, Ibrahim and Lin, Liang and Wei, Pengxu},
journal={arXiv preprint arXiv:2605.19839},
year={2026}
}