RealAlign: SD-3.5-M LoRA Checkpoints
LoRA adapters for Stable Diffusion 3.5 Medium, aligned with RealAlign from the paper "When Preference Labels Fall Short: Aligning Diffusion Models from Real Data" (ICML 2026).
| Resource | Link |
|---|---|
| Paper | arXiv:2605.19839 |
| Project page | cwyxx.github.io/RealAlign |
| Code | github.com/Cwyxx/RealAlign |
| Dataset | RealAlign-Dataset |
Summary
RealAlign aligns text-to-image diffusion models using real data as the preference signal: instead of human-annotated preference pairs, it treats a high-quality reference image as the preferred ("win") sample and a perturbed/inpainted version as the non-preferred ("lose") sample. These adapters are the result of fine-tuning SD-3.5-M with RealAlign's two-stage procedure:
- Stage 1: Diffusion-DRO (inverse RL / distributionally-robust objective), LoRA-init.
- Stage 2: Diffusion-DPO with LoRA-init, warm-started from the Stage 1 LoRA.
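To make the role of the win/lose pair in Stage 2 concrete, here is a minimal sketch of a per-pair Diffusion-DPO-style loss. It assumes the commonly published form of the objective, with any timestep weighting folded into `beta`; whether RealAlign uses exactly this weighting is not stated on this card. The function names, the scalar error inputs, and `beta` are hypothetical illustration, not the repository's code.

```python
import math


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def dpo_pair_loss(err_win: float, err_win_ref: float,
                  err_lose: float, err_lose_ref: float,
                  beta: float) -> float:
    """Loss for one (win, lose) pair at one noised timestep.

    err_*      : squared denoising error of the model being trained
    err_*_ref  : squared denoising error of the frozen reference model
    beta       : hypothetical strength hyperparameter (assumption)
    """
    # Negative margin means the trained model improved more on the
    # win sample (real reference image) than on the lose sample.
    margin = (err_win - err_win_ref) - (err_lose - err_lose_ref)
    # -log(sigmoid(-beta * margin)) is the same as softplus(beta * margin).
    return softplus(beta * margin)
```

When the trained model matches the reference on both samples the loss equals log 2; it falls below that as the model fits the win sample better than the reference does while fitting the lose sample worse.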
Checkpoints
Each subfolder is a PEFT LoRA adapter trained on preference pairs from a different curation source:
| Subfolder | Training source |
|---|---|
| HPDv3/ | HPDv3 (real-photo references) |
| Civitai-top/ | Civitai top SFW images |
Each subfolder contains adapter_config.json and adapter_model.safetensors.
- Format: PEFT LoRA on the SD3Transformer2DModel.
- LoRA config: r=32, lora_alpha=64, gaussian init, applied to the joint-attention projections (to_q/k/v, to_out, add_q/k/v_proj, to_add_out).
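For readers who want to reproduce a comparable adapter setup, the configuration above can be sketched as PEFT `LoraConfig` arguments. The rank, alpha, and init values come from this card; the expanded module names (in particular `to_out.0`) are an assumption about how the shorthand maps onto diffusers attention layers, so treat each subfolder's adapter_config.json as the authoritative source.

```python
# Values from the card; target_modules is an assumed expansion of the
# shorthand "to_q/k/v, to_out, add_q/k/v_proj, to_add_out".
LORA_KWARGS = {
    "r": 32,
    "lora_alpha": 64,
    "init_lora_weights": "gaussian",
    "target_modules": [
        "to_q", "to_k", "to_v", "to_out.0",
        "add_q_proj", "add_k_proj", "add_v_proj", "to_add_out",
    ],
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]

try:
    # Optional: build the actual config object when peft is installed.
    from peft import LoraConfig

    lora_config = LoraConfig(**LORA_KWARGS)
except ImportError:
    lora_config = None
```

With these values the update is scaled by 2.0, which is worth keeping in mind when comparing against adapters trained with alpha equal to the rank.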
Note: The Pick-a-Pic v2 subset used in the paper is not released here because the source image data may contain NSFW content (see the dataset card).
Usage
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-medium",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Load the adapter trained on the source you want (subfolder = HPDv3 or Civitai-top)
pipe.load_lora_weights(
    "Xixixixihahahaha/RealAlign-SD-3.5-M",
    subfolder="HPDv3",
)

image = pipe("a photo of an astronaut riding a horse on the moon").images[0]
image.save("out.png")
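To compare the two released adapters on the same prompt, one option is to load both under separate adapter names and switch between them with a fixed seed. The sketch below is untested against this repository and assumes a recent diffusers release with PEFT-backed LoRA support; the adapter labels, the `compare_adapters` helper, and passing `weight_name` explicitly (using the file name listed under Checkpoints) are illustrative choices, not part of the original instructions.

```python
REPO_ID = "Xixixixihahahaha/RealAlign-SD-3.5-M"
BASE_ID = "stabilityai/stable-diffusion-3.5-medium"

# Hypothetical adapter label -> subfolder from the Checkpoints table.
ADAPTERS = {"hpdv3": "HPDv3", "civitai_top": "Civitai-top"}


def compare_adapters(prompt: str, seed: int = 0, device: str = "cuda") -> dict:
    """Generate one image per adapter from the same prompt and seed."""
    # Heavy imports stay inside the function so the module imports cheaply.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        BASE_ID, torch_dtype=torch.bfloat16
    ).to(device)

    # Register each adapter under its own name.
    for name, subfolder in ADAPTERS.items():
        pipe.load_lora_weights(
            REPO_ID,
            subfolder=subfolder,
            weight_name="adapter_model.safetensors",
            adapter_name=name,
        )

    images = {}
    for name in ADAPTERS:
        pipe.set_adapters(name)  # activate only this adapter
        # Re-seed so both adapters start from the same initial noise.
        generator = torch.Generator(device=device).manual_seed(seed)
        images[name] = pipe(prompt, generator=generator).images[0]
    return images
```

Calling `compare_adapters("a photo of an astronaut riding a horse on the moon")` returns a dict of PIL images keyed by adapter label, which can then be saved side by side.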
Citation
@article{chen2026preference,
  title={When Preference Labels Fall Short: Aligning Diffusion Models from Real Data},
  author={Chen, Weiyan and Deng, Weijian and Xiao, Yao and Tu, Weijie and Dong, ZiYi and Radwan, Ibrahim and Lin, Liang and Wei, Pengxu},
  journal={arXiv preprint arXiv:2605.19839},
  year={2026}
}