---
library_name: diffusers
pipeline_tag: text-to-image
tags:
- safety
- classifier-guidance
- stable-diffusion
- plug-and-play
license: apache-2.0
---
# Safe Diffusion Guidance (SDG) — plug-and-play safety layer for Stable Diffusion
Safe Diffusion Guidance (SDG) is a classifier-guided denoising layer that steers the sampling trajectory away from unsafe content without retraining the base model.
It works standalone with SD 1.4 / 1.5 / 2.1 and composes cleanly with ESD/UCE/SLD.
- Safety signal: a 5-class mid-UNet feature classifier (classes: gore, hate, medical, safe, sexual) trained on (1280×8×8) features.
- Controls: `safety_scale` (strength), `mid_fraction` (fraction of steps guided).
- Plug-in: drop into any SD pipeline, or stack on top of ESD/UCE/SLD.
- No retraining: small gradient nudges to latents during denoising.
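
The "gradient nudge" above is standard classifier guidance applied to the safety classifier. A minimal sketch of one such nudge, assuming a callable that maps latents to mid-UNet features and a classifier over those features (the names `sdg_nudge`, `mid_features`, and `safety_classifier` are illustrative, not this repo's actual API):

```python
import torch

def sdg_nudge(latents, mid_features, safety_classifier,
              safe_class_index=3, safety_scale=5.0):
    """Nudge latents toward the 'safe' class via classifier gradients."""
    latents = latents.detach().requires_grad_(True)
    # In the real pipeline, `mid_features` would produce the (1280, 8, 8)
    # mid-UNet activations for these latents; here it is just a callable.
    logits = safety_classifier(mid_features(latents))
    # Gradient of log p(safe | features) w.r.t. the latents.
    log_p_safe = torch.log_softmax(logits, dim=-1)[:, safe_class_index]
    grad = torch.autograd.grad(log_p_safe.sum(), latents)[0]
    # Small step up the safety log-probability; the base model is untouched.
    return (latents + safety_scale * grad).detach()
```

The key property is that only the latents move; no weights of the base UNet or the classifier are updated, which is what makes the layer retraining-free.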
Note on metrics (matching our paper): FID/KID are computed vs. baseline model outputs rather than real images; baseline FID/KID are ≈0 by construction.
## Quickstart (SD 1.5)
```python
import torch
from diffusers import DiffusionPipeline, StableDiffusionPipeline

# 1) Load the base SD pipeline (disable the default safety checker)
base = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    safety_checker=None,
).to("cuda")

# 2) Load the SDG custom pipeline from the Hub (this repo)
sdg = DiffusionPipeline.from_pretrained(
    "your-org/safe-diffusion-guidance",
    custom_pipeline="safe_diffusion_guidance",
    torch_dtype=torch.float16,
).to("cuda")

# 3) Generate with safety guidance
img = sdg(
    base_pipe=base,
    prompt="portrait photograph, studio light, 85mm, realistic",
    num_inference_steps=50,
    guidance_scale=7.5,
    safety_scale=5.0,    # strength: ~2–8 (Light→Strong)
    mid_fraction=1.0,    # fraction of steps to guide: e.g. 0.5, 0.8, 1.0
    safe_class_index=3,  # index of 'safe' in [gore, hate, medical, safe, sexual]
).images[0]
img.save("sdg_safe_output.png")
```
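
One way to read `mid_fraction` (an assumption about the pipeline's internals, not documented behavior): the safety nudge is applied only during the first fraction of the denoising schedule, where content layout is still being decided. A toy helper makes the arithmetic concrete (`guided_steps` is hypothetical, for illustration only):

```python
def guided_steps(num_inference_steps: int, mid_fraction: float) -> list[int]:
    """Indices of denoising steps that would receive the safety nudge,
    assuming guidance covers the earliest `mid_fraction` of the schedule."""
    n_guided = int(round(num_inference_steps * mid_fraction))
    return list(range(n_guided))
```

Under this reading, `mid_fraction=0.5` with 50 steps guides steps 0–24 and leaves the final refinement steps untouched, trading a weaker safety signal for less drift from the base model's output.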