---
library_name: diffusers
license: mit
pipeline_tag: text-to-image
tags:
- text-to-image
- safety-alignment
- stable-diffusion
- ECCV
---
# SAGE: Structure-Aware Geometric Regularization (ECCV-26)

**Paper:** [The Illusion of High Utility in Safety Alignment of Text-to-Image Diffusion Models](https://huggingface.co/papers/2607.00402)

**Authors:** Adeel Yousaf, Soumik Ghosh, James Beetham, Amrit Singh Bedi, Mubarak Shah

**Institution:** University of Central Florida

**Project Page:** [https://adeelyousaf.github.io/SAGE_ECCV26_Project_Page/](https://adeelyousaf.github.io/SAGE_ECCV26_Project_Page/)

---
## Overview

We show that existing text-to-image (T2I) safety alignment methods create an **illusion of high utility**: they appear to preserve utility under coarse metrics (FID, CLIPScore) but suffer significant drops in fine-grained semantic fidelity (TIFA). We trace this to **semantic collapse** in the text encoder embedding space.

**SAGE** is a geometry-aware safety alignment method that preserves embedding spread and local similarity structure during fine-tuning. It changes TIFA by only **−1.2%**, compared with **−6.2% for DES**, while maintaining strong safety (Avg. ASR 1.2%).

---
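The two geometric properties named in the Overview, embedding spread and local similarity structure, can be made concrete with simple diagnostics. The following is an illustrative sketch only and is not the SAGE objective from the paper. It assumes that spread is measured as the mean pairwise cosine distance and that local structure is measured as the overlap of k-nearest-neighbor sets before and after fine-tuning. All function names and the toy vectors are hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence, Set

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def spread(embs: Sequence[Vector]) -> float:
    """Mean pairwise cosine distance; values near 0 suggest collapse."""
    pairs = list(combinations(range(len(embs)), 2))
    return sum(1.0 - cosine(embs[i], embs[j]) for i, j in pairs) / len(pairs)


def knn(embs: Sequence[Vector], i: int, k: int) -> Set[int]:
    """Indices of the k embeddings most similar to embs[i] (excluding i)."""
    others = [j for j in range(len(embs)) if j != i]
    others.sort(key=lambda j: cosine(embs[i], embs[j]), reverse=True)
    return set(others[:k])


def knn_overlap(before: Sequence[Vector], after: Sequence[Vector], k: int) -> float:
    """Average fraction of each point's k nearest neighbors that is preserved."""
    n = len(before)
    kept = sum(len(knn(before, i, k) & knn(after, i, k)) / k for i in range(n))
    return kept / n


# Toy 2-D "embeddings": four well-separated directions vs. a collapsed set.
before = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
collapsed = [(1.0, 0.0), (1.0, 0.1), (1.0, -0.1), (1.0, 0.05)]

print("spread before:   ", spread(before))
print("spread collapsed:", spread(collapsed))
print("1-NN overlap:    ", knn_overlap(before, collapsed, k=1))
```

In practice these diagnostics would be applied to text encoder outputs for a fixed set of prompts, computed once with the base encoder and once with the fine-tuned one.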
## Use this Model

```python
import torch
from diffusers import StableDiffusionPipeline
from huggingface_hub import hf_hub_download

# Load the base Stable Diffusion v1.4 pipeline
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

# Download the SAGE text encoder weights and load them into the pipeline
ckpt_path = hf_hub_download(repo_id="Adeely93/SAGE", filename="SAGE.pt")
pipe.text_encoder.load_state_dict(torch.load(ckpt_path, map_location="cpu"))

# Move the pipeline to the GPU and generate an image
pipe = pipe.to("cuda")
image = pipe("a photo of a dog in a park").images[0]
```