
---
title: Stable Diffusion Textual Inversion + Custom Contrast Guidance
emoji: 🎭
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: creativeml-openrail-m
---

Stable Diffusion Textual Inversion + Custom Contrast Guidance

This Space lets you explore 5 textual inversion styles from the sd-concepts-library and a custom contrast-based guidance variant on top of Stable Diffusion v1.5.

Models and Styles

  • Base model: runwayml/stable-diffusion-v1-5

  • Textual inversion concepts (styles):

    • <birb-style>: sd-concepts-library/birb-style
    • <moebius>: sd-concepts-library/moebius
    • <midjourney-style>: sd-concepts-library/midjourney-style
    • <wlop-style>: sd-concepts-library/wlop-style
    • <line-art>: sd-concepts-library/line-art

These are loaded as learned embeddings and used directly in the text prompt.
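Concretely, the concept registration and prompt assembly might look like the sketch below. The dictionary and helper names are illustrative, not the actual code in app.py; only `pipe.load_textual_inversion` is the real Diffusers API.

```python
# Sketch: registering the five sd-concepts-library styles.
# STYLE_CONCEPTS, build_prompt, and load_concepts are hypothetical names.
STYLE_CONCEPTS = {
    "<birb-style>": "sd-concepts-library/birb-style",
    "<moebius>": "sd-concepts-library/moebius",
    "<midjourney-style>": "sd-concepts-library/midjourney-style",
    "<wlop-style>": "sd-concepts-library/wlop-style",
    "<line-art>": "sd-concepts-library/line-art",
}

def build_prompt(prompt: str, style_token: str) -> str:
    """Append the selected style token to the user's prompt."""
    return f"{prompt}, {style_token}"

def load_concepts(pipe):
    """Load each concept's learned embedding into the pipeline's text encoder."""
    for token, repo_id in STYLE_CONCEPTS.items():
        pipe.load_textual_inversion(repo_id, token=token)
```

Once loaded, the tokens behave like ordinary words in the prompt, so no further plumbing is needed at generation time.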

What the app does

  1. Baseline mode

    • Runs standard Stable Diffusion v1.5 with classifier-free guidance.

    • Uses your prompt plus the selected style token, e.g.:

      "A campfire oil painting at night, <birb-style>"
      
    • No additional loss or guidance beyond the usual text conditioning.

  2. Contrast variant mode

    • Uses the same base sampling loop and seed as baseline.
    • On later diffusion steps, applies a custom contrast-like adjustment in latent space:
      • Measures variance of the predicted “clean” latents.
      • Applies a deterministic update that pushes latents towards higher variance (higher contrast).
    • This is a lightweight, creative variant of the “blue_loss” idea from the Stable Diffusion Deep Dive notebook; rather than operating on RGB channels, it works directly on latents.

The result is a pair of images (baseline vs contrast variant) with the same prompt, style, and seed but a different “feel” due to the extra contrast guidance.
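The contrast update described above can be sketched as follows. This is an illustrative reconstruction, not the exact code in app.py: the function name, the step-size factor, the analytic variance gradient, and the simplification of treating d(pred_x0)/d(latents) as the identity are all assumptions (the real app may backpropagate through the scheduler instead).

```python
import numpy as np

def contrast_adjust(latents, pred_x0, contrast_scale=10.0, step_size=0.01):
    """Nudge latents toward higher variance of the predicted clean latents.

    Loss = -Var(pred_x0); its gradient w.r.t. pred_x0 is -2 * (pred_x0 - mean) / N.
    Taking a gradient-descent step on that loss therefore pushes values away
    from their mean, i.e. toward higher variance ("higher contrast").
    """
    n = pred_x0.size
    grad = -2.0 * (pred_x0 - pred_x0.mean()) / n  # d(-Var)/d(pred_x0)
    # Simplification: apply the gradient to the latents directly.
    return latents - step_size * contrast_scale * grad
```

Applied only on later diffusion steps, an update like this adjusts the denoising trajectory toward higher-contrast latents while leaving the early, structure-forming steps untouched.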

How to use

  1. Prompt

    • Type any text prompt in the Prompt box.
    • Examples:
      • A campfire oil painting at night
      • A cinematic portrait of a wizard reading
      • A futuristic cityscape at sunrise
  2. Style (concept)

    • Choose one of the 5 textual inversion styles from the dropdown.
    • Internally the style token (e.g. <birb-style>) is appended to your prompt.
  3. Seed

    • Set a seed to make runs reproducible.
    • Use the same seed in both modes if you want a direct comparison.
  4. Steps

    • Number of diffusion steps (more steps are slower but usually give better results).
    • 30–40 is a good starting point.
  5. Guidance scale

    • Classifier-free guidance strength (text adherence).
    • Typical values: 7–10.
  6. Mode

    • Baseline: standard Stable Diffusion with the selected style.
    • Contrast variant: same setup, with additional latent contrast guidance.
  7. Contrast scale

    • Controls how strong the contrast adjustment is in the variant mode.
    • Start low (around 5–15). Very high values can produce noisy, abstract images.
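The controls above map onto a single pipeline call. The wrapper below is a hypothetical sketch (the function name and defaults are assumptions); for a direct baseline-vs-variant comparison, pass a freshly seeded generator to each run so both start from the same initial noise.

```python
def generate(pipe, prompt, style_token, steps=35, guidance=7.5, generator=None):
    """Hypothetical wrapper: one call per mode with identical settings.

    For reproducible runs, build a seeded generator per call, e.g.:
        generator = torch.Generator("cuda").manual_seed(seed)
    and reuse the same seed for baseline and contrast-variant runs.
    """
    out = pipe(
        prompt=f"{prompt}, {style_token}",   # style token appended internally
        num_inference_steps=steps,            # the Steps slider
        guidance_scale=guidance,              # the Guidance scale slider
        generator=generator,                  # the Seed control
    )
    return out.images[0]
```

The keyword arguments shown (`num_inference_steps`, `guidance_scale`, `generator`) are the standard `StableDiffusionPipeline.__call__` parameters.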

Implementation notes

  • Everything runs through StableDiffusionPipeline from 🤗 Diffusers.
  • Textual inversion embeddings are loaded via pipe.load_textual_inversion for each concept.
  • The contrast variant reuses the same scheduler, UNet, VAE, tokenizer, and text encoder as the baseline; only the update rule for latents is modified.
  • Safety checker is disabled here for educational use; please enable it for any public-facing or production deployment.
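As a configuration sketch, the setup these notes describe might look like the fragment below; the dtype and device choices are assumptions, while `from_pretrained`, `safety_checker=None`, and `load_textual_inversion` are the real Diffusers API.

```python
import torch
from diffusers import StableDiffusionPipeline

# Assumed setup -- app.py may differ in dtype/device handling.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    safety_checker=None,  # disabled for this educational demo; re-enable in production
)
pipe = pipe.to("cuda")

# Register each textual inversion concept as a learned embedding.
pipe.load_textual_inversion("sd-concepts-library/birb-style")
```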

Credits