Patrick Esser
committed
Commit · b3874d2
1 parent: bfd8d6f

update readme

Former-commit-id: d39f5b51a8d607fd855425a0d546b9f871034c3d
README.md CHANGED

````diff
@@ -78,6 +78,9 @@ steps show the relative improvements of the checkpoints:
 
 Stable Diffusion is a latent diffusion model conditioned on the (non-pooled) text embeddings of a CLIP ViT-L/14 text encoder.
 
+
+#### Sampling Script
+
 After [obtaining the weights](#weights), link them
 ```
 mkdir -p models/ldm/stable-diffusion-v1/
@@ -88,24 +91,6 @@ and sample with
 python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
 ```
 
-Another way to download and sample Stable Diffusion is by using the [diffusers library](https://github.com/huggingface/diffusers/tree/main#new--stable-diffusion-is-now-fully-compatible-with-diffusers)
-```py
-# make sure you're logged in with `huggingface-cli login`
-from torch import autocast
-from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
-
-pipe = StableDiffusionPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-3-diffusers",
-    use_auth_token=True
-)
-
-prompt = "a photo of an astronaut riding a horse on mars"
-with autocast("cuda"):
-    image = pipe(prompt)["sample"][0]
-
-image.save("astronaut_rides_horse.png")
-```
-
 By default, this uses a guidance scale of `--scale 7.5`, [Katherine Crowson's implementation](https://github.com/CompVis/latent-diffusion/pull/51) of the [PLMS](https://arxiv.org/abs/2202.09778) sampler,
 and renders images of size 512x512 (which it was trained on) in 50 steps. All supported arguments are listed below (type `python scripts/txt2img.py --help`).
 
@@ -149,6 +134,28 @@ non-EMA to EMA weights. If you want to examine the effect of EMA vs no EMA, we p
 which contain both types of weights. For these, `use_ema=False` will load and use the non-EMA weights.
 
 
+#### Diffusers Integration
+
+Another way to download and sample Stable Diffusion is by using the [diffusers library](https://github.com/huggingface/diffusers/tree/main#new--stable-diffusion-is-now-fully-compatible-with-diffusers)
+```py
+# make sure you're logged in with `huggingface-cli login`
+from torch import autocast
+from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
+
+pipe = StableDiffusionPipeline.from_pretrained(
+    "CompVis/stable-diffusion-v1-3-diffusers",
+    use_auth_token=True
+)
+
+prompt = "a photo of an astronaut riding a horse on mars"
+with autocast("cuda"):
+    image = pipe(prompt)["sample"][0]
+
+image.save("astronaut_rides_horse.png")
+```
+
+
+
 ### Image Modification with Stable Diffusion
 
 By using a diffusion-denoising mechanism as first proposed by [SDEdit](https://arxiv.org/abs/2108.01073), the model can be used for different
````