tezuesh committed on
Commit 82e74c0 · 1 Parent(s): 50c976f

Add model files

Files changed (4)
  1. LICENSE +14 -0
  2. README.md +59 -0
  3. v1-1-to-v1-5.png +0 -0
  4. v1-variants-scores.jpg +0 -0
LICENSE ADDED
@@ -0,0 +1,14 @@
+ All rights reserved by the authors.
+ You must not distribute the weights provided to you directly or indirectly without explicit consent of the authors.
+ You must not distribute harmful, offensive, dehumanizing content or otherwise harmful representations of people or their environments, cultures, religions, etc. produced with the model weights
+ or other generated content described in the "Misuse and Malicious Use" section in the model card.
+ The model weights are provided for research purposes only.
+
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md ADDED
@@ -0,0 +1,59 @@
+ ---
+ license: creativeml-openrail-m
+ tags:
+ - stable-diffusion
+ - text-to-image
+ inference: false
+ ---
+ # Stable Diffusion
+
+ Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input.
+ This model card gives an overview of all available model checkpoints. For more detailed model cards, please have a look at the model repositories listed under [Model Access](#model-access).
+
+ ## Stable Diffusion Version 1
+
+ For the first version, 4 model checkpoints are released.
+ *Higher* versions have been trained for longer and are thus usually better in terms of image generation quality than *lower* versions. More specifically:
+
+ - **stable-diffusion-v1-1**: The checkpoint is randomly initialized and has been trained for 237,000 steps at resolution `256x256` on [laion2B-en](https://huggingface.co/datasets/laion/laion2B-en).
+ 194,000 steps at resolution `512x512` on [laion-high-resolution](https://huggingface.co/datasets/laion/laion-high-resolution) (170M examples from LAION-5B with resolution `>= 1024x1024`).
+ - **stable-diffusion-v1-2**: The checkpoint resumed training from `stable-diffusion-v1-1`.
+ 515,000 steps at resolution `512x512` on "laion-improved-aesthetics" (a subset of laion2B-en,
+ filtered to images with an original size `>= 512x512`, estimated aesthetics score `> 5.0`, and an estimated watermark probability `< 0.5`. The watermark estimate is from the LAION-5B metadata, the aesthetics score is estimated using an [improved aesthetics estimator](https://github.com/christophschuhmann/improved-aesthetic-predictor)).
+ - **stable-diffusion-v1-3**: The checkpoint resumed training from `stable-diffusion-v1-2`. 195,000 steps at resolution `512x512` on "laion-improved-aesthetics" and 10 % dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
+ - **stable-diffusion-v1-4**: The checkpoint resumed training from `stable-diffusion-v1-2`. 195,000 steps at resolution `512x512` on "laion-improved-aesthetics" and 10 % dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
+ - [**`stable-diffusion-v1-4`**](https://huggingface.co/CompVis/stable-diffusion-v1-4) Resumed from `stable-diffusion-v1-2`. 225,000 steps at resolution `512x512` on "laion-aesthetics v2 5+" and 10 % dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).
+
+ ### Model Access
+
+ Each checkpoint can be used either with Hugging Face's [🧨 Diffusers library](https://github.com/huggingface/diffusers) or with the original [Stable Diffusion GitHub repository](https://github.com/CompVis/stable-diffusion). Note that you have to *"click-request"* them on each respective model repository.
+
+ | **[🤗's 🧨 Diffusers library](https://github.com/huggingface/diffusers)** | **[Stable Diffusion GitHub repository](https://github.com/CompVis/stable-diffusion)** |
+ | ----------- | ----------- |
+ | [`stable-diffusion-v1-1`](https://huggingface.co/CompVis/stable-diffusion-v1-1) | [`stable-diffusion-v-1-1-original`](https://huggingface.co/CompVis/stable-diffusion-v-1-1-original) |
+ | [`stable-diffusion-v1-2`](https://huggingface.co/CompVis/stable-diffusion-v1-2) | [`stable-diffusion-v-1-2-original`](https://huggingface.co/CompVis/stable-diffusion-v-1-2-original) |
+ | [`stable-diffusion-v1-3`](https://huggingface.co/CompVis/stable-diffusion-v1-3) | [`stable-diffusion-v-1-3-original`](https://huggingface.co/CompVis/stable-diffusion-v-1-3-original) |
+ | [`stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) | [`stable-diffusion-v-1-4-original`](https://huggingface.co/CompVis/stable-diffusion-v-1-4-original) |
+
+ ### Demo
+
+ To try out the model quickly, use the [Stable Diffusion Space](https://huggingface.co/spaces/stabilityai/stable-diffusion).
+
+ ### License
+
+ [The CreativeML OpenRAIL M license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) is an [Open RAIL M license](https://www.licenses.ai/blog/2022/8/18/naming-convention-of-responsible-ai-licenses), adapted from the work that [BigScience](https://bigscience.huggingface.co/) and [the RAIL Initiative](https://www.licenses.ai/) are jointly carrying out in the area of responsible AI licensing. See also [the article about the BLOOM Open RAIL license](https://bigscience.huggingface.co/blog/the-bigscience-rail-license) on which our license is based.
+
+ ## Citation
+
+ ```bibtex
+ @InProceedings{Rombach_2022_CVPR,
+ author = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
+ title = {High-Resolution Image Synthesis With Latent Diffusion Models},
+ booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+ month = {June},
+ year = {2022},
+ pages = {10684-10695}
+ }
+ ```
+
+ *This model card was written by: Robin Rombach and Patrick Esser and is based on the [DALL-E Mini model card](https://huggingface.co/dalle-mini/dalle-mini).*
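
The checkpoint lineage and the repository naming described in the README hunk above can be sketched as a small registry. This is only an illustrative sketch, not part of the commit: `Checkpoint`, `CHECKPOINTS`, `cumulative_steps`, `repo_id`, and `load_pipeline` are hypothetical names. The step counts are transcribed from the README bullets; because the README lists `stable-diffusion-v1-4` twice, the sketch assumes the linked entry (225,000 steps). `StableDiffusionPipeline.from_pretrained` is the Diffusers loader; it is imported lazily, and calling it presumably requires having requested access to the repository and being authenticated.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Checkpoint:
    name: str
    parent: Optional[str]      # checkpoint this one resumed from, if any
    stages: Tuple[int, ...]    # training steps per stage, as listed in the README


# Values transcribed from the README bullets.
# Assumption: v1-4 uses the linked entry (225,000 steps), not the duplicate one.
CHECKPOINTS = {
    c.name: c
    for c in (
        Checkpoint("stable-diffusion-v1-1", None, (237_000, 194_000)),
        Checkpoint("stable-diffusion-v1-2", "stable-diffusion-v1-1", (515_000,)),
        Checkpoint("stable-diffusion-v1-3", "stable-diffusion-v1-2", (195_000,)),
        Checkpoint("stable-diffusion-v1-4", "stable-diffusion-v1-2", (225_000,)),
    )
}


def cumulative_steps(name: str) -> int:
    """Sum the steps of a checkpoint and of every checkpoint it resumed from."""
    total = 0
    current: Optional[str] = name
    while current is not None:
        ckpt = CHECKPOINTS[current]
        total += sum(ckpt.stages)
        current = ckpt.parent
    return total


def repo_id(name: str, original: bool = False) -> str:
    """Build the Hub repository id following the Model Access table."""
    if name not in CHECKPOINTS:
        raise KeyError(name)
    if original:
        # "stable-diffusion-v1-4" -> "stable-diffusion-v-1-4-original"
        name = name.replace("-v1-", "-v-1-") + "-original"
    return f"CompVis/{name}"


def load_pipeline(name: str):
    """Load the Diffusers variant of a checkpoint (downloads weights)."""
    # Imported lazily so the registry works without diffusers installed.
    from diffusers import StableDiffusionPipeline

    return StableDiffusionPipeline.from_pretrained(repo_id(name))
```

Under these assumptions, `cumulative_steps("stable-diffusion-v1-3")` walks v1-3, v1-2, and v1-1, which shows that v1-3 and v1-4 are siblings branching from v1-2 rather than successive stages.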
v1-1-to-v1-5.png ADDED
v1-variants-scores.jpg ADDED
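
The README mentions "10 % dropping of the text-conditioning to improve classifier-free guidance sampling" without showing what that involves. The sketch below is a simplified illustration under stated assumptions, not the training or sampling code of this repository: `maybe_drop_caption` and `guided_noise` are hypothetical helpers, the empty string standing in for the "no conditioning" input is an assumption, and plain Python lists stand in for noise-prediction tensors. The combination rule shown is the commonly used form `eps_uncond + s * (eps_cond - eps_uncond)`.

```python
import random
from typing import List, Sequence


def maybe_drop_caption(caption: str, rng: random.Random, drop_prob: float = 0.1) -> str:
    """Training side: replace the caption with an empty prompt with probability drop_prob.

    Seeing unconditioned examples lets one model learn both the conditional
    and the unconditional noise prediction.
    """
    return "" if rng.random() < drop_prob else caption


def guided_noise(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Sampling side: extrapolate from the unconditional towards the conditional prediction.

    guidance_scale = 0 gives the unconditional prediction, 1 the conditional
    one, and values above 1 push further in the conditional direction.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


if __name__ == "__main__":
    rng = random.Random(0)
    captions = [maybe_drop_caption("a photo of a cat", rng) for _ in range(10_000)]
    print("fraction dropped:", captions.count("") / len(captions))
    print(guided_noise([0.0, 1.0], [1.0, 3.0], guidance_scale=2.0))
```

The two halves belong together: guidance at sampling time needs an unconditional prediction, and the conditioning dropout during training is what makes the same network able to produce one.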