base_model: CompVis/stable-diffusion-v1-4
---

# Model Card for Renaissance Stable Diffusion

<!-- Provide a quick summary of what the model is/does. -->
This model uses the KerasCV implementation of stability.ai's text-to-image model. Unlike other open-source alternatives such as Hugging Face's Diffusers, KerasCV offers XLA compilation and mixed-precision support, resulting in state-of-the-art generation speed.

## Model Details

If you'd like to see more about our process, results, or additional details on this project, please see the Wiki section of this repository, also available [here](https://github.com/martingasparyan/Fine-Tune-Stable-Diffusion/wiki).

### Model Description

<!-- Provide a longer summary of what this model is. -->
This model can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses a fixed, pretrained text encoder (CLIP ViT-L/14) to generate high-quality Renaissance portraits from textual prompts.
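As a rough intuition for the reverse-diffusion loop at the heart of such a model, here is a toy NumPy sketch (not the KerasCV implementation — the real model denoises in a learned latent space, conditioned on the text embedding, with a learned noise predictor):

```python
import numpy as np

rng = np.random.default_rng(123)

def toy_denoise(latent, num_steps=25):
    """Toy stand-in for reverse diffusion: start from pure noise and
    repeatedly move toward a target, with larger corrections near the end."""
    target = np.zeros_like(latent)         # stands in for the model's predicted clean latent
    for step in range(num_steps):
        weight = 1.0 / (num_steps - step)  # schedule: weight grows from 1/25 to 1
        latent = latent + weight * (target - latent)
    return latent

noise = rng.standard_normal((4, 4))
out = toy_denoise(noise)
print(np.allclose(out, 0.0))  # → True: the loop converges to the target
```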

## Generate Your Own Examples

### 1. Install Dependencies

```python
!pip install keras-cv==0.6.0 -q
!pip install -U tensorflow -q
!pip install keras-core -q
```

### 2. Imports

```python
from textwrap import wrap
import os

import keras_cv
import matplotlib.pyplot as plt  # used by plot_images below
from keras_cv.models.stable_diffusion.image_encoder import ImageEncoder
from keras_cv.models.stable_diffusion.noise_scheduler import NoiseScheduler
from keras_cv.models.stable_diffusion.text_encoder import TextEncoder
from tensorflow import keras
```

### 3. Create a base Stable Diffusion model

```python
my_base_model = keras_cv.models.StableDiffusion(img_width=512, img_height=512)
```

### 4. Load the fine-tuned weights from the `.h5` file hosted on Hugging Face

```python
my_base_model.diffusion_model.load_weights('/path/to/file/renaissance_model.h5')
```
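A `load_weights` call on a missing file fails with a fairly opaque error, so it can help to check the path first. A small sketch — `weights_path` is the placeholder from the step above, which you should point at your local copy of `renaissance_model.h5`:

```python
import os

weights_path = '/path/to/file/renaissance_model.h5'  # placeholder: update to your local copy

def check_weights(path):
    """Return True if the weights file is present, else print a hint."""
    if os.path.exists(path):
        return True
    print(f"Weights not found at {path!r}; download renaissance_model.h5 "
          "from this repo's Files tab and update the path.")
    return False

ready = check_weights(weights_path)
```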

### 5. Generate an image, setting the prompt, batch size, number of steps, and seed

```python
img = my_base_model.text_to_image(
    prompt="A woman with an enigmatic smile against a dark background",
    batch_size=1,  # how many images to generate at once
    num_steps=25,  # number of diffusion iterations (controls image quality)
    seed=123,      # fix this to always get the same image from the same prompt
)
```

### 6. Display the result

```python
def plot_images(images):
    # text_to_image returns a batch of shape (batch_size, height, width, 3),
    # so plot each image in its own subplot rather than imshow-ing the batch.
    plt.figure(figsize=(5, 5))
    for i, image in enumerate(images):
        plt.subplot(1, len(images), i + 1)
        plt.imshow(image)
        plt.axis("off")

plot_images(img)
```