Text-to-Image
Diffusers
English
StableDiffusionPipeline
dreambooth
stable-diffusion
stable-diffusion-diffusers
Instructions to use lambda/dreambooth-avatar with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Diffusers
How to use lambda/dreambooth-avatar with Diffusers:
pip install -U diffusers transformers accelerate
import torch from diffusers import DiffusionPipeline # switch to "mps" for apple devices pipe = DiffusionPipeline.from_pretrained("lambda/dreambooth-avatar", dtype=torch.bfloat16, device_map="cuda") prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" image = pipe(prompt).images[0] - Notebooks
- Google Colab
- Kaggle
- Local Apps
- Draw Things
- DiffusionBee
| language: | |
| - en | |
| thumbnail: "https://staticassetbucket.s3.us-west-1.amazonaws.com/avatar_grid.png" | |
| tags: | |
| - dreambooth | |
| - stable-diffusion | |
| - stable-diffusion-diffusers | |
| - text-to-image | |
| # Dreambooth style: Avatar | |
| __Dreambooth finetuning of Stable Diffusion (v1.5.1) on Avatar art style by [Lambda Labs](https://lambdalabs.com/).__ | |
| ## About | |
| This text-to-image stable diffusion model was trained with dreambooth. | |
| Put in a text prompt and generate your own Avatar style image! | |
|  | |
| ## Usage | |
| To run model locally: | |
| ```bash | |
| pip install accelerate torchvision transformers>=4.21.0 ftfy tensorboard modelcards | |
| ``` | |
| ```python | |
| import torch | |
| from diffusers import StableDiffusionPipeline | |
| from torch import autocast | |
| pipe = StableDiffusionPipeline.from_pretrained("lambdalabs/dreambooth-avatar", torch_dtype=torch.float16) | |
| pipe = pipe.to("cuda") | |
| prompt = "Yoda, avatarart style" | |
| scale = 7.5 | |
| n_samples = 4 | |
| with autocast("cuda"): | |
| images = pipe(n_samples*[prompt], guidance_scale=scale).images | |
| for idx, im in enumerate(images): | |
| im.save(f"{idx:06}.png") | |
| ``` | |
| ## Model description | |
| Base model is Stable Diffusion v1.5 and was trained using Dreambooth with 60 input images sized 512x512 displaying Avatar character images. | |
| The model is learning to associate Avatar images with the style tokenized as 'avatarart style'. | |
| Prior preservation was used during training using the class 'Person' to avoid training bleeding into the representations for that class. | |
| Training ran on 2xA6000 GPUs on [Lambda GPU Cloud](https://lambdalabs.com/service/gpu-cloud) for 700 steps, batch size 4 (a couple hours, at a cost of about $4). | |
| Author: Eole Cervenka |