---
license: creativeml-openrail-m
library_name: diffusers
tags:
- text-to-image
- dreambooth
- diffusers-training
- stable-diffusion
- stable-diffusion-diffusers
base_model: runwayml/stable-diffusion-v1-5
inference: true
instance_prompt: disney style
---

# Cartoonify

This is a DreamBooth model derived from `runwayml/stable-diffusion-v1-5` with additional fine-tuning of the text encoder. It was trained with [DreamBooth](https://dreambooth.github.io/) on images from a popular animation studio. Include the tokens **_disney style_** in your prompts to apply the effect.

You can find some example images below:

<p float="left">
  <img width=256 height=256 src="./images/king.png">
  <img width=256 height=256 src="./images/legend_of_zelda.png">
  <img width=256 height=256 src="./images/pony.png">
  <img width=256 height=256 src="./images/princess.png">
  <img width=256 height=256 src="./images/red_ferrari.png">
</p>

## Intended uses & limitations

#### How to use

```python
import torch
from diffusers import StableDiffusionPipeline

# basic usage
repo_id = "lavaman131/cartoonify"
# Prefer CUDA, then Apple MPS, and fall back to the CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
# Half precision on GPU backends; float32 on the CPU.
torch_dtype = torch.float16 if device.type in ["mps", "cuda"] else torch.float32
pipeline = StableDiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch_dtype).to(device)
# Include the "disney style" tokens in the prompt to apply the style.
image = pipeline("PROMPT GOES HERE").images[0]
image.save("output.png")
```

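As a minimal sketch of prompting with the instance tokens, the helper below appends `disney style` to a prompt when the tokens are missing and generates one image with a fixed seed so the result is repeatable. The names `with_style_token` and `generate`, the default seed, and the choice to place the tokens at the end of the prompt are illustrative assumptions, not part of the released code.

```python
STYLE_TOKEN = "disney style"  # instance prompt from the model card metadata


def with_style_token(prompt: str, token: str = STYLE_TOKEN) -> str:
    """Return the prompt with the style tokens appended if they are missing."""
    prompt = prompt.strip()
    if token.lower() in prompt.lower():
        return prompt
    return f"{prompt}, {token}" if prompt else token


def generate(prompt: str, seed: int = 0, out_path: str = "output.png") -> None:
    """Hypothetical helper: generate one image (downloads the model on first use)."""
    # Imported lazily so the prompt helper works without torch/diffusers installed.
    import torch
    from diffusers import StableDiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(
        "lavaman131/cartoonify", torch_dtype=dtype
    ).to(device)
    # A seeded generator makes the sampled image repeatable on a given setup.
    generator = torch.Generator(device=device).manual_seed(seed)
    image = pipe(with_style_token(prompt), generator=generator).images[0]
    image.save(out_path)


if __name__ == "__main__":
    # Only prints the final prompt; call generate(...) to actually run the model.
    print(with_style_token("a portrait of a young princess in a forest"))
```

Calling `generate("a red ferrari")` would then run the pipeline with the tokens added automatically.
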
#### Full source code

The full source code for training, along with a local Gradio demo for image-to-Disney-character style transfer, can be found [here](https://github.com/lavaman131/cartoonify).

#### Limitations and bias

As with any diffusion model, expect to experiment with the prompt and the classifier-free guidance scale until you get the results you want. Zoomed-out subjects tend to lose clarity in the face. For additional safety in image generation, we use the Stable Diffusion safety checker.

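One way to explore the guidance setting is to hold the seed fixed and render the same prompt at several `guidance_scale` values, so that differences come from the guidance strength alone. The sketch below assumes the `lavaman131/cartoonify` checkpoint from the usage section; the function names, the file naming scheme, and any scale values you pass in are illustrative choices, not recommendations from the authors.

```python
from typing import List


def output_name(scale: float, prefix: str = "cfg") -> str:
    """Build a file name for one guidance scale, e.g. 7.5 -> 'cfg_7.5.png'."""
    return f"{prefix}_{scale:g}.png"


def guidance_sweep(prompt: str, scales: List[float], seed: int = 0) -> List[str]:
    """Render `prompt` once per guidance scale with the same seed; return saved paths."""
    # Imported lazily so the naming helper works without torch/diffusers installed.
    import torch
    from diffusers import StableDiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(
        "lavaman131/cartoonify", torch_dtype=dtype
    ).to(device)

    paths = []
    for scale in scales:
        # Re-seed before every run so only the guidance scale changes.
        generator = torch.Generator(device=device).manual_seed(seed)
        image = pipe(prompt, guidance_scale=scale, generator=generator).images[0]
        path = output_name(scale)
        image.save(path)
        paths.append(path)
    return paths
```

For example, `guidance_sweep("a red ferrari, disney style", [5.0, 7.5, 10.0])` would write three images for side-by-side comparison.
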
## Training details

The model was fine-tuned for 3500 steps on around 200 images of modern Disney characters, backgrounds, and animals, in proportions of 70%, 20%, and 10% respectively. Training ran on an RTX A5000 GPU (24 GB VRAM).

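The stated proportions and step count imply some rough figures, worked through below. Because the image count is only approximate ("around 200"), the per-category counts are estimates, and the steps-per-image figure additionally assumes one training image per step, which this card does not state.

```python
TOTAL_IMAGES = 200  # "around 200" in the card, so treat as approximate
TRAIN_STEPS = 3500
PROPORTIONS = {"characters": 0.70, "backgrounds": 0.20, "animals": 0.10}

# Approximate number of images per category.
counts = {name: round(TOTAL_IMAGES * share) for name, share in PROPORTIONS.items()}

# Average number of times each image is seen, ASSUMING one image per step.
steps_per_image = TRAIN_STEPS / TOTAL_IMAGES

print(counts)
print(steps_per_image)
```
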
The training code used can be found [here](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth.py). The regularization images used for training can be found [here](https://github.com/aitrepreneur/SD-Regularization-Images-Style-Dreambooth/tree/main/style_ddim).

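For orientation, a launch command for the linked `train_dreambooth.py` script might be assembled as in the sketch below. Only the base model, the instance prompt, the 3500-step budget, and the text-encoder fine-tuning come from this card. The directory paths and the class prompt are hypothetical placeholders, the use of prior preservation is inferred from the regularization images, and other hyperparameters (learning rate, batch size, resolution) are not stated here and are left at the script's defaults.

```python
import shlex
from typing import List


def build_train_command(
    instance_dir: str = "data/instance",   # hypothetical path to the training images
    class_dir: str = "data/style_ddim",    # hypothetical path to the regularization images
    output_dir: str = "cartoonify-model",  # hypothetical output directory
    class_prompt: str = "style",           # placeholder; the real class prompt is not stated
) -> List[str]:
    """Assemble an `accelerate launch` command for diffusers' train_dreambooth.py."""
    return [
        "accelerate", "launch", "train_dreambooth.py",
        "--pretrained_model_name_or_path", "runwayml/stable-diffusion-v1-5",
        "--instance_data_dir", instance_dir,
        "--instance_prompt", "disney style",
        # Prior preservation is an assumption inferred from the regularization images.
        "--with_prior_preservation",
        "--class_data_dir", class_dir,
        "--class_prompt", class_prompt,
        "--train_text_encoder",
        "--max_train_steps", "3500",
        "--output_dir", output_dir,
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line rather than launching training.
    print(shlex.join(build_train_command()))
```
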