+---
+base_model: runwayml/stable-diffusion-v1-5
+library_name: diffusers
+license: creativeml-openrail-m
+tags:
+- stable-diffusion
+- stable-diffusion-diffusers
+- text-to-image
+- diffusers
+- controlnet
+- control-lora-v3
+- diffusers-training
+inference: true
+---
+# control-lora-v3
+This is a collection of control-lora-v3 weights trained on runwayml/stable-diffusion-v1-5 and stabilityai/stable-diffusion-xl-base-1.0 with different types of conditioning.
+You can find some example images below.
+## Stable Diffusion
+### Canny
+<div style="display: flex; flex-wrap: wrap;">
+  <img src="./imgs/canny1.png" style="height:256px;" />
+  <img src="./imgs/canny2.png" style="height:256px;" />
+  <img src="./imgs/canny3.png" style="height:256px;" />
+  <img src="./imgs/canny4.png" style="height:256px;" />
+  <img src="./imgs/canny_vermeer.png" style="height:256px;" />
+</div>
+### OpenPose + Segmentation
+This combination is experimental and does not yet work well.
+<div style="display: flex; flex-wrap: wrap;">
+  <img src="./imgs/pose5_segmentation5.png" style="height:256px;" />
+  <img src="./imgs/pose6_segmentation6.png" style="height:256px;" />
+  <img src="./imgs/pose7_segmentation7.png" style="height:256px;" />
+  <img src="./imgs/pose8_segmentation8.png" style="height:256px;" />
+</div>
+### Depth
+<div style="display: flex; flex-wrap: wrap;">
+  <img src="./imgs/depth1.png" style="height:256px;" />
+  <img src="./imgs/depth2.png" style="height:256px;" />
+  <img src="./imgs/depth3.png" style="height:256px;" />
+  <img src="./imgs/depth4.png" style="height:256px;" />
+</div>
+### Normal map
+<div style="display: flex; flex-wrap: wrap;">
+  <img src="./imgs/normal1.png" style="height:256px;" />
+  <img src="./imgs/normal2.png" style="height:256px;" />
+  <img src="./imgs/normal3.png" style="height:256px;" />
+  <img src="./imgs/normal4.png" style="height:256px;" />
+</div>
+### OpenPose
+<div style="display: flex; flex-wrap: wrap;">
+  <img src="./imgs/pose1.png" style="height:256px;" />
+  <img src="./imgs/pose2.png" style="height:256px;" />
+  <img src="./imgs/pose3.png" style="height:256px;" />
+  <img src="./imgs/pose4.png" style="height:256px;" />
+  <img src="./imgs/pose5.png" style="height:256px;" />
+  <img src="./imgs/pose6.png" style="height:256px;" />
+  <img src="./imgs/pose7.png" style="height:256px;" />
+  <img src="./imgs/pose8.png" style="height:256px;" />
+</div>
+### Segmentation
+<div style="display: flex; flex-wrap: wrap;">
+  <img src="./imgs/segmentation1.png" style="height:256px;" />
+  <img src="./imgs/segmentation2.png" style="height:256px;" />
+  <img src="./imgs/segmentation3.png" style="height:256px;" />
+  <img src="./imgs/segmentation4.png" style="height:256px;" />
+  <img src="./imgs/segmentation5.png" style="height:256px;" />
+  <img src="./imgs/segmentation6.png" style="height:256px;" />
+  <img src="./imgs/segmentation7.png" style="height:256px;" />
+  <img src="./imgs/segmentation8.png" style="height:256px;" />
+</div>
+### Tile
+<div style="display: flex; flex-wrap: wrap;">
+  <img src="./imgs/tile1.png" style="height:256px;" />
+  <img src="./imgs/tile2.png" style="height:256px;" />
+  <img src="./imgs/tile3.png" style="height:256px;" />
+  <img src="./imgs/tile4.png" style="height:256px;" />
+</div>
+## Stable Diffusion XL
+### Canny
+<div style="display: flex; flex-wrap: wrap;">
+  <img src="./imgs/sdxl_canny1.png" style="height:256px;" />
+  <img src="./imgs/sdxl_canny2.png" style="height:256px;" />
+  <img src="./imgs/sdxl_canny3.png" style="height:256px;" />
+  <img src="./imgs/sdxl_canny4.png" style="height:256px;" />
+  <img src="./imgs/sdxl_canny_vermeer.png" style="height:256px;" />
+</div>
+## Intended uses & limitations
+#### How to use
+First, clone the [control-lora-v3](https://github.com/HighCWu/control-lora-v3) repository and `cd` into its directory:
+```sh
+git clone https://github.com/HighCWu/control-lora-v3
+cd control-lora-v3
+```
+Then run the Python code.
+For Stable Diffusion, use:
+```py
+# !pip install diffusers opencv-python transformers accelerate
+from diffusers import UniPCMultistepScheduler
+from diffusers.utils import load_image
+from model import UNet2DConditionModelEx
+from pipeline import StableDiffusionControlLoraV3Pipeline
+import numpy as np
+import torch
+import cv2
+from PIL import Image
+# download an image
+image = load_image(
+    "https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
+)
+image = np.array(image)
+# get canny image
+image = cv2.Canny(image, 100, 200)
+image = image[:, :, None]
+image = np.concatenate([image, image, image], axis=2)
+canny_image = Image.fromarray(image)
+# load stable diffusion v1-5 and control-lora-v3
+unet: UNet2DConditionModelEx = UNet2DConditionModelEx.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
+)
+unet = unet.add_extra_conditions(["canny"])
+pipe = StableDiffusionControlLoraV3Pipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", unet=unet, torch_dtype=torch.float16
+)
+# load the control-lora-v3 canny LoRA weights
+pipe.load_lora_weights("HighCWu/sd-control-lora-v3-canny")
+# speed up diffusion process with faster scheduler and memory optimization
+pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
+# remove following line if xformers is not installed
+pipe.enable_xformers_memory_efficient_attention()
+pipe.enable_model_cpu_offload()
+# generate image
+generator = torch.manual_seed(0)
+image = pipe(
+    "futuristic-looking woman", num_inference_steps=20, generator=generator, image=canny_image
+).images[0]
+image.show()
+```
+For Stable Diffusion XL, use:
+```py
+# !pip install diffusers opencv-python transformers accelerate
+from diffusers import AutoencoderKL
+from diffusers.utils import load_image
+from model import UNet2DConditionModelEx
+from pipeline_sdxl import StableDiffusionXLControlLoraV3Pipeline
+import numpy as np
+import torch
+import cv2
+from PIL import Image
+prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
+negative_prompt = "low quality, bad quality, sketches"
+# download an image
+image = load_image(
+    "https://hf.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png"
+)
+# initialize the models and pipeline
+unet: UNet2DConditionModelEx = UNet2DConditionModelEx.from_pretrained(
+    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet", torch_dtype=torch.float16
+)
+unet = unet.add_extra_conditions(["canny"])
+vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
+pipe = StableDiffusionXLControlLoraV3Pipeline.from_pretrained(
+    "stabilityai/stable-diffusion-xl-base-1.0", unet=unet, vae=vae, torch_dtype=torch.float16
+)
+# load the control-lora-v3 canny LoRA weights
+pipe.load_lora_weights("HighCWu/sdxl-control-lora-v3-canny")
+pipe.enable_model_cpu_offload()
+# get canny image
+image = np.array(image)
+image = cv2.Canny(image, 100, 200)
+image = image[:, :, None]
+image = np.concatenate([image, image, image], axis=2)
+canny_image = Image.fromarray(image)
+# generate image
+image = pipe(
+    prompt, image=canny_image
+).images[0]
+image.show()
+```
+#### Limitations and bias
+[TODO: provide examples of latent issues and potential remediations]
+## Training details
+[TODO: describe the data used to train the model]