---

# Qwen-Image-ControlNet-Union

This repository provides a unified ControlNet that supports 4 common control types (canny, soft edge, depth, pose) for [Qwen-Image](https://github.com/QwenLM/Qwen-Image).

# Model Cards
```python
import torch
from diffusers.utils import load_image

# https://github.com/huggingface/diffusers/pull/12215
# pip install git+https://github.com/huggingface/diffusers
from diffusers import QwenImageControlNetPipeline, QwenImageControlNetModel

base_model = "Qwen/Qwen-Image"
controlnet_model = "InstantX/Qwen-Image-ControlNet-Union"

controlnet = QwenImageControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)

pipe = QwenImageControlNetPipeline.from_pretrained(
    base_model, controlnet=controlnet, torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# canny
control_image = load_image("conds/canny.png")
prompt = "Aesthetics art, traditional asian pagoda, elaborate golden accents, sky blue and white color palette, swirling cloud pattern, digital illustration, east asian architecture, ornamental rooftop, intricate detailing on building, cultural representation."
controlnet_conditioning_scale = 1.0

# soft edge
# control_image = load_image("conds/soft_edge.png")
# prompt = "Photograph of a young man with light brown hair jumping mid-air off a large, reddish-brown rock. He's wearing a navy blue sweater, light blue shirt, gray pants, and brown shoes. His arms are outstretched, and he has a slight smile on his face. The background features a cloudy sky and a distant, leafless tree line. The grass around the rock is patchy."
# controlnet_conditioning_scale = 1.0

# depth
# control_image = load_image("conds/depth.png")
# prompt = "A swanky, minimalist living room with a huge floor-to-ceiling window letting in loads of natural light. A beige couch with white cushions sits on a wooden floor, with a matching coffee table in front. The walls are a soft, warm beige, decorated with two framed botanical prints. A potted plant chills in the corner near the window. Sunlight pours through the leaves outside, casting cool shadows on the floor."
# controlnet_conditioning_scale = 1.0

# pose
# control_image = load_image("conds/pose.png")

image = pipe(
    prompt=prompt,
    control_image=control_image,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    # ...
).images[0]
image.save("qwenimage_cn_union_result.png")
```
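Rather than commenting and uncommenting the condition blocks above, the four presets can be collected in one table. The helper below is a hypothetical convenience, not part of this repository; the image paths and scales mirror the snippet above.

```python
# Hypothetical helper (not part of this repo): the four condition presets from
# the snippet above, keyed by name, so switching conditions is a one-word change.
CONDITIONS = {
    "canny":     {"image": "conds/canny.png",     "scale": 1.0},
    "soft_edge": {"image": "conds/soft_edge.png", "scale": 1.0},
    "depth":     {"image": "conds/depth.png",     "scale": 1.0},
    "pose":      {"image": "conds/pose.png",      "scale": 1.0},
}

def condition_args(name):
    """Return the control-image path and conditioning scale for a preset."""
    preset = CONDITIONS[name]
    return preset["image"], preset["scale"]

path, scale = condition_args("depth")
print(path, scale)  # conds/depth.png 1.0
```

With this, `control_image = load_image(path)` and `controlnet_conditioning_scale = scale` replace the per-condition blocks.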
# Inference Setting

You can adjust control strength via `controlnet_conditioning_scale`.

- Canny: use `cv2.Canny`, set `controlnet_conditioning_scale` in [0.8, 1.0]
- Soft Edge: use [AnylineDetector](https://github.com/huggingface/controlnet_aux), set `controlnet_conditioning_scale` in [0.8, 1.0]
We strongly recommend using detailed prompts, especially when they include text elements. For example, use "a poster with text 'InstantX Team' on the top" instead of "a poster".

For multiple-condition inference, please refer to [this PR](https://github.com/huggingface/diffusers/pull/12215).
# ComfyUI Support

[ComfyUI](https://www.comfy.org/) offers native support for Qwen-Image-ControlNet-Union. [Visit](https://github.com/comfyanonymous/ComfyUI/pull/9488) for more details.

# Community Support

[Liblib AI](https://www.liblib.art/) offers native support for Qwen-Image-ControlNet-Union. [Visit](https://www.liblib.art/modelinfo/4d3f51c2bf1e4c51ae8dedd8c19da827?from=personal_page&versionUuid=5b5f21d2b80445598db19e924bd3a409) for more details.

# Limitations

We find that the model is unable to preserve some fine details, such as small-font text, unless the text is given explicitly in the prompt.

# Acknowledgements

This model is developed by the InstantX Team. All rights reserved.