Update Weights

Files changed (18)

.gitattributes +2 -0
Qwen-Image-2512-Fun-Controlnet-Union.safetensors +3 -0
README.md +116 -0
asset/canny.jpg +3 -0
asset/depth.jpg +3 -0
asset/hed.jpg +3 -0
asset/inpaint.jpg +3 -0
asset/mask.jpg +3 -0
asset/pose.jpg +3 -0
asset/pose2.jpg +3 -0
asset/scribble.jpg +3 -0
results/canny.png +3 -0
results/depth.png +3 -0
results/hed.png +3 -0
results/pose.png +3 -0
results/pose2.png +3 -0
results/pose_inpaint.png +3 -0
results/scribble.png +3 -0

.gitattributes CHANGED

@@ -1,3 +1,5 @@
+*.png filter=lfs diff=lfs merge=lfs -text
+*.jpg filter=lfs diff=lfs merge=lfs -text
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
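
The two added patterns route the new image assets through Git LFS. As a rough illustration only (this is not how Git itself evaluates attributes), the sketch below approximates slash-free `.gitattributes` patterns by matching each path's basename with Python's `fnmatch`. The helper name `is_lfs_tracked` is hypothetical, and the pattern list covers only the lines visible in this hunk.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Patterns visible in this hunk of .gitattributes; the full file may contain more.
LFS_PATTERNS = ["*.png", "*.jpg", "*.7z", "*.arrow", "*.bin"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: slash-free patterns are compared against the basename."""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in patterns)


# A few paths from this commit's file list.
for path in ["asset/canny.jpg", "results/pose_inpaint.png", "README.md"]:
    print(path, is_lfs_tracked(path))
```

Under this approximation the image assets match the new patterns, while `README.md` stays a regular Git blob.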

Qwen-Image-2512-Fun-Controlnet-Union.safetensors ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e0c280356ddc6c4b075a57ce47ef4446a724a96c2eb97e5736a9478687b6c9af
+size 3512432536
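
The three added lines are a Git LFS pointer, not the weights themselves: they record the hash algorithm, the digest, and the byte size of the real file. A minimal sketch of checking a downloaded copy against this pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this illustration; only the pointer text comes from the diff.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e0c280356ddc6c4b075a57ce47ef4446a724a96c2eb97e5736a9478687b6c9af
size 3512432536
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and digest both agree with the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing a truncated download
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so a multi-gigabyte file is never fully held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer(Path("Qwen-Image-2512-Fun-Controlnet-Union.safetensors"), pointer)` on a local copy would then confirm the download is complete and uncorrupted.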

README.md CHANGED

@@ -1,3 +1,119 @@
 ---
 license: apache-2.0
 ---

 ---
 license: apache-2.0
 ---
+# Qwen-Image-2512-Fun-Controlnet-Union
+[![Github](https://img.shields.io/badge/🎬%20Code-VideoX_Fun-blue)](https://github.com/aigc-apps/VideoX-Fun)
+## Model Features
+- This ControlNet is added to 5 layer blocks. It supports multiple control conditions, including Canny, HED, Depth, Pose, MLSD, and Scribble, and can be used like a standard ControlNet.
+- Inpainting mode is also supported.
+- Obtaining control images at multiple resolutions results in better generalization.
+- You can adjust `control_context_scale` for stronger control and better detail preservation; the optimal range is 0.65 to 1.00. For better stability, we highly recommend using a detailed prompt.
+## Results
+<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+  <tr>
+    <td>Pose + Inpaint</td>
+    <td>Output</td>
+  </tr>
+  <tr>
+    <td><img src="asset/inpaint.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /><img src="asset/pose.jpg" width="100%" /></td>
+    <td><img src="results/pose_inpaint.png" width="100%" /></td>
+  </tr>
+</table>
+<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+  <tr>
+    <td>Pose</td>
+    <td>Output</td>
+  </tr>
+  <tr>
+    <td><img src="asset/pose2.jpg" width="100%" /></td>
+    <td><img src="results/pose2.png" width="100%" /></td>
+  </tr>
+</table>
+<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+  <tr>
+    <td>Pose</td>
+    <td>Output</td>
+  </tr>
+  <tr>
+    <td><img src="asset/pose.jpg" width="100%" /></td>
+    <td><img src="results/pose.png" width="100%" /></td>
+  </tr>
+</table>
+<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+  <tr>
+    <td>Scribble</td>
+    <td>Output</td>
+  </tr>
+  <tr>
+    <td><img src="asset/scribble.jpg" width="100%" /></td>
+    <td><img src="results/scribble.png" width="100%" /></td>
+  </tr>
+</table>
+<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+  <tr>
+    <td>Canny</td>
+    <td>Output</td>
+  </tr>
+  <tr>
+    <td><img src="asset/canny.jpg" width="100%" /></td>
+    <td><img src="results/canny.png" width="100%" /></td>
+  </tr>
+</table>
+<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+  <tr>
+    <td>HED</td>
+    <td>Output</td>
+  </tr>
+  <tr>
+    <td><img src="asset/hed.jpg" width="100%" /></td>
+    <td><img src="results/hed.png" width="100%" /></td>
+  </tr>
+</table>
+<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+  <tr>
+    <td>Depth</td>
+    <td>Output</td>
+  </tr>
+  <tr>
+    <td><img src="asset/depth.jpg" width="100%" /></td>
+    <td><img src="results/depth.png" width="100%" /></td>
+  </tr>
+</table>
+## Inference
+See the [VideoX-Fun repository](https://github.com/aigc-apps/VideoX-Fun) for more details.
+First, clone the VideoX-Fun repository and create the required directories:
+```sh
+# Clone the code
+git clone https://github.com/aigc-apps/VideoX-Fun.git
+# Enter VideoX-Fun's directory
+cd VideoX-Fun
+# Create model directories
+mkdir -p models/Diffusion_Transformer
+mkdir -p models/Personalized_Model
+```
+Then download the weights into `models/Diffusion_Transformer` and `models/Personalized_Model`, so that the layout looks like this:
+```
+📦 models/
+├── 📂 Diffusion_Transformer/
+│   └── 📂 Qwen-Image-2512/
+├── 📂 Personalized_Model/
+│   └── 📦 Qwen-Image-2512-Fun-Controlnet-Union.safetensors
+```
+Then run the scripts `examples/qwenimage_fun/predict_t2i_control.py` and `examples/qwenimage_fun/predict_i2i_inpaint.py`.
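
Before launching either script, it may help to confirm that the weights landed where the README's directory tree expects them. The following is a minimal sketch, not part of VideoX-Fun: the helper `missing_model_paths` is hypothetical, the two paths are taken from the tree in the README, and the check assumes it is run from the root of the cloned repository.

```python
from pathlib import Path

# Paths taken from the directory tree in the README; the helper itself is illustrative.
EXPECTED = [
    Path("models/Diffusion_Transformer/Qwen-Image-2512"),
    Path("models/Personalized_Model/Qwen-Image-2512-Fun-Controlnet-Union.safetensors"),
]


def missing_model_paths(root: Path = Path(".")) -> list[Path]:
    """Return the expected model paths that do not exist under root."""
    return [path for path in EXPECTED if not (root / path).exists()]


missing = missing_model_paths()
if missing:
    print("Missing before inference:")
    for path in missing:
        print("  ", path)
else:
    print("Model layout looks complete.")
```

Checking existence only is a deliberate simplification; it does not validate the contents of the `Qwen-Image-2512` directory.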