bubbliiiing committed on
Commit
14e6c45
·
1 Parent(s): 4a1c425

Update Weights

.gitattributes CHANGED
@@ -1,3 +1,5 @@
+ *.png filter=lfs diff=lfs merge=lfs -text
+ *.jpg filter=lfs diff=lfs merge=lfs -text
  *.7z filter=lfs diff=lfs merge=lfs -text
  *.arrow filter=lfs diff=lfs merge=lfs -text
  *.bin filter=lfs diff=lfs merge=lfs -text
Qwen-Image-2512-Fun-Controlnet-Union.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e0c280356ddc6c4b075a57ce47ef4446a724a96c2eb97e5736a9478687b6c9af
+ size 3512432536
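
The pointer above records the SHA-256 digest and byte size of the real weights file, so a finished download can be checked against it. A minimal sketch, assuming `sha256sum` and `wc` are available; the function name `verify_lfs_object` and the local path in the usage comment are illustrative and not part of the repository:

```sh
#!/bin/sh
# Hypothetical helper: compare a downloaded file against the oid and size
# recorded in a Git LFS pointer. Returns 0 on match, 1 on mismatch.
verify_lfs_object() {
    file=$1
    expected_oid=$2
    expected_size=$3

    # Byte count of the local file (tr strips padding some wc builds add).
    actual_size=$(wc -c < "$file" | tr -d ' ')
    if [ "$actual_size" != "$expected_size" ]; then
        echo "size mismatch: got $actual_size, expected $expected_size" >&2
        return 1
    fi

    # sha256sum prints "<digest>  <name>"; keep only the digest.
    actual_oid=$(sha256sum "$file" | cut -d ' ' -f 1)
    if [ "$actual_oid" != "$expected_oid" ]; then
        echo "sha256 mismatch: got $actual_oid" >&2
        return 1
    fi
    echo "ok: $file"
}

# Usage with the values from the pointer (the path is an assumed download location):
# verify_lfs_object models/Personalized_Model/Qwen-Image-2512-Fun-Controlnet-Union.safetensors \
#     e0c280356ddc6c4b075a57ce47ef4446a724a96c2eb97e5736a9478687b6c9af 3512432536
```

Checking the size first is cheap and catches truncated downloads before the slower hash of a 3.5 GB file.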
README.md CHANGED
@@ -1,3 +1,119 @@
  ---
  license: apache-2.0
  ---
+ # Qwen-Image-2512-Fun-Controlnet-Union
+
+ [![Github](https://img.shields.io/badge/🎬%20Code-VideoX_Fun-blue)](https://github.com/aigc-apps/VideoX-Fun)
+
+ ## Model Features
+ - This ControlNet is added on 5 layer blocks and supports multiple control conditions, including Canny, HED, Depth, Pose, MLSD, and Scribble. It can be used like a standard ControlNet.
+ - Inpainting mode is also supported.
+ - When obtaining control images, acquiring them at multiple resolutions results in better generalization.
+ - Adjust `control_context_scale` for stronger control and better detail preservation; the optimal range is 0.65 to 1.00. For better stability, we highly recommend using a detailed prompt.
+
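
The recommended `control_context_scale` range can be checked before a value is used for inference. A minimal sketch, assuming `awk` is available; `check_control_context_scale` is a hypothetical helper, not something VideoX-Fun provides:

```sh
#!/bin/sh
# Hypothetical helper: succeed only when the given value lies inside the
# recommended control_context_scale range of 0.65 to 1.00 (inclusive).
check_control_context_scale() {
    # awk performs the floating-point comparison that POSIX test cannot.
    awk -v s="$1" 'BEGIN { exit !(s + 0 >= 0.65 && s + 0 <= 1.00) }'
}

# Example: report whether a chosen value falls inside the range.
scale=0.80
if check_control_context_scale "$scale"; then
    echo "control_context_scale=$scale is within the recommended range"
else
    echo "control_context_scale=$scale is outside 0.65-1.00" >&2
fi
```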
+ ## Results
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Pose + Inpaint</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/inpaint.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /><img src="asset/pose.jpg" width="100%" /></td>
+ <td><img src="results/pose_inpaint.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Pose</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/pose2.jpg" width="100%" /></td>
+ <td><img src="results/pose2.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Pose</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/pose.jpg" width="100%" /></td>
+ <td><img src="results/pose.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Scribble</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/scribble.jpg" width="100%" /></td>
+ <td><img src="results/scribble.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Canny</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/canny.jpg" width="100%" /></td>
+ <td><img src="results/canny.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>HED</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/hed.jpg" width="100%" /></td>
+ <td><img src="results/hed.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Depth</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/depth.jpg" width="100%" /></td>
+ <td><img src="results/depth.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ ## Inference
+ Go to the VideoX-Fun repository for more details.
+
+ Please clone the VideoX-Fun repository and create the required directories:
+
+ ```sh
+ # Clone the code
+ git clone https://github.com/aigc-apps/VideoX-Fun.git
+
+ # Enter VideoX-Fun's directory
+ cd VideoX-Fun
+
+ # Create model directories
+ mkdir -p models/Diffusion_Transformer
+ mkdir -p models/Personalized_Model
+ ```
+
+ Then download the weights into `models/Diffusion_Transformer` and `models/Personalized_Model`.
+
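
One way to script the download is the Hugging Face CLI. A sketch, assuming `huggingface-cli` (from the `huggingface_hub` package) is installed; the repository IDs are deliberately left as arguments because this page does not state them, and `fetch_weights` is an illustrative name:

```sh
#!/bin/sh
# Hypothetical helper: download the base model and the ControlNet file into
# the model directories. With DRY_RUN=1 it only prints the commands.
fetch_weights() {
    base_repo=$1        # Hub repo ID of the base model (assumption: supplied by the user)
    controlnet_repo=$2  # Hub repo ID hosting the ControlNet weights (assumption: supplied by the user)

    # Print the command in dry-run mode, otherwise execute it.
    run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi
    }

    run huggingface-cli download "$base_repo" \
        --local-dir models/Diffusion_Transformer/Qwen-Image-2512
    run huggingface-cli download "$controlnet_repo" \
        Qwen-Image-2512-Fun-Controlnet-Union.safetensors \
        --local-dir models/Personalized_Model
}

# Usage (replace the placeholders with the actual repo IDs):
# DRY_RUN=1 fetch_weights <base-model-repo> <controlnet-repo>
```

Running once with `DRY_RUN=1` shows the exact commands and target directories before several gigabytes are transferred.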
+ ```
+ 📦 models/
+ ├── 📂 Diffusion_Transformer/
+ │   └── 📂 Qwen-Image-2512/
+ ├── 📂 Personalized_Model/
+ │   └── 📦 Qwen-Image-2512-Fun-Controlnet-Union.safetensors
+ ```
+
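
Before running inference, it can help to confirm that the files on disk match the expected `models/` layout. A minimal sketch; `check_model_layout` is a hypothetical helper that takes the `models/` root as an optional argument:

```sh
#!/bin/sh
# Hypothetical helper: verify that the base-model directory and the
# ControlNet weights file exist under the given models root.
check_model_layout() {
    root=${1:-models}
    status=0
    if [ ! -d "$root/Diffusion_Transformer/Qwen-Image-2512" ]; then
        echo "missing directory: $root/Diffusion_Transformer/Qwen-Image-2512" >&2
        status=1
    fi
    if [ ! -f "$root/Personalized_Model/Qwen-Image-2512-Fun-Controlnet-Union.safetensors" ]; then
        echo "missing file: $root/Personalized_Model/Qwen-Image-2512-Fun-Controlnet-Union.safetensors" >&2
        status=1
    fi
    return $status
}

# Usage from the VideoX-Fun root:
# check_model_layout && echo "model layout looks complete"
```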
+ Then run the scripts `examples/qwenimage_fun/predict_t2i_control.py` and `examples/qwenimage_fun/predict_i2i_inpaint.py`.
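
The example scripts are referenced by paths relative to the repository root, so they are most easily launched from there. A sketch, assuming a `python` interpreter with VideoX-Fun's dependencies installed; `run_example` and the `PYTHON` override are illustrative, and any script settings are assumed to be configured as the repository describes:

```sh
#!/bin/sh
# Hypothetical wrapper: run one of the example scripts from the VideoX-Fun root.
# Assumption: `python` (or $PYTHON) points at an environment with the
# project's dependencies installed.
run_example() {
    script="examples/qwenimage_fun/$1"
    if [ ! -f "$script" ]; then
        echo "not found: $script (run this from the VideoX-Fun root)" >&2
        return 1
    fi
    "${PYTHON:-python}" "$script"
}

# Usage:
# run_example predict_t2i_control.py
# run_example predict_i2i_inpaint.py
```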
asset/canny.jpg ADDED

Git LFS Details

  • SHA256: 800790ae2e890e99b75dc1fc0a05142d22dbcdd9a961d2bc15222a4356683723
  • Pointer size: 131 Bytes
  • Size of remote file: 278 kB
asset/depth.jpg ADDED

Git LFS Details

  • SHA256: 6e2ba1022bb71d026c764b12e7d6c67a233cfa4c6836616f618a878764fe7a7c
  • Pointer size: 131 Bytes
  • Size of remote file: 106 kB
asset/hed.jpg ADDED

Git LFS Details

  • SHA256: c10f91fe342b439d1e99fe703e313aa09315b59cf7362c43e2e42910f7c681d7
  • Pointer size: 131 Bytes
  • Size of remote file: 188 kB
asset/inpaint.jpg ADDED

Git LFS Details

  • SHA256: e45be390cfe07c1b6bf4df94aa48223871d21ce0a42e91693636004299483aa2
  • Pointer size: 131 Bytes
  • Size of remote file: 524 kB
asset/mask.jpg ADDED

Git LFS Details

  • SHA256: c2012f7a9ed8eeefc75df2e7606eb1457c74d5a05a5f3a8d2c3ee6b287624d23
  • Pointer size: 130 Bytes
  • Size of remote file: 11.4 kB
asset/pose.jpg ADDED

Git LFS Details

  • SHA256: c3543f29a838b77933dc439f8520c5eff1bb2075315afbe6eb4b309c477a31f0
  • Pointer size: 130 Bytes
  • Size of remote file: 43.5 kB
asset/pose2.jpg ADDED

Git LFS Details

  • SHA256: 82005b3e813d714e3a4cf8dddbeddad5047978d6aca78c6a121ad1e7c0ec4b4e
  • Pointer size: 130 Bytes
  • Size of remote file: 94.6 kB
asset/scribble.jpg ADDED

Git LFS Details

  • SHA256: 772fee90b481e6e8a58cd3560109489dcae2dd68d873eaf889c9252a1c76d24b
  • Pointer size: 130 Bytes
  • Size of remote file: 36.1 kB
results/canny.png ADDED

Git LFS Details

  • SHA256: 2dc2ab6809b7d2045255a2918a6bf42602dd2e7b6a78cc34c7af867671f5dc75
  • Pointer size: 132 Bytes
  • Size of remote file: 1.7 MB
results/depth.png ADDED

Git LFS Details

  • SHA256: d776f7280c3096ba6643ad58678453424e04f8c56b794b3b799afceb57322049
  • Pointer size: 132 Bytes
  • Size of remote file: 1.07 MB
results/hed.png ADDED

Git LFS Details

  • SHA256: 4a3fca01f45ba869d61661612d048d1a314f8f1ca82035ab853ecf24a35465b2
  • Pointer size: 132 Bytes
  • Size of remote file: 1.66 MB
results/pose.png ADDED

Git LFS Details

  • SHA256: 30948a9629d770d3607a357c4dd122576a2bdcb03f6d310c68cff39a1f34c464
  • Pointer size: 132 Bytes
  • Size of remote file: 1.79 MB
results/pose2.png ADDED

Git LFS Details

  • SHA256: df935d5aee3767e3cf9ff334f44ef6d40a62dcc42417e64076edff85160dbf9e
  • Pointer size: 132 Bytes
  • Size of remote file: 1.94 MB
results/pose_inpaint.png ADDED

Git LFS Details

  • SHA256: 078e9cfa9355854ef6c67e8ab1455b8e4706ce15c86251b247e07a456781ada8
  • Pointer size: 132 Bytes
  • Size of remote file: 1.82 MB
results/scribble.png ADDED

Git LFS Details

  • SHA256: 55bec5dd5e653bc0db392dac408b4efd9cf60b5d81dae5f4f8802e6b504c276e
  • Pointer size: 132 Bytes
  • Size of remote file: 1.96 MB