---
license: apache-2.0
library_name: videox_fun
---

# Z-Image-Fun-Controlnet-Union-2.1

[![Github](https://img.shields.io/badge/🎬%20Code-VideoX_Fun-blue)](https://github.com/aigc-apps/VideoX-Fun)

## Model Card

| Name | Description |
|--|--|
| Z-Image-Fun-Controlnet-Union-2.1.safetensors | ControlNet weights for Z-Image. The model supports multiple control conditions such as Canny, Depth, Pose, MLSD, Scribble, HED, and Gray. This ControlNet is added on 15 layer blocks and 2 refiner layer blocks. |
| Z-Image-Fun-Controlnet-Union-2.1-lite.safetensors | Trained with the same scheme as the 2601 version, but with control added to fewer layers than the full model, so its control conditions are weaker. This makes it suitable for larger `control_context_scale` values, produces more natural-looking results, and runs on lower-spec machines. |

## Model Features
- This ControlNet is added on 15 layer blocks and 2 refiner layer blocks (the Lite model is added on 3 layer blocks and 2 refiner blocks). It supports multiple control conditions, including Canny, Depth, Pose, MLSD, Scribble, HED, and Gray, and can be used like a standard ControlNet.
- Inpainting mode is also supported.
- You can adjust `control_context_scale` for stronger control and better detail preservation. For better stability, we highly recommend using a detailed prompt. The optimal range for `control_context_scale` is 0.65 to 0.90.

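The recommended `control_context_scale` range can be enforced with a trivial guard before invoking the pipeline. The helper below is our own sketch (not part of VideoX-Fun), assuming only the 0.65–0.90 range stated above:

```python
# Recommended control_context_scale range from this model card.
RECOMMENDED_RANGE = (0.65, 0.90)

def clamp_control_context_scale(scale: float) -> float:
    """Clamp a requested control_context_scale into the recommended range."""
    lo, hi = RECOMMENDED_RANGE
    return max(lo, min(hi, scale))
```

For example, a requested value of 1.0 is clamped down to 0.90, and 0.5 is raised to 0.65; values already inside the range pass through unchanged.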
## Results

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Pose + Inpaint</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/inpaint.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /></td>
    <td><img src="results/inpaint.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Pose + Inpaint</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/inpaint.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /><img src="asset/pose.jpg" width="100%" /></td>
    <td><img src="results/pose_inpaint.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Pose</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/pose2.jpg" width="100%" /></td>
    <td><img src="results/pose2.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Pose</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/pose.jpg" width="100%" /></td>
    <td><img src="results/pose.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Pose</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/pose3.jpg" width="100%" /></td>
    <td><img src="results/pose3.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Canny</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/canny.jpg" width="100%" /></td>
    <td><img src="results/canny.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>HED</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/hed.jpg" width="100%" /></td>
    <td><img src="results/hed.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Depth</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/depth.jpg" width="100%" /></td>
    <td><img src="results/depth.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Gray</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/gray.jpg" width="100%" /></td>
    <td><img src="results/gray.png" width="100%" /></td>
  </tr>
</table>

## Inference
See the [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun) repository for more details.

Clone the VideoX-Fun repository and create the required directories:

```sh
# Clone the code
git clone https://github.com/aigc-apps/VideoX-Fun.git

# Enter VideoX-Fun's directory
cd VideoX-Fun

# Create the model directories
mkdir -p models/Diffusion_Transformer
mkdir -p models/Personalized_Model
```

Then download the weights into `models/Diffusion_Transformer` and `models/Personalized_Model`:

```
📦 models/
├── 📂 Diffusion_Transformer/
│   └── 📂 Z-Image/
├── 📂 Personalized_Model/
│   ├── 📦 Z-Image-Fun-Controlnet-Union-2.1.safetensors
│   └── 📦 Z-Image-Fun-Controlnet-Union-2.1-lite.safetensors
```
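Before running inference it is easy to forget one of the weight files, so a small check of the layout above can save a failed run. The snippet below is our own sketch (not part of VideoX-Fun) that only mirrors the directory tree above; it performs no downloads:

```python
from pathlib import Path

# Weight files expected under models/, taken from the tree above.
# Diffusion_Transformer/Z-Image holds the base weights; we only check
# the ControlNet files here.
EXPECTED = {
    "Personalized_Model": [
        "Z-Image-Fun-Controlnet-Union-2.1.safetensors",
        "Z-Image-Fun-Controlnet-Union-2.1-lite.safetensors",
    ],
}

def missing_weights(root: str = "models") -> list[str]:
    """Return paths of expected weight files not present under `root`."""
    missing = []
    for subdir, files in EXPECTED.items():
        for name in files:
            path = Path(root) / subdir / name
            if not path.is_file():
                missing.append(str(path))
    return missing
```

Running `missing_weights()` from the VideoX-Fun root returns an empty list once both ControlNet files are in place.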

Then run `examples/z_image_fun/predict_t2i_control_2.1.py` (text-to-image with control) or `examples/z_image_fun/predict_i2i_inpaint_2.1.py` (image-to-image inpainting).