VideoX Fun
bubbliiiing committed
Commit b2a5317 · 1 Parent(s): 5f2e9bf

Update Flux.2 Control 2602
FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:516532a885d12ae84bb3c6b24ef4816ac05ffa1c9c7b93476f74652eb0a7a794
+ size 8232506680
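The weights file above is stored as a Git LFS pointer (a `version`/`oid`/`size` text stub). As a minimal sketch using only the Python standard library (the helper names are illustrative, not part of this repo or of Git LFS tooling), a downloaded file can be checked against such a pointer:

```python
import hashlib
import os

def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

def matches_pointer(path: str, pointer: dict) -> bool:
    """Check a local file's size and SHA-256 against an LFS pointer."""
    expected_oid = pointer["oid"].removeprefix("sha256:")
    if os.path.getsize(path) != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so multi-GB weight files fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_oid

# The pointer text for FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors:
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:516532a885d12ae84bb3c6b24ef4816ac05ffa1c9c7b93476f74652eb0a7a794
size 8232506680"""
fields = parse_lfs_pointer(pointer_text)
print(fields["size"])  # → 8232506680
```

If `matches_pointer` returns `False` after a download, the file is usually a truncated transfer or the un-smudged pointer stub itself.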
README.md CHANGED
@@ -4,37 +4,30 @@ license: other
  license_name: flux-dev-non-commercial-license
  license_link: https://huggingface.co/black-forest-labs/FLUX.2-dev/blob/main/LICENSE.txt
  ---

  # Flux.2-dev-Fun-Controlnet-Union

  [![Github](https://img.shields.io/badge/🎬%20Code-Github-blue)](https://github.com/aigc-apps/VideoX-Fun)

  # Model features
  - This ControlNet is added on 4 double blocks.
- - The model was trained from scratch for 10,000 steps on a dataset of 1 million high-quality images covering both general and human-centric content. Training was performed at 1328 resolution using BFloat16 precision, with a batch size of 64, a learning rate of 2e-5, and a text dropout ratio of 0.10.
  - It supports multiple control conditions (Canny, HED, depth maps, pose estimation, and MLSD) and can be used like a standard ControlNet.
  - Inpainting mode is also supported.
  - You can adjust `controlnet_conditioning_scale` for stronger control and better detail preservation. For better stability, we highly recommend using a detailed prompt. The optimal range for `controlnet_conditioning_scale` is 0.65 to 0.80.
  - Although Flux.2-dev supports certain image-editing capabilities, its generation speed slows down when handling multiple images, and it sometimes produces similarity issues or fails to follow the control images. Compared with edit-based methods, ControlNet adheres more reliably to control instructions and makes it easier to apply multiple types of control.

- # TODO
- - [ ] Train more data and steps.
-
  # Results

  <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
- <td>Pose</td>
- <td>Output</td>
- </tr>
- <tr>
- <td><img src="asset/ref.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /></td>
- <td><img src="results/inpaint.png" width="100%" /></td>
- </tr>
- </table>
-
- <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
- <tr>
- <td>Pose</td>
  <td>Output</td>
  </tr>
  <tr>
@@ -78,7 +71,18 @@ license_link: https://huggingface.co/black-forest-labs/FLUX.2-dev/blob/main/LICE

  <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
- <td>Canny</td>
  <td>Output</td>
  </tr>
  <tr>
@@ -87,6 +91,28 @@ license_link: https://huggingface.co/black-forest-labs/FLUX.2-dev/blob/main/LICE
  </tr>
  </table>

  # Inference
  Go to the VideoX-Fun repository for more details.

@@ -110,7 +136,8 @@ Then download weights to models/Diffusion_Transformer and models/Personalized_Mo
  ├── 📂 Diffusion_Transformer/
  │   └── 📂 FLUX.2-dev/
  ├── 📂 Personalized_Model/
- └── "models/Personalized_Model/FLUX.2-dev-Fun-Controlnet-Union.safetensors"
  ```

  Then run the file `examples/flux2_fun/predict_t2i_control.py`.
 
  license_name: flux-dev-non-commercial-license
  license_link: https://huggingface.co/black-forest-labs/FLUX.2-dev/blob/main/LICENSE.txt
  ---
+
  # Flux.2-dev-Fun-Controlnet-Union

  [![Github](https://img.shields.io/badge/🎬%20Code-Github-blue)](https://github.com/aigc-apps/VideoX-Fun)

+ ## Model Card
+
+ | Name | Description |
+ |--|--|
+ | FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors | Compared to the previous version, this release adds Scribble and Gray controls. Similar to Z-Image-Turbo, the Flux.2 model loses its CFG-distillation capability after Control training, which is why the previous version performed poorly. Building on the previous version, we trained with a better dataset and performed CFG distillation after training, yielding better results. |
+ | FLUX.2-dev-Fun-Controlnet-Union.safetensors | ControlNet weights for Flux.2. The model supports multiple control conditions such as Canny, Depth, Pose, MLSD, Scribble, HED, and Gray. This ControlNet is added on 15 layer blocks and 2 refiner layer blocks. |
+
  # Model features
  - This ControlNet is added on 4 double blocks.
  - It supports multiple control conditions (Canny, HED, depth maps, pose estimation, and MLSD) and can be used like a standard ControlNet.
  - Inpainting mode is also supported.
  - You can adjust `controlnet_conditioning_scale` for stronger control and better detail preservation. For better stability, we highly recommend using a detailed prompt. The optimal range for `controlnet_conditioning_scale` is 0.65 to 0.80.
  - Although Flux.2-dev supports certain image-editing capabilities, its generation speed slows down when handling multiple images, and it sometimes produces similarity issues or fails to follow the control images. Compared with edit-based methods, ControlNet adheres more reliably to control instructions and makes it easier to apply multiple types of control.

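The recommended 0.65–0.80 range for `controlnet_conditioning_scale` noted above can be enforced in calling code with a small helper. This is an illustrative sketch only (the helper name is hypothetical, not part of VideoX-Fun); it just clamps a requested scale into the documented range:

```python
# Recommended range from this model card's feature list.
RECOMMENDED_RANGE = (0.65, 0.80)

def clamp_conditioning_scale(scale: float,
                             low: float = RECOMMENDED_RANGE[0],
                             high: float = RECOMMENDED_RANGE[1]) -> float:
    """Clamp a requested controlnet_conditioning_scale into the recommended range."""
    return max(low, min(high, scale))

print(clamp_conditioning_scale(1.0))  # → 0.8
print(clamp_conditioning_scale(0.5))  # → 0.65
print(clamp_conditioning_scale(0.7))  # → 0.7
```

Values above the range tend to over-constrain generation, while values below it weaken adherence to the control image, so clamping user input is a cheap safeguard.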
  # Results

  <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
+ <td>Pose + Ref</td>
  <td>Output</td>
  </tr>
  <tr>
 

  <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
+ <td>HED</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/hed.jpg" width="100%" /></td>
+ <td><img src="results/hed.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Depth</td>
  <td>Output</td>
  </tr>
  <tr>
 
  </tr>
  </table>

+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Gray</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/gray.jpg" width="100%" /></td>
+ <td><img src="results/gray.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Pose + Inpaint</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/ref.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /><img src="asset/pose.jpg" width="100%" /></td>
+ <td><img src="results/pose_inpaint.png" width="100%" /></td>
+ </tr>
+ </table>
+
  # Inference
  Go to the VideoX-Fun repository for more details.

 
  ├── 📂 Diffusion_Transformer/
  │   └── 📂 FLUX.2-dev/
  ├── 📂 Personalized_Model/
+ │   ├── 📦 FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors
+ │   └── 📦 FLUX.2-dev-Fun-Controlnet-Union.safetensors
  ```

  Then run the file `examples/flux2_fun/predict_t2i_control.py`.
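Before launching the example script, it can help to verify that the weights landed where the tree above expects them. A minimal standard-library sketch (the function name is hypothetical, and the relative paths assume you run from the root of your VideoX-Fun checkout):

```python
from pathlib import Path

# Paths taken from the README's directory tree above.
EXPECTED = [
    "models/Diffusion_Transformer/FLUX.2-dev",
    "models/Personalized_Model/FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors",
]

def missing_paths(root: str, expected=EXPECTED) -> list:
    """Return the expected model paths that are absent under `root`."""
    base = Path(root)
    return [p for p in expected if not (base / p).exists()]

missing = missing_paths(".")
if missing:
    print("Missing weights:", missing)
```

An empty result means the layout matches and `examples/flux2_fun/predict_t2i_control.py` should find the weights.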
asset/gray.jpg ADDED

Git LFS Details

  • SHA256: 6bd84884bc99e86aa46618bf182d1dbcb5c6ec41fbd78bd6cbad725e44d5b179
  • Pointer size: 132 Bytes
  • Size of remote file: 1.06 MB
asset/hed.jpg ADDED

Git LFS Details

  • SHA256: 368b3a7f73e3ed1f7d8134de1fb8cd52ffb3a9a026c7df3d0d0e6068b20309b8
  • Pointer size: 131 Bytes
  • Size of remote file: 796 kB
results/canny.png CHANGED

Git LFS Details

  • SHA256: 447966712050333149e2181f3de3d47313b78561ffa2e76f18c13be656b2ae33
  • Pointer size: 132 Bytes
  • Size of remote file: 1.6 MB

Git LFS Details

  • SHA256: 74b0cce4d8faa241a44f7ce53740e79f30dbe1fc937243f563dd53e8c84fc829
  • Pointer size: 132 Bytes
  • Size of remote file: 2.42 MB
results/depth.png CHANGED

Git LFS Details

  • SHA256: 14e44f9bfec0e752d7ea84f443f514ecc97f92406e9af7637ab57ac1694a999a
  • Pointer size: 132 Bytes
  • Size of remote file: 1.12 MB

Git LFS Details

  • SHA256: d476ebf7ddc338a7ea3277be3525d7c3b676b49c24067ac406117f6da1e25cc2
  • Pointer size: 132 Bytes
  • Size of remote file: 2.03 MB
results/gray.png ADDED

Git LFS Details

  • SHA256: 77cefec88070ae21e7af1a7796b9279d9c73af6d1a01df42d5ae7ec10f92db70
  • Pointer size: 132 Bytes
  • Size of remote file: 2.85 MB
results/hed.png ADDED

Git LFS Details

  • SHA256: 6680aa4994bc631a10acc3ed541f3601850a8caf8d9466ed77821bba90e6fcd3
  • Pointer size: 132 Bytes
  • Size of remote file: 3.58 MB
results/pose.png CHANGED

Git LFS Details

  • SHA256: 37f16b9cfaaec484ed403495a1c86a76ee3d850138ec43b4a0ff4e0e37445a3a
  • Pointer size: 132 Bytes
  • Size of remote file: 1.23 MB

Git LFS Details

  • SHA256: 19bd862f08f2971a247635b8586fa5d9c9c42de5451c6a4a5bd11df838f16645
  • Pointer size: 132 Bytes
  • Size of remote file: 2.02 MB
results/pose2.png CHANGED

Git LFS Details

  • SHA256: bd46aba5d908886eb656b158ea5d8f301d49c6630ecab4ba1f066a9d5016bdb0
  • Pointer size: 132 Bytes
  • Size of remote file: 1.5 MB

Git LFS Details

  • SHA256: 74185345de9132466efe5e83f66ff6c4ea5cb6cec4a9d150468f553223a85116
  • Pointer size: 132 Bytes
  • Size of remote file: 1.91 MB
results/pose_inpaint.png ADDED

Git LFS Details

  • SHA256: bd6cdc7a4b5166282a618a43249d56c4ac0e52f1b78c23360ffd65e4d75f249f
  • Pointer size: 132 Bytes
  • Size of remote file: 1.74 MB
results/pose_ref.png CHANGED

Git LFS Details

  • SHA256: 724dc38ba909c6642ca5b6caf5204459f01b2c0b185025696378fdf6f9eab613
  • Pointer size: 132 Bytes
  • Size of remote file: 1.82 MB

Git LFS Details

  • SHA256: b03bb6eaf9763820b5d05767d81d11a2467a19ac67eba16972d9de9c0481e916
  • Pointer size: 132 Bytes
  • Size of remote file: 2.1 MB