VideoX Fun
bubbliiiing committed
Commit b2a5317 · 1 Parent(s): 5f2e9bf

Update Flux.2 Control 2602
FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:516532a885d12ae84bb3c6b24ef4816ac05ffa1c9c7b93476f74652eb0a7a794
+ size 8232506680
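The weights file above is stored as a Git LFS pointer (a `version`/`oid`/`size` text stub). As a minimal sketch using only the Python standard library (the helper names are illustrative, not part of this repo or of Git LFS tooling), a downloaded file can be checked against such a pointer:

```python
import hashlib
import os

def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

def matches_pointer(path: str, pointer: dict) -> bool:
    """Check a local file's size and SHA-256 against an LFS pointer."""
    expected_oid = pointer["oid"].removeprefix("sha256:")
    if os.path.getsize(path) != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so multi-GB weight files fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_oid

# The pointer text for FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors:
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:516532a885d12ae84bb3c6b24ef4816ac05ffa1c9c7b93476f74652eb0a7a794
size 8232506680"""
fields = parse_lfs_pointer(pointer_text)
print(fields["size"])  # → 8232506680
```

If `matches_pointer` returns `False` after a download, the file is usually a truncated transfer or the un-smudged pointer stub itself.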
README.md CHANGED
@@ -4,37 +4,30 @@ license: other
  license_name: flux-dev-non-commercial-license
  license_link: https://huggingface.co/black-forest-labs/FLUX.2-dev/blob/main/LICENSE.txt
  ---

  # Flux.2-dev-Fun-Controlnet-Union

  [![Github](https://img.shields.io/badge/🎬%20Code-Github-blue)](https://github.com/aigc-apps/VideoX-Fun)

  # Model features
  - This ControlNet is added on 4 double blocks.
- - The model was trained from scratch for 10,000 steps on a dataset of 1 million high-quality images covering both general and human-centric content. Training was performed at 1328 resolution using BFloat16 precision, with a batch size of 64, a learning rate of 2e-5, and a text dropout ratio of 0.10.
  - It supports multiple control conditions (Canny, HED, depth maps, pose estimation, and MLSD) and can be used like a standard ControlNet.
  - Inpainting mode is also supported.
  - You can adjust `controlnet_conditioning_scale` for stronger control and better detail preservation. For better stability, we highly recommend using a detailed prompt. The optimal range for `controlnet_conditioning_scale` is 0.65 to 0.80.
  - Although Flux.2-dev supports certain image-editing capabilities, its generation speed slows down when handling multiple images, and it sometimes produces similarity issues or fails to follow the control images. Compared with edit-based methods, ControlNet adheres more reliably to control instructions and makes it easier to apply multiple types of control.

- # TODO
- - [ ] Train more data and steps.
-
  # Results

  <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
- <td>Pose</td>
- <td>Output</td>
- </tr>
- <tr>
- <td><img src="asset/ref.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /></td>
- <td><img src="results/inpaint.png" width="100%" /></td>
- </tr>
- </table>
-
- <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
- <tr>
- <td>Pose</td>
  <td>Output</td>
  </tr>
  <tr>
@@ -78,7 +71,18 @@ license_link: https://huggingface.co/black-forest-labs/FLUX.2-dev/blob/main/LICE

  <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
- <td>Canny</td>
  <td>Output</td>
  </tr>
  <tr>
@@ -87,6 +91,28 @@ license_link: https://huggingface.co/black-forest-labs/FLUX.2-dev/blob/main/LICE
  </tr>
  </table>

  # Inference
  Go to the VideoX-Fun repository for more details.

@@ -110,7 +136,8 @@ Then download weights to models/Diffusion_Transformer and models/Personalized_Mo
  ├── 📂 Diffusion_Transformer/
  │   └── 📂 FLUX.2-dev/
  ├── 📂 Personalized_Model/
- └── "models/Personalized_Model/FLUX.2-dev-Fun-Controlnet-Union.safetensors"
  ```

  Then run the file `examples/flux2_fun/predict_t2i_control.py`.
 
  license_name: flux-dev-non-commercial-license
  license_link: https://huggingface.co/black-forest-labs/FLUX.2-dev/blob/main/LICENSE.txt
  ---
+
  # Flux.2-dev-Fun-Controlnet-Union

  [![Github](https://img.shields.io/badge/🎬%20Code-Github-blue)](https://github.com/aigc-apps/VideoX-Fun)

+ ## Model Card
+
+ | Name | Description |
+ |--|--|
+ | FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors | Compared to the previous version, this release adds Scribble and Gray controls. Similar to Z-Image-Turbo, the Flux.2 model loses its CFG-distillation capability after Control training, which is why the previous version performed poorly. Building on the previous version, we trained with a better dataset and performed CFG distillation after training, yielding better results. |
+ | FLUX.2-dev-Fun-Controlnet-Union.safetensors | ControlNet weights for Flux.2. The model supports multiple control conditions such as Canny, Depth, Pose, MLSD, Scribble, HED, and Gray. This ControlNet is added on 15 layer blocks and 2 refiner layer blocks. |
+
  # Model features
  - This ControlNet is added on 4 double blocks.
  - It supports multiple control conditions (Canny, HED, depth maps, pose estimation, and MLSD) and can be used like a standard ControlNet.
  - Inpainting mode is also supported.
  - You can adjust `controlnet_conditioning_scale` for stronger control and better detail preservation. For better stability, we highly recommend using a detailed prompt. The optimal range for `controlnet_conditioning_scale` is 0.65 to 0.80.
  - Although Flux.2-dev supports certain image-editing capabilities, its generation speed slows down when handling multiple images, and it sometimes produces similarity issues or fails to follow the control images. Compared with edit-based methods, ControlNet adheres more reliably to control instructions and makes it easier to apply multiple types of control.

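The recommended 0.65–0.80 range for `controlnet_conditioning_scale` noted above can be enforced in calling code with a small helper. This is an illustrative sketch only (the helper name is hypothetical, not part of VideoX-Fun); it just clamps a requested scale into the documented range:

```python
# Recommended range from this model card's feature list.
RECOMMENDED_RANGE = (0.65, 0.80)

def clamp_conditioning_scale(scale: float,
                             low: float = RECOMMENDED_RANGE[0],
                             high: float = RECOMMENDED_RANGE[1]) -> float:
    """Clamp a requested controlnet_conditioning_scale into the recommended range."""
    return max(low, min(high, scale))

print(clamp_conditioning_scale(1.0))  # → 0.8
print(clamp_conditioning_scale(0.5))  # → 0.65
print(clamp_conditioning_scale(0.7))  # → 0.7
```

Values above the range tend to over-constrain generation, while values below it weaken adherence to the control image, so clamping user input is a cheap safeguard.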
  # Results

  <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
+ <td>Pose + Ref</td>
  <td>Output</td>
  </tr>
  <tr>
 

  <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
+ <td>HED</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/hed.jpg" width="100%" /></td>
+ <td><img src="results/hed.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Depth</td>
  <td>Output</td>
  </tr>
  <tr>
 
  </tr>
  </table>

+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Gray</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/gray.jpg" width="100%" /></td>
+ <td><img src="results/gray.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Pose + Inpaint</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/ref.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /><img src="asset/pose.jpg" width="100%" /></td>
+ <td><img src="results/pose_inpaint.png" width="100%" /></td>
+ </tr>
+ </table>
+
  # Inference
  Go to the VideoX-Fun repository for more details.

 
  ├── 📂 Diffusion_Transformer/
  │   └── 📂 FLUX.2-dev/
  ├── 📂 Personalized_Model/
+ │   ├── 📦 FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors
+ │   └── 📦 FLUX.2-dev-Fun-Controlnet-Union.safetensors
  ```

  Then run the file `examples/flux2_fun/predict_t2i_control.py`.
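Before launching the example script, it can help to verify that the weights landed where the tree above expects them. A minimal standard-library sketch (the function name is hypothetical, and the relative paths assume you run from the root of your VideoX-Fun checkout):

```python
from pathlib import Path

# Paths taken from the README's directory tree above.
EXPECTED = [
    "models/Diffusion_Transformer/FLUX.2-dev",
    "models/Personalized_Model/FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors",
]

def missing_paths(root: str, expected=EXPECTED) -> list:
    """Return the expected model paths that are absent under `root`."""
    base = Path(root)
    return [p for p in expected if not (base / p).exists()]

missing = missing_paths(".")
if missing:
    print("Missing weights:", missing)
```

An empty result means the layout matches and `examples/flux2_fun/predict_t2i_control.py` should find the weights.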
asset/gray.jpg ADDED

Git LFS Details

  • SHA256: 6bd84884bc99e86aa46618bf182d1dbcb5c6ec41fbd78bd6cbad725e44d5b179
  • Pointer size: 132 Bytes
  • Size of remote file: 1.06 MB
asset/hed.jpg ADDED

Git LFS Details

  • SHA256: 368b3a7f73e3ed1f7d8134de1fb8cd52ffb3a9a026c7df3d0d0e6068b20309b8
  • Pointer size: 131 Bytes
  • Size of remote file: 796 kB
results/canny.png CHANGED

Git LFS Details

  • SHA256: 447966712050333149e2181f3de3d47313b78561ffa2e76f18c13be656b2ae33
  • Pointer size: 132 Bytes
  • Size of remote file: 1.6 MB

Git LFS Details

  • SHA256: 74b0cce4d8faa241a44f7ce53740e79f30dbe1fc937243f563dd53e8c84fc829
  • Pointer size: 132 Bytes
  • Size of remote file: 2.42 MB
results/depth.png CHANGED

Git LFS Details

  • SHA256: 14e44f9bfec0e752d7ea84f443f514ecc97f92406e9af7637ab57ac1694a999a
  • Pointer size: 132 Bytes
  • Size of remote file: 1.12 MB

Git LFS Details

  • SHA256: d476ebf7ddc338a7ea3277be3525d7c3b676b49c24067ac406117f6da1e25cc2
  • Pointer size: 132 Bytes
  • Size of remote file: 2.03 MB
results/gray.png ADDED

Git LFS Details

  • SHA256: 77cefec88070ae21e7af1a7796b9279d9c73af6d1a01df42d5ae7ec10f92db70
  • Pointer size: 132 Bytes
  • Size of remote file: 2.85 MB
results/hed.png ADDED

Git LFS Details

  • SHA256: 6680aa4994bc631a10acc3ed541f3601850a8caf8d9466ed77821bba90e6fcd3
  • Pointer size: 132 Bytes
  • Size of remote file: 3.58 MB
results/pose.png CHANGED

Git LFS Details

  • SHA256: 37f16b9cfaaec484ed403495a1c86a76ee3d850138ec43b4a0ff4e0e37445a3a
  • Pointer size: 132 Bytes
  • Size of remote file: 1.23 MB

Git LFS Details

  • SHA256: 19bd862f08f2971a247635b8586fa5d9c9c42de5451c6a4a5bd11df838f16645
  • Pointer size: 132 Bytes
  • Size of remote file: 2.02 MB
results/pose2.png CHANGED

Git LFS Details

  • SHA256: bd46aba5d908886eb656b158ea5d8f301d49c6630ecab4ba1f066a9d5016bdb0
  • Pointer size: 132 Bytes
  • Size of remote file: 1.5 MB

Git LFS Details

  • SHA256: 74185345de9132466efe5e83f66ff6c4ea5cb6cec4a9d150468f553223a85116
  • Pointer size: 132 Bytes
  • Size of remote file: 1.91 MB
results/pose_inpaint.png ADDED

Git LFS Details

  • SHA256: bd6cdc7a4b5166282a618a43249d56c4ac0e52f1b78c23360ffd65e4d75f249f
  • Pointer size: 132 Bytes
  • Size of remote file: 1.74 MB
results/pose_ref.png CHANGED

Git LFS Details

  • SHA256: 724dc38ba909c6642ca5b6caf5204459f01b2c0b185025696378fdf6f9eab613
  • Pointer size: 132 Bytes
  • Size of remote file: 1.82 MB

Git LFS Details

  • SHA256: b03bb6eaf9763820b5d05767d81d11a2467a19ac67eba16972d9de9c0481e916
  • Pointer size: 132 Bytes
  • Size of remote file: 2.1 MB