---
library_name: videox_fun
license: other
license_name: flux-dev-non-commercial-license
license_link: https://huggingface.co/black-forest-labs/FLUX.2-dev/blob/main/LICENSE.txt
---

# FLUX.2-dev-Fun-Controlnet-Union

[![Github](https://img.shields.io/badge/🎬%20Code-Github-blue)](https://github.com/aigc-apps/VideoX-Fun)

## Model Card

| Name | Description |
|--|--|
| FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors | Adds Scribble and Gray controls on top of the previous version. As with Z-Image-Turbo, FLUX.2-dev loses its CFG distillation capability after control training, which is why the previous version performed poorly. Building on the previous version, we trained on a better dataset and performed CFG distillation after training, which gives better results. |
| FLUX.2-dev-Fun-Controlnet-Union.safetensors | ControlNet weights for FLUX.2-dev. The model supports multiple control conditions such as Canny, HED, Depth, Pose and MLSD. |

# Model features
- This ControlNet is attached to 4 double blocks.
- It supports multiple control conditions, including Canny, HED, Depth, Pose, MLSD, Scribble and Gray, and can be used like a standard ControlNet.
- Inpainting mode is also supported.
- You can adjust `controlnet_conditioning_scale` for stronger control and better detail preservation. The recommended range is 0.65 to 0.80. For better stability, we highly recommend using a detailed prompt.
- Although FLUX.2-dev supports certain image-editing capabilities, its generation slows down when handling multiple images, and it sometimes produces similarity issues or fails to follow the control images. Compared with edit-based methods, ControlNet follows the control inputs more reliably and makes it easier to apply multiple types of control.
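To compare control strengths within the recommended range, a small sweep over `controlnet_conditioning_scale` can be useful. The sketch below only generates candidate values and checks them against the 0.65 to 0.80 range stated above. The helper names `in_recommended_range` and `scale_sweep` are hypothetical and not part of VideoX-Fun, and each value would still have to be set in the prediction script by hand.

```python
# Hypothetical helpers for illustration; they are not part of VideoX-Fun.
RECOMMENDED_MIN = 0.65  # lower bound recommended in this model card
RECOMMENDED_MAX = 0.80  # upper bound recommended in this model card


def in_recommended_range(scale: float) -> bool:
    """Return True if scale lies inside the recommended range (inclusive)."""
    return RECOMMENDED_MIN <= scale <= RECOMMENDED_MAX


def scale_sweep(steps: int = 4) -> list[float]:
    """Return `steps` evenly spaced scales from the minimum to the maximum."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    width = RECOMMENDED_MAX - RECOMMENDED_MIN
    # Round to 4 decimals to hide floating-point noise in the printed values.
    return [round(RECOMMENDED_MIN + width * i / (steps - 1), 4) for i in range(steps)]


if __name__ == "__main__":
    for scale in scale_sweep():
        print(f"controlnet_conditioning_scale = {scale}")
```

Generating one image per value with a fixed seed and prompt makes it easier to see where control strength starts to hurt image quality.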

# Results

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Pose + Ref</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/pose.jpg" width="100%" /><img src="asset/ref.jpg" width="100%" /></td>
    <td><img src="results/pose_ref.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Pose</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/pose.jpg" width="100%" /></td>
    <td><img src="results/pose.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Pose</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/pose2.jpg" width="100%" /></td>
    <td><img src="results/pose2.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Canny</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/canny.jpg" width="100%" /></td>
    <td><img src="results/canny.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>HED</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/hed.jpg" width="100%" /></td>
    <td><img src="results/hed.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Depth</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/depth.jpg" width="100%" /></td>
    <td><img src="results/depth.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Gray</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/gray.jpg" width="100%" /></td>
    <td><img src="results/gray.png" width="100%" /></td>
  </tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
  <tr>
    <td>Pose + Inpaint</td>
    <td>Output</td>
  </tr>
  <tr>
    <td><img src="asset/ref.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /><img src="asset/pose.jpg" width="100%" /></td>
    <td><img src="results/pose_inpaint.png" width="100%" /></td>
  </tr>
</table>

# Inference
See the [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun) repository for more details.

First, clone VideoX-Fun and create the model directories.
```sh
# clone the code
git clone https://github.com/aigc-apps/VideoX-Fun.git

# enter the VideoX-Fun directory
cd VideoX-Fun

# create the directories for the weights (-p also creates models/ if it is missing)
mkdir -p models/Diffusion_Transformer
mkdir -p models/Personalized_Model
```

Then download the weights into `models/Diffusion_Transformer` and `models/Personalized_Model`, so that the layout looks like this:

```
📦 models/
├── 📂 Diffusion_Transformer/
│   └── 📂 FLUX.2-dev/
├── 📂 Personalized_Model/
│   ├── 📦 FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors
│   └── 📦 FLUX.2-dev-Fun-Controlnet-Union.safetensors
```
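Before running inference, it can help to confirm that the weights landed where the tree above expects them. The following is a minimal sketch using only the Python standard library. The function name `missing_paths` is hypothetical, and the check assumes you downloaded both ControlNet files; trim `EXPECTED_FILES` if you only use one.

```python
from pathlib import Path

# Paths copied from the directory tree in this model card, relative to the
# VideoX-Fun checkout. Assumption: both ControlNet files are wanted.
EXPECTED_DIRS = ["models/Diffusion_Transformer/FLUX.2-dev"]
EXPECTED_FILES = [
    "models/Personalized_Model/FLUX.2-dev-Fun-Controlnet-Union-2602.safetensors",
    "models/Personalized_Model/FLUX.2-dev-Fun-Controlnet-Union.safetensors",
]


def missing_paths(repo_root: str = ".") -> list[str]:
    """Return the expected paths that are absent under repo_root."""
    root = Path(repo_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("All expected weights are in place.")
```

Run it from the root of the VideoX-Fun checkout. It only checks that the paths exist and does not validate the contents of the weight files.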

Then run the file `examples/flux2_fun/predict_t2i_control.py`.