Flux.2 MultiDiffusion Modular Blocks

This repo provides self-contained Hugging Face Modular Diffusers blocks for Flux.2 MultiDiffusion.

The key files and folders at the repo root are:

block.py: custom Flux.2 block implementation and Hugging Face remote-code entrypoint.
repository.py: family selection and Hub export helpers.
cli.py: multidiff-modular command implementation.
model/<family>/: generated metadata JSON for each supported Flux.2 family.
examples/: runnable local and remote generation scripts.

The easiest way to test the repo is remote loading from Hugging Face with trust_remote_code=True. The remote-code path needs no local install of this package, but you still need compatible versions of Diffusers, Torch, and Transformers, plus access to the upstream model.

Install

Python >=3.14,<3.15 is expected.

Development install with uv:

uv sync --all-extras

If the default uv cache is not writable, point UV_CACHE_DIR at a writable directory (the path below is a macOS example):

UV_CACHE_DIR=/private/tmp/uv-cache uv sync --all-extras

Editable install with pip:

python -m pip install -e ".[dev,examples,quantization]"

Minimal local install:

python -m pip install .

Optional extras:

dev: pytest and ruff.
examples: Pillow for examples and image/mask handling.
quantization: TorchAO for --quantization.

Remote Loading

This is the preferred quick smoke test. It loads the root metadata (currently the flux2-klein-4b export) and fetches the custom blocks from the root block.py.

import torch
from diffusers import ComponentsManager, ModularPipeline

repo_id = "arlaz/modular-flux2-multidiffusion"

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" else torch.float32

if device == "cuda":
    torch.backends.fp32_precision = "tf32"
    torch.set_float32_matmul_precision("high")

manager = ComponentsManager()
if device == "cuda":
    manager.enable_auto_cpu_offload(device=device)

pipe = ModularPipeline.from_pretrained(
    repo_id,
    trust_remote_code=True,
    components_manager=manager,
)
pipe.load_components(torch_dtype=dtype)

if device != "cuda":
    pipe.to(device)

out = pipe(
    prompt="a dense renaissance fresco",
    height=1024,
    width=1024,
    height_generation=512,
    width_generation=512,
    window_stride_height=512,
    window_stride_width=512,
    weighting_type="cosine",
    num_inference_steps=1,
    batch_size=1,
    generator=torch.Generator().manual_seed(42),
)

out.images[0].save("remote_smoke_1024.png")
print(out.images[0].size)
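
To get a feel for how many windows the call above has to denoise per step, the arithmetic can be sketched as follows. This is an illustration only: it assumes windows sit on a regular grid starting at offset 0, with one extra window clamped to the canvas edge when the stride does not land on it. window_origins is a hypothetical helper, and the actual layout in block.py may differ.

```python
def window_origins(canvas: int, window: int, stride: int) -> list[int]:
    """Start offsets of windows along one axis (hypothetical helper)."""
    if window >= canvas:
        # A single window already covers the whole axis.
        return [0]
    origins = list(range(0, canvas - window + 1, stride))
    # Add a final window flush with the edge if the stride does not land on it.
    if origins[-1] != canvas - window:
        origins.append(canvas - window)
    return origins


# Values from the smoke test above: 1024 canvas, 512 windows, 512 stride.
rows = window_origins(1024, 512, 512)
cols = window_origins(1024, 512, 512)
print(len(rows) * len(cols))  # 4
```

Under the same assumptions, a 2048 canvas with 1024 windows and a 512 stride gives three offsets per axis (0, 512, 1024), so nine overlapping windows.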

Equivalent runner:

uv run python examples/example_remote.py \
  --repo-id arlaz/modular-flux2-multidiffusion \
  --prompt "a dense renaissance fresco" \
  --height 1024 \
  --width 1024 \
  --height-generation 512 \
  --width-generation 512 \
  --window-stride-height 512 \
  --window-stride-width 512 \
  --num-inference-steps 1 \
  --dtype bfloat16 \
  --device cuda \
  --output remote_smoke_1024.png

Local Python Use

When working from a checkout or editable install, import the root helper modules directly:

import torch
from diffusers import ComponentsManager
from repository import init_default_multidiffusion_pipeline

manager = ComponentsManager()
manager.enable_auto_cpu_offload(device="cuda")

pipe = init_default_multidiffusion_pipeline("flux2-klein-4b", components_manager=manager)
pipe.load_components(torch_dtype=torch.bfloat16)

image = pipe(
    prompt="a dense renaissance fresco",
    height=2048,
    width=2048,
    height_generation=1024,
    width_generation=1024,
    window_stride_height=512,
    window_stride_width=512,
    num_inference_steps=4,
).images[0]

For CLI examples, regional masks, panorama mode, img2img, image conditioning, quantization, and all runner arguments, see examples/README.md.

Flux.2 Families

Family	Upstream model repo	Blocks
`flux2`	`black-forest-labs/FLUX.2-dev`	`Flux2MultiDiffusionAutoBlocks`
`flux2-klein-4b`	`black-forest-labs/FLUX.2-klein-4B`	`Flux2KleinMultiDiffusionAutoBlocks`
`flux2-klein-base-4B`	`black-forest-labs/FLUX.2-klein-base-4B`	`Flux2KleinBaseMultiDiffusionAutoBlocks`
`flux2-klein-9b`	`black-forest-labs/FLUX.2-klein-9B`	`Flux2KleinMultiDiffusionAutoBlocks`
`flux2-klein-base-9B`	`black-forest-labs/FLUX.2-klein-base-9B`	`Flux2KleinBaseMultiDiffusionAutoBlocks`

The root metadata is the default flux2-klein-4b export. The model/<family>/ folders are metadata-only snapshots for the other families; they are not standalone remote-code repos.
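
The table above can be read as a simple lookup from family name to upstream repo and blocks class. The sketch below only restates the table as data; FAMILIES and resolve_family are hypothetical names, not the contents of repository.py, and it assumes family names must match exactly as written (whether the real CLI normalizes case is not stated here).

```python
# Family name -> (upstream model repo, blocks class name), copied from the table.
FAMILIES: dict[str, tuple[str, str]] = {
    "flux2": ("black-forest-labs/FLUX.2-dev", "Flux2MultiDiffusionAutoBlocks"),
    "flux2-klein-4b": ("black-forest-labs/FLUX.2-klein-4B", "Flux2KleinMultiDiffusionAutoBlocks"),
    "flux2-klein-base-4B": ("black-forest-labs/FLUX.2-klein-base-4B", "Flux2KleinBaseMultiDiffusionAutoBlocks"),
    "flux2-klein-9b": ("black-forest-labs/FLUX.2-klein-9B", "Flux2KleinMultiDiffusionAutoBlocks"),
    "flux2-klein-base-9B": ("black-forest-labs/FLUX.2-klein-base-9B", "Flux2KleinBaseMultiDiffusionAutoBlocks"),
}


def resolve_family(name: str) -> tuple[str, str]:
    """Return (upstream repo, blocks class) for a family (hypothetical helper)."""
    try:
        return FAMILIES[name]
    except KeyError:
        known = ", ".join(sorted(FAMILIES))
        raise ValueError(f"unknown family {name!r}; expected one of: {known}") from None


repo, blocks = resolve_family("flux2-klein-4b")
print(repo, blocks)
```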

Export Metadata

Refresh the root default metadata:

uv run multidiff-modular export-hf . --family flux2-klein-4b

Refresh a metadata-only family folder:

uv run multidiff-modular export-hf model/flux2-klein-base-9B --family flux2-klein-base-9B

Create an external single-family Hub repo folder. This copies root block.py beside the generated JSON:

uv run multidiff-modular export-hf hf_export --family flux2-klein-4b

Push an external export directly:

uv run multidiff-modular export-hf hf_export \
  --family flux2-klein-4b \
  --repo-id <user>/<repo> \
  --push-to-hub

Equivalent script wrappers:

uv run python scripts/export_hf_repo.py hf_export --family flux2-klein-4b
uv run python scripts/inspect_panorama.py panorama.png

Hugging Face Setup

No model weights are stored here. Components are loaded from the upstream Flux.2 repos referenced by the generated metadata. Authenticate before using gated or private models:

hf auth login

Prefill the cache for one upstream family:

hf download black-forest-labs/FLUX.2-klein-4B

Use --local-files-only only after all required upstream model files are cached or when pointing to a complete local model directory.

Features

Tiled/windowed denoising over large canvases.
Batched windows through batch_size or --batch-size.
Regional prompting with grayscale masks blended during denoising.
Panorama width/height wrapping plus circular VAE decode padding.
Img2img and Flux.2 image conditioning.
Optional LoRA, TorchAO quantization, VAE tiling/slicing, TF32, and torch.compile.
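
The idea behind windowed denoising with a weighting such as cosine can be sketched in one dimension: each window produces its own prediction, and overlapping predictions are combined as a weighted average so that window centers dominate and seams fade out. The code below is a minimal sketch under stated assumptions, not the implementation in block.py: cosine_weights and blend_1d are hypothetical names, the raised-cosine window is one plausible choice, and windows are assumed to tile the axis exactly.

```python
import math
from typing import Callable


def cosine_weights(n: int) -> list[float]:
    """Raised-cosine weights that peak at the window center (illustrative)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * (i + 0.5) / n) for i in range(n)]


def blend_1d(
    length: int,
    window: int,
    stride: int,
    predict: Callable[[int, int], list[float]],
) -> list[float]:
    """Weighted average of per-window predictions over a 1D canvas."""
    if window > length or (length - window) % stride != 0:
        raise ValueError("windows must tile the canvas exactly in this sketch")
    weights = cosine_weights(window)
    acc = [0.0] * length   # weighted sum of predictions
    norm = [0.0] * length  # sum of weights, for normalization
    for start in range(0, length - window + 1, stride):
        values = predict(start, window)
        for i in range(window):
            acc[start + i] += weights[i] * values[i]
            norm[start + i] += weights[i]
    return [a / n for a, n in zip(acc, norm)]


# Toy "model": every window predicts its own start offset everywhere.
blended = blend_1d(12, 8, 4, lambda start, size: [float(start)] * size)
print([round(v, 2) for v in blended])
```

Positions covered by a single window keep that window's value, while positions in the overlap move smoothly from one window's prediction to the next.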

All window and stride values must be positive and divisible by vae_scale_factor * 2.
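
A quick pre-flight check for this constraint might look like the sketch below. check_window_args is a hypothetical helper, and the default vae_scale_factor of 8 is an assumption for illustration; read the real value from the loaded pipeline rather than relying on it.

```python
def check_window_args(vae_scale_factor: int = 8, **values: int) -> None:
    """Raise ValueError if any value is not a positive multiple of vae_scale_factor * 2.

    The default vae_scale_factor is an assumed placeholder, not a value from the repo.
    """
    multiple = vae_scale_factor * 2
    for name, value in values.items():
        if value <= 0 or value % multiple != 0:
            raise ValueError(
                f"{name}={value} must be positive and divisible by {multiple}"
            )


# Passes silently: every value is a positive multiple of 16 under the assumed factor.
check_window_args(
    height_generation=512,
    width_generation=512,
    window_stride_height=512,
    window_stride_width=512,
)
```

Running the check before calling the pipeline surfaces a bad stride or window size immediately instead of partway through model loading.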

Development

uv sync --all-extras
uv run ruff check .
uv run ruff format --check .
uv run pytest -ra
uv run python -m compileall block.py repository.py cli.py examples scripts tests model

Real model smoke tests are opt-in:

MULTIDIFF_MODULAR_RUN_REAL_INFERENCE=1 uv run pytest -ra
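
Opt-in gating of this kind is usually a small environment check that a test suite consults before running expensive cases. The sketch below shows one way to express it; real_inference_enabled is a hypothetical helper, and it assumes the flag must be exactly "1", which may not match how the repo's tests interpret it.

```python
import os
from collections.abc import Mapping

ENV_FLAG = "MULTIDIFF_MODULAR_RUN_REAL_INFERENCE"


def real_inference_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the opt-in flag is set to "1" (assumed convention)."""
    return environ.get(ENV_FLAG) == "1"


if not real_inference_enabled():
    print(f"real-model smoke tests are skipped; set {ENV_FLAG}=1 to enable them")
```

A check like this can feed a skip condition such as pytest.mark.skipif, so the default test run stays fast and needs no model downloads.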

Dev Container

The repo includes an Ubuntu-based dev container for VS Code Dev Containers and JetBrains PyCharm Dev Containers. It mounts the workspace, exposes NVIDIA GPUs, and shares the host Hugging Face cache.
