Note: It appears that the bug with ComfyUI and DF11 FLUX.2 [klein] models does not occur in diffusers; I am still investigating the issue with the ComfyUI implementation.
For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11
Feel free to request compression of other models as well (for the diffusers library, ComfyUI, or anything else), although models built on architectures I am unfamiliar with may be more difficult.
## How to Use
### diffusers
```python
import torch
from diffusers import Flux2KleinKVPipeline, Flux2Transformer2DModel
from transformers.modeling_utils import no_init_weights
from dfloat11 import DFloat11Model

device = "cuda"
dtype = torch.bfloat16
model_path = "black-forest-labs/FLUX.2-klein-9b-kv"

# Load the DF11-compressed text encoder
text_encoder = DFloat11Model.from_pretrained("DFloat11/Qwen3-8B-DF11", device="cpu")

# Build the transformer from its config without allocating initialized weights,
# then attach the DF11-compressed weights to it
with no_init_weights():
    transformer = Flux2Transformer2DModel.from_config(
        Flux2Transformer2DModel.load_config(
            model_path, subfolder="transformer"
        ),
        torch_dtype=dtype,
    ).to(dtype)

DFloat11Model.from_pretrained(
    "mingyi456/FLUX.2-klein-9b-kv-DF11", device="cpu", bfloat16_model=transformer
)

pipe = Flux2KleinKVPipeline.from_pretrained(
    model_path, text_encoder=text_encoder, transformer=transformer, torch_dtype=dtype
)
pipe.to(device)

# Text-to-image (no reference image)
print("Generating text-to-image...")
image = pipe(
    prompt="A cat holding a sign that says hello world",
    height=1024,
    width=1024,
    num_inference_steps=4,
    generator=torch.Generator(device=device).manual_seed(0),
).images[0]
image.save("t2i_output.png")
print("Saved t2i_output.png")

# Image-to-image with KV cache (using the generated image as reference)
print("Generating image-to-image with KV cache...")
image_kv = pipe(
    prompt="A cat dressed like a wizard",
    image=image,
    height=1024,
    width=1024,
    num_inference_steps=4,
    generator=torch.Generator(device=device).manual_seed(0),
).images[0]
image_kv.save("kv_output.png")
print("Saved kv_output.png")
```
### ComfyUI
Refer to this model instead.
## Compression details
This is the `pattern_dict` used for compression:
```python
pattern_dict = {
    r"double_stream_modulation_img.linear": [],
    r"double_stream_modulation_txt.linear": [],
    r"single_stream_modulation.linear": [],
    r"context_embedder": [],
    r"transformer_blocks\.\d+": (
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.to_out.0",
        "attn.add_q_proj",
        "attn.add_k_proj",
        "attn.add_v_proj",
        "attn.to_add_out",
        "ff.linear_in",
        "ff.linear_out",
        "ff_context.linear_in",
        "ff_context.linear_out",
    ),
    r"single_transformer_blocks\.\d+": (
        "attn.to_qkv_mlp_proj",
        "attn.to_out",
    ),
    r"norm_out.linear": [],
}
```
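The keys of `pattern_dict` are regular expressions matched against module names in the transformer, and the tuple values list the submodules whose weights get compressed under each matching block. A minimal sketch of how such keys select modules — the `matching_pattern` helper and the example module names are hypothetical, and it assumes patterns are anchored at the start of the name (as with `re.match`), which may differ from the exact matching logic in the DFloat11 compression scripts:

```python
import re

# A subset of the pattern_dict keys above (regexes over module names)
patterns = [
    r"transformer_blocks\.\d+",
    r"single_transformer_blocks\.\d+",
    r"context_embedder",
    r"norm_out.linear",
]

def matching_pattern(name):
    """Return the first pattern matching the start of the module name, else None."""
    for pattern in patterns:
        if re.match(pattern, name):
            return pattern
    return None

# Hypothetical module names in the style of the transformer's state dict
for name in [
    "transformer_blocks.3.attn.to_q",
    "single_transformer_blocks.17.attn.to_qkv_mlp_proj",
    "context_embedder",
    "some_other_module.linear",  # no pattern matches: not selected for compression
]:
    print(f"{name} -> {matching_pattern(name)}")
```

Note that anchoring matters here: an unanchored search for `transformer_blocks\.\d+` would also hit names beginning with `single_transformer_blocks.`, since the former is a substring of the latter.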
Model tree for mingyi456/FLUX.2-klein-9b-kv-DF11:
- Base model: black-forest-labs/FLUX.2-klein-9b-kv