---
library_name: diffusers
pipeline_tag: text-to-image
base_model:
- black-forest-labs/FLUX.2-klein-9B
base_model_relation: quantized
tags:
- text-to-image
- image-editing
- diffusion
- quantized
- quantfunc
- flux
language:
- en
license: other
license_name: flux-non-commercial-license
license_link: LICENSE
---

## ⚠️ License: Non-Commercial Use Only

These are **quantized derivative weights** of [`black-forest-labs/FLUX.2-klein-9B`](https://huggingface.co/black-forest-labs/FLUX.2-klein-9B) (**FLUX.2 [klein] 9B**), which is licensed under the **FLUX Non-Commercial License v2.1** by Black Forest Labs.

> This FLUX Model is licensed by Black Forest Labs Inc. under the FLUX Non-Commercial License.

- **Non-commercial use only.** These weights may **not** be used for any commercial or revenue-generating purpose. Commercial use requires a separate license from Black Forest Labs; see https://bfl.ai/licensing .
- **Full license:** included as [`LICENSE`](./LICENSE) (FLUX Non-Commercial License v2.1).
- **Modifications:** quantized from FLUX.2 [klein] 9B by the QuantFunc inference engine.
- This is **not** an official Black Forest Labs product and is not endorsed by BFL.

> **Disclaimer:** Derived from FLUX.2 [klein] by Black Forest Labs. This is **not an official Black Forest Labs product** and is not endorsed by or affiliated with BFL. "FLUX" is a trademark of Black Forest Labs.

# QuantFunc

🤗 Hugging Face | 🤖 ModelScope | 💻 GitHub | 💬 WeChat (微信) | 🎮 Discord

> ⚡ **FLUX.2 Klein 9B: the highest-quality Klein tier, pre-quantized.** Text-to-image and reference-based editing, **2x-11x** faster with the QuantFunc plugin. The larger **9B** Klein model for maximum fidelity, shipped as **distilled (4-step)** + **base (28-step)** transformers across **three GPU tiers** (`50x` FP4 · `40x` INT4+FP8 · `30x-below` INT4+INT8).

**Powered by the [QuantFunc ComfyUI plugin](https://github.com/RealJonathanYip/ComfyUI-QuantFunc), the fastest diffusion inference engine:**

- 🚀 **2x-11x speedup** over standard BF16/FP16 Python pipelines (the weights are pre-exported, so loading is even faster).
- ⚙️ **Native C++/CUDA** (`libquantfunc.so` / `quantfunc.dll`) with **zero Python model dependencies**.
- 🧩 **Dual engine** (SVDQ offline + Lighting runtime 4-bit), **zero-cost LoRA stacking**, reference-image editing & inpainting.
- 🟢 **Full GPU coverage**: RTX 20/30/40/50 · A100/H100/H200/B100/B200/GB300 · RTX 6000 Ada / PRO Blackwell (CUDA 12 & 13); native **FP4** on Blackwell.

👉 **Install the plugin:** **https://github.com/RealJonathanYip/ComfyUI-QuantFunc**

# Klein-9B-Series

Pre-quantized **FLUX.2 Klein 9B** model series by [QuantFunc](https://github.com/RealJonathanYip), Lighting backend. Text-to-image and reference-based image editing.

> ✨ **Both the distilled AND the non-distilled (base) model are supported**, and the series ships **three GPU tiers** so every card gets the best path it can run:
> **`50x`** (Blackwell, FP4) · **`40x`** (RTX 40 / Ada & Hopper, INT4 + FP8) · **`30x-below`** (RTX 30 and below, INT4 + INT8).

## Overview

FLUX.2 Klein is a model line in Black Forest Labs' FLUX.2 family. This series covers the **9B** variant, the larger, higher-quality Klein model (transformer K=4096). QuantFunc ships two pre-quantized transformers:

- **Distilled** transformer: 4-step, fastest few-step generation/editing.
- **Base / non-distilled** transformer: the full 28-step model with classical CFG (`--guidance-scale 4.0`), highest quality.

Each comes in three hardware tiers (below).
Within a tier, the distilled and base transformers **share the same base-model**; only the transformer file differs.

## Hardware tiers (pick by GPU)

FP4 needs Blackwell (SM120); FP8 needs Ada (SM89) or Hopper (SM90), e.g. RTX 40 / L40 / H100 / H200; INT4/INT8 run everywhere (Ampere/Turing, e.g. RTX 30/20, A100). So:

| Tier | GPUs | attention + FFN | modulation/embedders/head | base-model |
|------|------|-----------------|---------------------------|-----------|
| **`50x`** | **Blackwell (SM120+)**: RTX 50 series, B100/B200/GB200, RTX PRO Blackwell | **FP4** | **FP8** | `klein-9b-series-50x-above-base-model` (FP4 text encoder) |
| **`40x`** | **RTX 40 / Ada (SM89) & Hopper (SM90)**: RTX 40 series, L40/L40S, **H100, H200** | **INT4** | **FP8** | `klein-9b-series-50x-below-base-model` (INT4 text encoder) |
| **`30x-below`** | **RTX 30 and below (pre-FP8)**: RTX 30/20, A100, A40, T4, down to RTX 2080 | **INT4** | **INT8** | `klein-9b-series-50x-below-base-model` (INT4 text encoder) |

> `40x` and `30x-below` **share** the same INT4 base-model; they differ only in the transformer's 8-bit precision (FP8 vs INT8). `50x` uses the FP4 base-model.

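The tier choice above can be sketched as a small lookup keyed on CUDA compute capability. This is an illustrative helper, not part of the QuantFunc plugin: `Tier` and `pick_tier` are hypothetical names. The 8.9 cutoff follows the FP8 requirement stated above. The Blackwell cutoff is an assumption: the table lists B100/B200 next to the RTX 50 series, and those data-center parts report compute capability 10.x rather than 12.x, so the sketch treats any major version of 10 or higher as `50x`.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    name: str        # tier label as used in the table
    attn_ffn: str    # precision of attention + FFN weights
    other: str       # precision of modulation/embedders/head
    base_model: str  # matching base-model directory


TIER_50X = Tier("50x", "FP4", "FP8", "klein-9b-series-50x-above-base-model")
TIER_40X = Tier("40x", "INT4", "FP8", "klein-9b-series-50x-below-base-model")
TIER_30X = Tier("30x-below", "INT4", "INT8", "klein-9b-series-50x-below-base-model")


def pick_tier(major: int, minor: int) -> Tier:
    """Map a CUDA compute capability to one of the three tiers."""
    capability = (major, minor)
    if capability >= (10, 0):  # assumption: all Blackwell parts (10.x and 12.x)
        return TIER_50X
    if capability >= (8, 9):   # Ada (8.9) and Hopper (9.0) support FP8
        return TIER_40X
    return TIER_30X            # Ampere, Turing, and older: INT4 + INT8


if __name__ == "__main__":
    for cap in [(12, 0), (9, 0), (8, 9), (8, 6), (7, 5)]:
        tier = pick_tier(*cap)
        print(cap, "->", tier.name, tier.base_model)
```

If PyTorch is installed, `torch.cuda.get_device_capability()` returns a `(major, minor)` tuple that can be passed straight in as `pick_tier(*torch.cuda.get_device_capability())`.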
## Directory Structure

```
Klein-9B-Series/
├── klein-9b-series-50x-above-base-model/   # FP4 text encoder + VAE(enc+dec) + tokenizer + scheduler (50x)
├── klein-9b-series-50x-below-base-model/   # INT4 text encoder + VAE(enc+dec) + tokenizer + scheduler (40x & 30x-below)
├── transformer/
│   ├── config.json
│   ├── klein-9b-50x-lighting.safetensors              # distilled, FP4 (50x)
│   ├── klein-9b-base-50x-lighting.safetensors         # base 28-step, FP4 (50x)
│   ├── klein-9b-40x-lighting.safetensors              # distilled, INT4 + FP8 (40x)
│   ├── klein-9b-base-40x-lighting.safetensors         # base 28-step, INT4 + FP8 (40x)
│   ├── klein-9b-30x-below-lighting.safetensors        # distilled, INT4 + INT8 (30x-below)
│   └── klein-9b-base-30x-below-lighting.safetensors   # base 28-step, INT4 + INT8 (30x-below)
└── precision-config/
    ├── 50x-fp4-f8-sample.json
    ├── 40x-int4-f8-sample.json
    └── 30x-below-int4-i8-sample.json
```

> **Status:** ✅ All weights uploaded; the VAE includes **both encoder and decoder**. Every tier × {distilled, base} is visually validated to generate correctly.

## Distilled (4-step) vs Base (28-step)

| Transformer | Source | Steps | Guidance | Best for |
|---|---|---|---|---|
| `klein-9b-