---
library_name: diffusers
pipeline_tag: text-to-image
base_model:
- black-forest-labs/FLUX.2-klein-9B
base_model_relation: quantized
tags:
- text-to-image
- image-editing
- diffusion
- quantized
- quantfunc
- flux
language:
- en
license: other
license_name: flux-non-commercial-license
license_link: LICENSE
---

<!-- QF-LICENSE-BLOCK:START -->
## ⚠️ License: Non-Commercial Use Only

These are **quantized derivative weights** of [`black-forest-labs/FLUX.2-klein-9B`](https://huggingface.co/black-forest-labs/FLUX.2-klein-9B) (**FLUX.2 [klein] 9B**), which is licensed under the **FLUX Non-Commercial License v2.1** by Black Forest Labs.

> This FLUX Model is licensed by Black Forest Labs Inc. under the FLUX Non-Commercial License.

- **Non-commercial use only.** These weights may **not** be used for any commercial or revenue-generating purpose. Commercial use requires a separate license from Black Forest Labs; see https://bfl.ai/licensing.
- **Full license:** included as [`LICENSE`](./LICENSE) (FLUX Non-Commercial License v2.1).
- **Modifications:** quantized from FLUX.2 [klein] 9B by the QuantFunc inference engine.
- This is **not** an official Black Forest Labs product and is not endorsed by BFL.

> **Disclaimer:** Derived from FLUX.2 [klein] by Black Forest Labs. This is **not an official Black Forest Labs product** and is not endorsed by or affiliated with BFL. "FLUX" is a trademark of Black Forest Labs.

<!-- QF-LICENSE-BLOCK:END -->

# QuantFunc

<div align="center" style="margin-top: 50px;">
<img src="assets/logo.webp" width="300" alt="Logo">
</div>

<p align="center">
🤗 <a href="https://huggingface.co/QuantFunc">Hugging Face</a> |
🤖 <a href="https://www.modelscope.cn/profile/QuantFunc">ModelScope</a> |
💻 <a href="https://github.com/RealJonathanYip/ComfyUI-QuantFunc">GitHub</a> |
💬 <a href="#wechat">WeChat (微信)</a> |
🎮 <a href="https://discord.gg/jCp9TpFWcn">Discord</a>
</p>

> ⚡ **FLUX.2 Klein 9B: the highest-quality Klein tier, pre-quantized.** Text-to-image and reference-based editing with a **2x to 11x** speedup using the QuantFunc plugin.

The larger **9B** Klein model for maximum fidelity, shipped as **distilled (4-step)** + **base (28-step)** transformers across **three GPU tiers** (`50x` FP4 · `40x` INT4+FP8 · `30x-below` INT4+INT8).

**Powered by the [QuantFunc ComfyUI plugin](https://github.com/RealJonathanYip/ComfyUI-QuantFunc), the fastest diffusion inference engine:**

- 🚀 **2x to 11x speedup** over standard BF16/FP16 Python pipelines (pre-exported, so loading is even faster).
- ⚙️ **Native C++/CUDA** (`libquantfunc.so` / `quantfunc.dll`) with **zero Python model dependencies**.
- 🧩 **Dual engine** (SVDQ offline + Lighting runtime 4-bit), **zero-cost LoRA stacking**, reference-image editing & inpainting.
- 🟢 **Full GPU coverage**: RTX 20/30/40/50 · A100/H100/H200/B100/B200/GB300 · RTX 6000 Ada / PRO Blackwell (CUDA 12 & 13); native **FP4** on Blackwell.

👉 **Install the plugin:** **https://github.com/RealJonathanYip/ComfyUI-QuantFunc**

# Klein-9B-Series

Pre-quantized **FLUX.2 Klein 9B** model series by [QuantFunc](https://github.com/RealJonathanYip), Lighting backend. Text-to-image and reference-based image editing.

> ✨ **Both the distilled AND the non-distilled (base) model are supported**, and the series ships **three GPU tiers** so every card gets the best path it can run:
> **`50x`** (Blackwell, FP4) · **`40x`** (RTX 40 / Ada & Hopper, INT4 + FP8) · **`30x-below`** (RTX 30 and below, INT4 + INT8).

## Overview

FLUX.2 Klein belongs to Black Forest Labs' FLUX.2 family. This repository covers the **9B** variant, the larger and higher-quality one (transformer K=4096). QuantFunc ships two pre-quantized transformers:

- **Distilled** transformer: 4-step, fastest few-step generation/editing.
- **Base / non-distilled** transformer: the full 28-step model with classical CFG (`--guidance-scale 4.0`), highest quality.

Each comes in three hardware tiers (below). Within a tier, distilled and base **share the same base-model** directory (text encoder, VAE, tokenizer, scheduler); only the transformer file differs.

## Hardware tiers (pick by GPU)

FP4 needs Blackwell (SM100 on B100/B200/GB200, SM120 on RTX 50 and RTX PRO Blackwell). FP8 needs Ada (SM89) or Hopper (SM90), e.g. RTX 40 / L40 / H100 / H200. INT4/INT8 run everywhere, including Ampere and Turing (e.g. RTX 30/20, A100). So:

| Tier | GPUs | attention + FFN | modulation/embedders/head | base-model |
|------|------|-----------------|---------------------------|-----------|
| **`50x`** | **Blackwell (SM100/SM120)**: RTX 50 series, B100/B200/GB200, RTX PRO Blackwell | **FP4** | **FP8** | `klein-9b-series-50x-above-base-model` (FP4 text encoder) |
| **`40x`** | **RTX 40 / Ada (SM89) & Hopper (SM90)**: RTX 40 series, L40/L40S, **H100, H200** | **INT4** | **FP8** | `klein-9b-series-50x-below-base-model` (INT4 text encoder) |
| **`30x-below`** | **RTX 30 and below (pre-FP8)**: RTX 30/20, A100, A40, T4, down to RTX 2080 | **INT4** | **INT8** | `klein-9b-series-50x-below-base-model` (INT4 text encoder) |

> `40x` and `30x-below` **share** the same INT4 base-model; they differ only in the transformer's 8-bit precision (FP8 vs INT8). `50x` uses the FP4 base-model.

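
The tier rule above can be sketched as a small lookup on the CUDA compute capability. This is an illustration rather than part of QuantFunc: `Tier` and `pick_tier` are hypothetical names, and the thresholds are assumptions (Blackwell is taken as compute capability 10.0 and above, since datacenter Blackwell parts report 10.x and RTX 50 reports 12.0; FP8 is taken as 8.9 and above). Only the tier labels and directory names come from the table.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    name: str        # tier label used in the transformer file names
    base_model: str  # directory with text encoder, VAE, tokenizer, scheduler


# Directory names copied from the hardware-tier table.
TIER_50X = Tier("50x", "klein-9b-series-50x-above-base-model")
TIER_40X = Tier("40x", "klein-9b-series-50x-below-base-model")
TIER_30X = Tier("30x-below", "klein-9b-series-50x-below-base-model")


def pick_tier(major: int, minor: int) -> Tier:
    """Map a CUDA compute capability to a tier (hypothetical helper).

    The thresholds are assumptions derived from the table, not QuantFunc code.
    """
    cc = (major, minor)
    if cc >= (10, 0):  # Blackwell: native FP4
        return TIER_50X
    if cc >= (8, 9):   # Ada (8.9) and Hopper (9.0): FP8 available
        return TIER_40X
    return TIER_30X    # Ampere, Turing: INT4 + INT8


if __name__ == "__main__":
    for cc in [(12, 0), (9, 0), (8, 9), (8, 6), (7, 5)]:
        print(cc, pick_tier(*cc).name)
```

On a machine with PyTorch installed, `torch.cuda.get_device_capability()` returns the `(major, minor)` pair to pass in.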
## Directory Structure

```
Klein-9B-Series/
├── klein-9b-series-50x-above-base-model/   # FP4 text encoder + VAE(enc+dec) + tokenizer + scheduler (50x)
├── klein-9b-series-50x-below-base-model/   # INT4 text encoder + VAE(enc+dec) + tokenizer + scheduler (40x & 30x-below)
├── transformer/
│   ├── config.json
│   ├── klein-9b-50x-lighting.safetensors              # distilled, FP4 (50x)
│   ├── klein-9b-base-50x-lighting.safetensors         # base 28-step, FP4 (50x)
│   ├── klein-9b-40x-lighting.safetensors              # distilled, INT4 + FP8 (40x)
│   ├── klein-9b-base-40x-lighting.safetensors         # base 28-step, INT4 + FP8 (40x)
│   ├── klein-9b-30x-below-lighting.safetensors        # distilled, INT4 + INT8 (30x-below)
│   └── klein-9b-base-30x-below-lighting.safetensors   # base 28-step, INT4 + INT8 (30x-below)
└── precision-config/
    ├── 50x-fp4-f8-sample.json
    ├── 40x-int4-f8-sample.json
    └── 30x-below-int4-i8-sample.json
```

> **Status:** ✅ All weights uploaded; the VAE includes **both encoder and decoder**. Every tier × {distilled, base} combination has been visually validated to generate correctly.

## Distilled (4-step) vs Base (28-step)

| Transformer | Source | Steps | Guidance | Best for |
|---|---|---|---|---|
| `klein-9b-<tier>-lighting.safetensors` | Klein **distilled** | 4 | none (guidance-distilled) | Fastest |
| `klein-9b-base-<tier>-lighting.safetensors` | Klein **base** | 28 | `--guidance-scale 4.0` (classical CFG) | Highest quality |

## Inference

```bash
# 50x: Blackwell (RTX 50 / B-series). Distilled, 4-step:
quantfunc --model-dir klein-9b-series-50x-above-base-model \
  --transformer transformer/klein-9b-50x-lighting.safetensors \
  --model-backend lighting --auto-optimize --steps 4 \
  --prompt "a cute cat on a windowsill, watercolor style" --output out.png

# 40x: RTX 40 / Ada or Hopper (H100/H200). Base 28-step (classical CFG):
quantfunc --model-dir klein-9b-series-50x-below-base-model \
  --transformer transformer/klein-9b-base-40x-lighting.safetensors \
  --model-backend lighting --auto-optimize --steps 28 --guidance-scale 4.0 \
  --prompt "a cute cat on a windowsill, watercolor style" --output out.png

# 30x-below: RTX 30 and below. Distilled, 4-step:
quantfunc --model-dir klein-9b-series-50x-below-base-model \
  --transformer transformer/klein-9b-30x-below-lighting.safetensors \
  --model-backend lighting --auto-optimize --steps 4 \
  --prompt "a cute cat on a windowsill, watercolor style" --output out.png
```

`--auto-optimize` picks the VRAM/attention/compression strategy for your GPU. The ComfyUI Lighting plugin auto-selects the matching tier + precision-config.

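
For scripted runs, the three commands above follow one pattern, which could be assembled as in this minimal sketch. `build_command` is a hypothetical helper, not part of the QuantFunc tooling; it uses only the flags, file names, and directory names that appear in this README and assumes the `quantfunc` binary is on `PATH`.

```python
import shlex

# Base-model directory per tier, copied from the hardware-tier table.
BASE_MODEL_DIR = {
    "50x": "klein-9b-series-50x-above-base-model",
    "40x": "klein-9b-series-50x-below-base-model",
    "30x-below": "klein-9b-series-50x-below-base-model",
}


def build_command(tier: str, prompt: str, base: bool = False,
                  output: str = "out.png") -> list:
    """Return the argument list for one quantfunc run (hypothetical helper)."""
    if tier not in BASE_MODEL_DIR:
        raise ValueError(f"unknown tier: {tier!r}")
    variant = "base-" if base else ""
    cmd = [
        "quantfunc",
        "--model-dir", BASE_MODEL_DIR[tier],
        "--transformer", f"transformer/klein-9b-{variant}{tier}-lighting.safetensors",
        "--model-backend", "lighting",
        "--auto-optimize",
        "--steps", "28" if base else "4",
    ]
    if base:
        # Only the base (non-distilled) model uses classical CFG.
        cmd += ["--guidance-scale", "4.0"]
    cmd += ["--prompt", prompt, "--output", output]
    return cmd


if __name__ == "__main__":
    args = build_command("40x", "a cute cat on a windowsill, watercolor style", base=True)
    print(shlex.join(args))  # shell-quoted; pass `args` to subprocess.run to execute
```

Returning a list rather than a single string keeps the prompt as one argument regardless of the quotes or spaces it contains.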
## Precision Config (`precision-config/`)

| File | Tier / GPU | attention + FFN | islands (modulation/embedders/head) |
|------|-----------|---------------|---------|
| `50x-fp4-f8-sample.json` | 50x: Blackwell (SM100/SM120) | FP4 | FP8 |
| `40x-int4-f8-sample.json` | 40x: Ada (SM89) & Hopper (SM90), i.e. RTX 40, L40, H100, H200 | INT4 | FP8 |
| `30x-below-int4-i8-sample.json` | 30x-below: RTX 30/20, A100 (pre-FP8) | INT4 | INT8 |

These per-layer configs control the Lighting backend's quantization precision. Customize them to set your own speed/quality trade-off.

## Related Repositories

- [QuantFunc/Klein-4B-Series](https://huggingface.co/QuantFunc/Klein-4B-Series): FLUX.2 Klein 4B
- [QuantFunc/Qwen-Image-Series](https://huggingface.co/QuantFunc/Qwen-Image-Series) · [QuantFunc/Qwen-Image-Edit-Series](https://huggingface.co/QuantFunc/Qwen-Image-Edit-Series) · [QuantFunc/Z-Image-Series](https://huggingface.co/QuantFunc/Z-Image-Series)

## License

The pre-quantized weights are derived from FLUX.2 Klein. Users must comply with the original Black Forest Labs FLUX.2 license. The QuantFunc inference engine and plugins are licensed separately.

## Community

Join our community for support, updates, and discussions:

- 🎮 [Discord server](https://discord.gg/jCp9TpFWcn)
- 💬 Scan the QR code below to join our WeChat group:

<div align="center" id="wechat">
<img src="https://raw.githubusercontent.com/RealJonathanYip/ComfyUI-QuantFunc/main/assets/WeChat.jpg" alt="WeChat Group" width="300">
</div>