# Quantized FLUX.2 Klein 4B Transformer (ModelOpt)
This repo stores NVIDIA Model Optimizer checkpoints for FLUX.2 Klein 4B transformer quantization variants.
## Contents
- `fp8/transformer_modelopt.pt`
- `fp8/transformer_modelopt_meta.json`
- `w8a8/transformer_modelopt.pt`
- `w8a8/transformer_modelopt_meta.json`
- `nvfp4/transformer_modelopt.pt`
- `nvfp4/transformer_modelopt_meta.json`
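Each variant directory pairs a ModelOpt state file (`transformer_modelopt.pt`) with its metadata (`transformer_modelopt_meta.json`). A minimal helper for picking the checkpoint path by variant name, as a sketch (the helper itself is hypothetical, only the paths come from the listing above):

```python
# Hypothetical helper: map a quantization variant name to its checkpoint path.
# Only the directory layout (fp8 / w8a8 / nvfp4) is taken from this repo.
VARIANTS = {"fp8", "w8a8", "nvfp4"}

def checkpoint_path(variant: str) -> str:
    """Return the ModelOpt checkpoint path for a given quantization variant."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; choose from {sorted(VARIANTS)}")
    return f"{variant}/transformer_modelopt.pt"
```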
## Restore into pipeline
```python
import torch
import modelopt.torch.opt as mto
from klein_pipeline import Flux2KleinPipeline

pipe = Flux2KleinPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-4B", torch_dtype=torch.bfloat16
).to("cuda")

ckpt = "fp8/transformer_modelopt.pt"  # or w8a8 / nvfp4
mto.restore(pipe.transformer, ckpt)
pipe.transformer.eval()
```
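After restoring, generation uses the usual diffusers-style pipeline call. A minimal sketch, assuming a standard diffusers `__call__` signature; the parameter values are illustrative assumptions, not tuned settings for Klein 4B:

```python
# Hedged sketch: build kwargs for a diffusers-style pipeline call.
# `num_inference_steps` / `guidance_scale` are common diffusers parameter
# names; the defaults here are assumptions, not recommended settings.
def generation_kwargs(prompt: str, steps: int = 28, guidance: float = 4.0) -> dict:
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance,
    }

# Usage (requires a GPU and the restored pipeline above):
# image = pipe(**generation_kwargs("a red fox in snow")).images[0]
# image.save("out.png")
```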
Uploaded from the Modal volume `klein4B-assets`.