Quantized FLUX.2 Klein 4B Transformer (ModelOpt)

This repository stores NVIDIA TensorRT Model Optimizer (ModelOpt) checkpoints for quantized variants of the FLUX.2 Klein 4B transformer.

Contents

  • fp8/transformer_modelopt.pt
  • fp8/transformer_modelopt_meta.json
  • w8a8/transformer_modelopt.pt
  • w8a8/transformer_modelopt_meta.json
  • nvfp4/transformer_modelopt.pt
  • nvfp4/transformer_modelopt_meta.json
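Each variant lives in its own subdirectory with the same file layout, so the checkpoint path can be resolved from a variant name. A minimal sketch (the `variant_ckpt` helper and its error message are illustrative, not part of this repo):

```python
from pathlib import Path

# The three quantization variants shipped in this repository.
VARIANTS = ("fp8", "w8a8", "nvfp4")

def variant_ckpt(variant: str, root: str = ".") -> Path:
    """Return the ModelOpt checkpoint path for a quantization variant."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; choose one of {VARIANTS}")
    return Path(root) / variant / "transformer_modelopt.pt"

print(variant_ckpt("fp8"))  # → fp8/transformer_modelopt.pt (on POSIX)
```

The matching `*_meta.json` sidecar sits next to each `.pt` file in the same directory.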

Restore into pipeline

import torch
import modelopt.torch.opt as mto
from klein_pipeline import Flux2KleinPipeline

# Load the base pipeline in bfloat16, then restore the quantized
# ModelOpt state into its transformer in place.
pipe = Flux2KleinPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-4B", torch_dtype=torch.bfloat16
).to("cuda")

ckpt = "fp8/transformer_modelopt.pt"  # or w8a8/... or nvfp4/...
mto.restore(pipe.transformer, ckpt)
pipe.transformer.eval()
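Each checkpoint also ships with a `*_meta.json` sidecar; its exact schema is not documented here, so the sketch below only shows a generic way to inspect one. The sample payload and `load_meta` helper are invented for illustration — check the real file's keys yourself:

```python
import json
from pathlib import Path

def load_meta(path) -> dict:
    """Load a checkpoint's JSON sidecar and return it as a dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

# Illustrative round-trip with a made-up payload; the real sidecars
# live next to each checkpoint, e.g. fp8/transformer_modelopt_meta.json.
sample = Path("example_meta.json")
sample.write_text(json.dumps({"format": "fp8"}), encoding="utf-8")
meta = load_meta(sample)
print(sorted(meta.keys()))  # → ['format']
sample.unlink()
```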

Uploaded from Modal volume klein4B-assets.
