FLUX.2-dev ModelOpt FP8 Transformer for SGLang

This repository contains an SGLang-ready FP8 transformer override converted from a ModelOpt diffusers FP8 export.

Scope:

  • base model: black-forest-labs/FLUX.2-dev
  • quantized component: transformer
  • intended usage: SGLang --transformer-path

Example:

sglang generate \
    --model-path black-forest-labs/FLUX.2-dev \
    --transformer-path BBuf/flux2-dev-modelopt-fp8-sglang-transformer \
    --prompt "A futuristic cyberpunk city at night, neon lights reflecting on wet streets" \
    --width 1024 --height 1024 --seed 42 \
    --text-encoder-cpu-offload true \
    --vae-cpu-offload false \
    --dit-cpu-offload false \
    --dit-layerwise-offload false \
    --save-output

Validation notes:

  • validated on the NVIDIA H100 (Hopper) path
  • static per-tensor FP8 E4M3 semantics, derived from ModelOpt metadata via SGLang conversion
  • 2x H100 TP2 benchmark: about 28.4% total speedup and 30.5% denoise-loop speedup versus BF16
  • reduced-step smoke trajectory validation: final-step latent cosine similarity about 0.9971
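To make the two notions above concrete — static per-tensor FP8 E4M3 quantization and the latent cosine-similarity check — here is a minimal pure-Python sketch. It is an illustration only: the helper names and the sample weights are hypothetical, and the real pipeline runs hardware FP8 kernels via ModelOpt/SGLang rather than this software rounding. It models the E4M3 (fn) format with a maximum finite value of 448, a single scale per tensor (amax / 448), and a cosine metric between the original and dequantized values.

```python
import math

# Hypothetical illustration of static per-tensor FP8 E4M3 semantics.
E4M3_MAX = 448.0  # largest finite value in FP8 E4M3 (fn variant)

def round_e4m3(x: float) -> float:
    """Round a float to the nearest representable FP8 E4M3 value."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    a = min(abs(x), E4M3_MAX)            # saturate to the FP8 range
    if a < 2.0 ** -6:                    # subnormal range: fixed step 2^-9
        return sign * round(a / 2.0 ** -9) * 2.0 ** -9
    _, e = math.frexp(a)                 # a = m * 2**e with 0.5 <= m < 1
    step = 2.0 ** (e - 1 - 3)            # 3 mantissa bits per exponent bucket
    return sign * min(round(a / step) * step, E4M3_MAX)

def quantize_dequantize(weights):
    """Static per-tensor scheme: one scale shared by the whole tensor."""
    scale = max(abs(w) for w in weights) / E4M3_MAX
    return [round_e4m3(w / scale) * scale for w in weights]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Hypothetical sample weights, standing in for a transformer tensor.
weights = [0.013, -0.27, 1.41, -0.008, 0.66, -1.92, 0.33, 0.05]
recovered = quantize_dequantize(weights)
print(f"cosine(original, FP8-dequantized) = {cosine(weights, recovered):.4f}")
```

The same cosine metric, applied to the final-step denoising latents of the FP8 and BF16 runs, is what the ~0.9971 figure above reports.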