# FLUX.2-dev ModelOpt FP8 Transformer for SGLang
This repository contains an SGLang-ready FP8 transformer override converted from a ModelOpt diffusers FP8 export.
Scope:
- base model: black-forest-labs/FLUX.2-dev
- quantized component: transformer
- intended usage: SGLang `--transformer-path`
Example:

```shell
sglang generate \
  --model-path black-forest-labs/FLUX.2-dev \
  --transformer-path BBuf/flux2-dev-modelopt-fp8-sglang-transformer \
  --prompt "A futuristic cyberpunk city at night, neon lights reflecting on wet streets" \
  --width 1024 --height 1024 --seed 42 \
  --text-encoder-cpu-offload true \
  --vae-cpu-offload false \
  --dit-cpu-offload false \
  --dit-layerwise-offload false \
  --save-output
```
Validation notes:
- validated on the H100 (Hopper) path
- static per-tensor FP8 E4M3 semantics, via ModelOpt metadata plus the SGLang conversion
- 2x H100 (TP=2) benchmark: roughly 28.4% total speedup and 30.5% denoise speedup versus BF16
- reduced-step smoke trajectory validation: final-step latent cosine similarity of about 0.9971
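The two quantization-related notes above can be illustrated with a small, dependency-free Python sketch. This is not the ModelOpt or SGLang implementation; the helper names and sample values are illustrative. It shows what static per-tensor FP8 E4M3 means (one scale shared by the whole tensor, values rounded onto the E4M3 grid with saturation at ±448) and uses cosine similarity, the same kind of metric as the latent check, to compare the round-trip against the original values.

```python
import math

FP8_E4M3_MAX = 448.0        # largest finite float8_e4m3fn value
MIN_NORMAL = 2.0 ** -6      # smallest normal E4M3 magnitude
SUBNORMAL_STEP = 2.0 ** -9  # fixed spacing in the subnormal range

def round_to_e4m3(v: float) -> float:
    """Round one float to the nearest FP8 E4M3 value, saturating at +-448."""
    if v == 0.0:
        return 0.0
    sign = -1.0 if v < 0 else 1.0
    a = min(abs(v), FP8_E4M3_MAX)
    if a < MIN_NORMAL:  # subnormal range: uniform grid
        return sign * round(a / SUBNORMAL_STEP) * SUBNORMAL_STEP
    # normal range: 3 mantissa bits => 8 steps per binade
    step = 2.0 ** (math.floor(math.log2(a)) - 3)
    return sign * min(round(a / step) * step, FP8_E4M3_MAX)

def quantize_per_tensor(xs):
    """Static per-tensor quantization: one scale for the whole tensor."""
    amax = max(abs(x) for x in xs) or 1.0
    scale = amax / FP8_E4M3_MAX
    return [round_to_e4m3(x / scale) for x in xs], scale

def dequantize(qs, scale):
    return [q * scale for q in qs]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

xs = [0.013, -0.4, 1.7, -2.25, 0.0, 3.1]   # toy stand-in for a weight tensor
qs, scale = quantize_per_tensor(xs)
xhat = dequantize(qs, scale)
```

With only 3 mantissa bits, per-element relative error is a few percent, yet the round-trip stays highly aligned with the original tensor; the real validation applies the same idea at the level of denoising latents.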