Z-Image-Turbo (FP8 E5M2 & E4M3FN)

This is a quantization of Tongyi-MAI/Z-Image-Turbo to FP8 E5M2 and FP8 E4M3FN.

License & Usage: This model strictly follows the original licensing terms and usage restrictions. Please refer to the original model card for details.

Files in this repo

Full Diffusers pipeline, copied from a cached snapshot of Tongyi-MAI/Z-Image, with the transformer weights replaced by FP8 versions.
Default transformer weights: transformer/diffusion_pytorch_model.safetensors (E4M3FN).
Alternate transformer weights: transformer/diffusion_pytorch_model_e5m2.safetensors (E5M2).
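
The two variants trade precision for range. As a rough illustration (not part of this repo), the pure-Python sketch below decodes a single FP8 byte under each layout, using the commonly cited definitions: E4M3FN has 4 exponent bits, 3 mantissa bits, bias 7 and no infinities; E5M2 has 5 exponent bits, 2 mantissa bits, bias 15 and IEEE-style infinities and NaNs. The function names are made up for this example.

```python
import math


def decode_fp8(byte, exp_bits, man_bits, bias, ieee_specials):
    """Decode one FP8 byte (0..255) into a Python float."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1

    if ieee_specials and exp == exp_max:
        # E5M2: the all-ones exponent encodes infinity or NaN.
        return sign * math.inf if man == 0 else math.nan
    if not ieee_specials and exp == exp_max and man == man_max:
        # E4M3FN: only the all-ones bit pattern is NaN; there is no infinity.
        return math.nan
    if exp == 0:
        # Subnormal numbers: no implicit leading 1.
        return sign * (man / (1 << man_bits)) * 2.0 ** (1 - bias)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e4m3fn(byte):
    return decode_fp8(byte, exp_bits=4, man_bits=3, bias=7, ieee_specials=False)


def decode_e5m2(byte):
    return decode_fp8(byte, exp_bits=5, man_bits=2, bias=15, ieee_specials=True)


# Largest finite values: E5M2 reaches much further than E4M3FN.
print(decode_e4m3fn(0x7E))  # 448.0
print(decode_e5m2(0x7B))    # 57344.0

# First representable value above 1.0: E4M3FN has the finer step.
print(decode_e4m3fn(0x39))  # 1.125
print(decode_e5m2(0x3D))    # 1.25
```

In short, E4M3FN keeps one extra mantissa bit, while E5M2 covers a wider dynamic range.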

To switch variants, load the pipeline, then overwrite the transformer's weights with those from the alternate file in transformer/.
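
One way the swap could look is sketched below. This is not taken from the sample scripts: the helper names are hypothetical, and the sketch assumes the keys in the alternate safetensors file match those of the loaded transformer module and that the pipeline object exposes it as `pipe.transformer`.

```python
from pathlib import Path

# File names as listed under "Files in this repo".
TRANSFORMER_FILES = {
    "e4m3fn": "diffusion_pytorch_model.safetensors",
    "e5m2": "diffusion_pytorch_model_e5m2.safetensors",
}


def transformer_weights_path(repo_dir, variant):
    """Return the path of the transformer weights file for a variant."""
    try:
        filename = TRANSFORMER_FILES[variant.lower()]
    except KeyError:
        raise ValueError(f"unknown variant: {variant!r}") from None
    return Path(repo_dir) / "transformer" / filename


def swap_transformer_weights(pipe, repo_dir, variant):
    """Overwrite pipe.transformer's weights with the chosen variant (hypothetical helper)."""
    # Imported lazily so the path helper above works without safetensors/torch.
    from safetensors.torch import load_file

    state_dict = load_file(str(transformer_weights_path(repo_dir, variant)))
    # assign=True (PyTorch 2.1+) keeps the loaded tensors, and therefore their
    # FP8 dtype, instead of copying values into the existing parameters.
    pipe.transformer.load_state_dict(state_dict, assign=True)
    return pipe
```

With a pipeline already loaded from a local copy of this repo, calling `swap_transformer_weights(pipe, repo_dir, "e5m2")` would then select the E5M2 weights.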

Requirements

PyTorch with CUDA support (tested with 2.10.0+cu130)
Diffusers (latest main recommended)
For FP8 execution: NVIDIA Transformer Engine (TE) built for your CUDA + Python version

FP8 execution (Transformer Engine)

The sample script create-image.py uses NVIDIA Transformer Engine (TE) to run FP8 kernels on supported GPUs (e.g., Blackwell). Install TE in your environment and run the script from this repo directory.
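
Before running the FP8 script, it can help to check whether the environment is likely to support it. The sketch below is an assumption-laden pre-flight check, not part of create-image.py: it treats CUDA compute capability 8.9 (Ada) or higher as the threshold for FP8 tensor cores and only tests whether a `transformer_engine` package is importable. The function names are hypothetical.

```python
import importlib.util

# Assumed threshold: FP8 tensor cores start at compute capability 8.9 (Ada);
# Hopper is 9.0 and Blackwell is 10.0 and above.
FP8_MIN_CAPABILITY = (8, 9)


def can_run_fp8(capability, te_installed):
    """Decide from a (major, minor) capability and a TE flag whether FP8 is plausible."""
    if capability is None or not te_installed:
        return False
    return tuple(capability) >= FP8_MIN_CAPABILITY


def detect_fp8_support():
    """Probe the current environment; returns False if torch or a GPU is missing."""
    te_installed = importlib.util.find_spec("transformer_engine") is not None
    try:
        import torch
    except ImportError:
        return False
    capability = torch.cuda.get_device_capability() if torch.cuda.is_available() else None
    return can_run_fp8(capability, te_installed)


if __name__ == "__main__":
    print("FP8 path looks usable:", detect_fp8_support())
```

A `False` result suggests using the BF16 script described in the next section instead.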

BF16 fallback (no FP8 kernels)

For GPUs without FP8 kernel support (or when TE is unavailable), use create-image-bf16.py. It loads the same FP8 weights but casts them to BF16 for compute, so it does not depend on FP8 kernels; expect it to run slower than true FP8 execution.

Usage

Once downloaded, the scripts load MODEL_ID=ykarout/Z-Image-Turbo-FP8-Full by default.
To force loading from local files instead, set USE_LOCAL=1.
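
How the scripts read these variables is not shown here. The sketch below is one plausible resolution order, written as a hypothetical helper: it assumes USE_LOCAL=1 means "load from a local directory" and that the local directory is the current repo checkout, neither of which is confirmed by the scripts themselves.

```python
import os

DEFAULT_MODEL_ID = "ykarout/Z-Image-Turbo-FP8-Full"


def resolve_model_source(env=None, local_dir="."):
    """Pick where to load the pipeline from, based on environment variables."""
    env = os.environ if env is None else env
    if env.get("USE_LOCAL") == "1":
        # Assumption: local loading uses the directory the script is run from.
        return local_dir
    return env.get("MODEL_ID", DEFAULT_MODEL_ID)


print(resolve_model_source({}))                    # ykarout/Z-Image-Turbo-FP8-Full
print(resolve_model_source({"USE_LOCAL": "1"}))    # .
```

Under these assumptions, exporting USE_LOCAL=1 before running a script from the repo directory would bypass the Hub lookup.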
