Z-Image-Turbo (FP8 E5M2 & E4M3FN)
This is a quantization of Tongyi-MAI/Z-Image-Turbo to FP8 E5M2 and FP8 E4M3FN.
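The two formats trade range for precision. As a rough illustration (not part of this repo), the sketch below derives the largest finite value and the spacing just above 1.0 for each format from its bit layout. It assumes the usual conventions: E5M2 reserves its top exponent for infinity/NaN like IEEE formats, while E4M3FN has no infinities and reserves only the all-ones mantissa at the top exponent for NaN. The function name fp8_limits is hypothetical.

```python
def fp8_limits(exp_bits: int, man_bits: int, finite_only: bool) -> tuple[float, float]:
    """Return (largest finite value, spacing just above 1.0) for an FP8 layout."""
    bias = 2 ** (exp_bits - 1) - 1
    if finite_only:
        # "FN" layout: the top exponent is usable; only the all-ones mantissa is NaN.
        max_exp = (2 ** exp_bits - 1) - bias
        max_mantissa = 2.0 - 2.0 ** (1 - man_bits)
    else:
        # IEEE-like layout: the top exponent is reserved for infinity and NaN.
        max_exp = (2 ** exp_bits - 2) - bias
        max_mantissa = 2.0 - 2.0 ** (-man_bits)
    return max_mantissa * 2.0 ** max_exp, 2.0 ** (-man_bits)


e4m3fn = fp8_limits(4, 3, finite_only=True)
e5m2 = fp8_limits(5, 2, finite_only=False)
print(e4m3fn)  # (448.0, 0.125)
print(e5m2)    # (57344.0, 0.25)
```

E4M3FN resolves values twice as finely, while E5M2 covers a much wider range.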
License & Usage: This model strictly follows the original licensing terms and usage restrictions. Please refer to the original model card for details.
Files in this repo
- Full Diffusers pipeline copied from Tongyi-MAI/Z-Image (cached snapshot) with FP8 transformer weights.
- Default transformer weights: transformer/diffusion_pytorch_model.safetensors (E4M3FN).
- Alternate transformer weights: transformer/diffusion_pytorch_model_e5m2.safetensors (E5M2).
To switch variants, load the pipeline and replace the transformer weights from the alternate file in transformer/.
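One way to perform that swap is sketched below. This is an assumption about how the replacement could be done, not code taken from this repo: the helper names and variant labels are hypothetical, the pipeline is assumed to have been loaded already with Diffusers, and the alternate file is assumed to use the same parameter names as the default one. assign=True (PyTorch 2.1 or newer) is used so the loaded tensors keep their stored dtype instead of being copied into the existing parameters.

```python
from pathlib import Path

# File names as listed in this repo; the variant keys are labels chosen for this sketch.
WEIGHT_FILES = {
    "e4m3fn": "diffusion_pytorch_model.safetensors",
    "e5m2": "diffusion_pytorch_model_e5m2.safetensors",
}


def transformer_weights_path(repo_dir, variant):
    """Return the path of the transformer weights file for the given variant."""
    key = variant.lower()
    if key not in WEIGHT_FILES:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(WEIGHT_FILES)}")
    return Path(repo_dir) / "transformer" / WEIGHT_FILES[key]


def swap_transformer_weights(pipe, repo_dir, variant):
    """Load the chosen variant's weights into pipe.transformer in place."""
    # Imported lazily so the path helper works without safetensors installed.
    from safetensors.torch import load_file

    state_dict = load_file(str(transformer_weights_path(repo_dir, variant)))
    # assign=True keeps each loaded tensor's dtype (needs PyTorch >= 2.1).
    pipe.transformer.load_state_dict(state_dict, assign=True)
    return pipe
```

load_file returns CPU tensors by default, so the pipeline would need to be moved to the target device again after the swap.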
Requirements
- PyTorch with CUDA support (tested with 2.10.0+cu130)
- Diffusers (latest main recommended)
- For FP8 execution: NVIDIA Transformer Engine (TE) built for your CUDA + Python version
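A quick way to see which of these packages are importable in the current environment is sketched below. It uses only the standard library; torch, diffusers, and transformer_engine are assumed to be the import names of the packages listed, and the helper missing_packages is hypothetical. It checks presence only, not versions or CUDA builds.

```python
import importlib.util

REQUIRED = ["torch", "diffusers"]
FP8_ONLY = ["transformer_engine"]


def missing_packages(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing (required):", missing_packages(REQUIRED))
    print("missing (FP8 path only):", missing_packages(FP8_ONLY))
```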
FP8 execution (Transformer Engine)
The sample script create-image.py uses NVIDIA Transformer Engine (TE) to run FP8 kernels on supported GPUs (e.g., Blackwell).
Install TE in your environment and run the script from this repo directory.
BF16 fallback (no FP8 kernels)
For GPUs without FP8 kernel support (or if TE is unavailable), use create-image-bf16.py. It loads the same FP8 weights
but casts them to BF16 for compute, so it does not depend on FP8 kernels, at the cost of lower speed than native FP8 execution.
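The choice between the two scripts can be automated. The sketch below assumes that FP8 kernels need CUDA compute capability 8.9 or higher (Ada, Hopper, Blackwell) and that Transformer Engine is importable as transformer_engine. The helper names are hypothetical, and at runtime the capability tuple would come from torch.cuda.get_device_capability().

```python
import importlib.util

FP8_MIN_CAPABILITY = (8, 9)  # assumed minimum compute capability for FP8 kernels


def pick_script(capability, te_available):
    """Choose the sample script from a (major, minor) capability and TE availability."""
    if te_available and capability is not None and tuple(capability) >= FP8_MIN_CAPABILITY:
        return "create-image.py"
    return "create-image-bf16.py"


def detect_and_pick():
    """Query the local machine; falls back to BF16 if torch or CUDA is missing."""
    te_available = importlib.util.find_spec("transformer_engine") is not None
    capability = None
    if importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            capability = torch.cuda.get_device_capability()
    return pick_script(capability, te_available)


print(pick_script((12, 0), te_available=True))   # create-image.py
print(pick_script((8, 6), te_available=True))    # create-image-bf16.py
print(pick_script((9, 0), te_available=False))   # create-image-bf16.py
```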
Usage
- After downloading, the scripts default to MODEL_ID=ykarout/Z-Image-Turbo-FP8-Full.
- To force local loading, set USE_LOCAL=1.
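How the scripts read these variables is not shown here; a minimal sketch of one plausible interpretation follows. It assumes that USE_LOCAL=1 corresponds to the local_files_only=True option of Diffusers' from_pretrained, and the helper resolve_model_source is hypothetical.

```python
import os

DEFAULT_MODEL_ID = "ykarout/Z-Image-Turbo-FP8-Full"


def resolve_model_source(env=None):
    """Return (model_id, from_pretrained kwargs) derived from environment variables."""
    env = os.environ if env is None else env
    model_id = env.get("MODEL_ID") or DEFAULT_MODEL_ID
    # Assumption: only the literal value "1" enables local-only loading.
    local_only = env.get("USE_LOCAL", "0") == "1"
    return model_id, {"local_files_only": local_only}


print(resolve_model_source({}))
# ('ykarout/Z-Image-Turbo-FP8-Full', {'local_files_only': False})
print(resolve_model_source({"USE_LOCAL": "1"}))
# ('ykarout/Z-Image-Turbo-FP8-Full', {'local_files_only': True})
```

The result could then be passed on as DiffusionPipeline.from_pretrained(model_id, **kwargs).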