---
license: other
library_name: diffusers
tags:
- text-to-image
- z-image
- diffusers
- quantized
- int8
- sdnq
- safetensors
pipeline_tag: text-to-image
---

# Tongyi-MAI/Z-Image-Turbo - Quantized (8-bit)

## Overview

This is a **quantized version** of [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo). The transformer and VAE are quantized to 8-bit using SDNQ; the text encoder is included unquantized. The original folder structure is preserved, so each component loads from its usual subfolder.

## Architecture

- **Pipeline**: ZImagePipeline
- **Main component**: ZImageTransformer2DModel
- **Quantization**: 8-bit

## Usage

```python
import torch
from diffusers import (
    ZImagePipeline,
    ZImageTransformer2DModel,
    AutoencoderKL,
    FlowMatchEulerDiscreteScheduler,
)
from transformers import Qwen3Model, AutoTokenizer
from sdnq import load_sdnq_model

# Local folder containing this repository's files
model_path = "Tongyi-MAI_Z-Image-Turbo-int8"

# Load transformer with SDNQ (quantized to 8-bit)
transformer = load_sdnq_model(
    f"{model_path}/transformer",
    model_cls=ZImageTransformer2DModel,
    device="cpu",
)

# Load other components from this model (all included!)
vae = AutoencoderKL.from_pretrained(f"{model_path}/vae", torch_dtype=torch.float16)
text_encoder = Qwen3Model.from_pretrained(f"{model_path}/text_encoder", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(f"{model_path}/tokenizer")
scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(f"{model_path}/scheduler")

# Construct pipeline
pipe = ZImagePipeline(
    transformer=transformer,
    vae=vae,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    scheduler=scheduler,
)
pipe.to("cuda")

# Generate an image
image = pipe(
    prompt="A serene mountain landscape at sunrise",
    num_inference_steps=20,
).images[0]
image.save("output.png")
```

## Components

- ✅ **transformer** (ZImageTransformer2DModel) - Quantized to 8-bit
- ✅ **vae** (AutoencoderKL) - Quantized to 8-bit

**Note**: Some components are included unquantized due to SDNQ library limitations:

- 📦 **text_encoder** - Included unquantized (SDNQ bug workaround)

## Quantization Details

- **Original model**: [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo)
- **Quantization**: 8-bit
- **Quantizer**: SDNQ
- **Date**: 2026-01-18 13:41:50

## Size Reduction

- Original: ~30GB (estimated)
- Quantized: See individual component sizes

## Notes

- This is a complete drop-in replacement: all components are included
- Requires the `sdnq` library to be installed: `pip install sdnq`
- 8-bit SDNQ quantization reduces model size with minimal quality loss
- The text encoder is included unquantized due to SDNQ library limitations (see Components)

---

Quantized with [BugQuant](https://github.com/yourusername/BugQuant)
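
The Size Reduction section defers to the individual component sizes. As a minimal sketch, the on-disk size of each component can be measured from a local copy of this repository. This assumes the local folder name from the Usage section and one subfolder per component; `component_sizes` and `format_gb` are hypothetical helpers written for illustration, not part of `sdnq` or `diffusers`.

```python
from pathlib import Path


def component_sizes(model_path):
    """Return {component_name: total_bytes} for each top-level subfolder."""
    root = Path(model_path)
    sizes = {}
    for component in sorted(p for p in root.iterdir() if p.is_dir()):
        # Sum every regular file under the component folder, recursively.
        sizes[component.name] = sum(
            f.stat().st_size for f in component.rglob("*") if f.is_file()
        )
    return sizes


def format_gb(num_bytes):
    """Format a byte count as gigabytes (10**9 bytes) with two decimals."""
    return f"{num_bytes / 1e9:.2f} GB"


if __name__ == "__main__":
    # Assumed local folder name, taken from the Usage section.
    model_path = "Tongyi-MAI_Z-Image-Turbo-int8"
    if Path(model_path).is_dir():
        for name, size in component_sizes(model_path).items():
            print(f"{name}: {format_gb(size)}")
```

Running the same helper on a local copy of the original model gives a like-for-like comparison per component.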