---
license: other
library_name: diffusers
tags:
- text-to-image
- z-image
- diffusers
- quantized
- int8
- sdnq
- safetensors
pipeline_tag: text-to-image
---
# Tongyi-MAI/Z-Image-Turbo - Quantized (8-bit)
## Overview
This is a **quantized version** of [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo).
The transformer and VAE have been quantized to 8-bit using SDNQ; the text encoder is included unquantized (see Components). The original folder structure is preserved, so the model can be loaded the same way as the original.
## Architecture
- **Pipeline**: ZImagePipeline
- **Main component**: ZImageTransformer2DModel
- **Quantization**: 8-bit
## Usage
```python
import torch
from diffusers import (
    AutoencoderKL,
    FlowMatchEulerDiscreteScheduler,
    ZImagePipeline,
    ZImageTransformer2DModel,
)
from transformers import AutoTokenizer, Qwen3Model
from sdnq import load_sdnq_model

# Local folder containing this repository's files
model_path = "Tongyi-MAI_Z-Image-Turbo-int8"

# Load transformer with SDNQ (quantized to 8-bit)
transformer = load_sdnq_model(
    f"{model_path}/transformer",
    model_cls=ZImageTransformer2DModel,
    device="cpu",
)

# Load other components from this model (all included!)
vae = AutoencoderKL.from_pretrained(f"{model_path}/vae", torch_dtype=torch.float16)
text_encoder = Qwen3Model.from_pretrained(f"{model_path}/text_encoder", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(f"{model_path}/tokenizer")
scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(f"{model_path}/scheduler")

# Construct pipeline
pipe = ZImagePipeline(
    transformer=transformer,
    vae=vae,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    scheduler=scheduler,
)
pipe.to("cuda")

# Generate an image
image = pipe(
    prompt="A serene mountain landscape at sunrise",
    num_inference_steps=20,
).images[0]
image.save("output.png")
```
## Components
- ✅ **transformer** (ZImageTransformer2DModel) - Quantized to 8-bit
- ✅ **vae** (AutoencoderKL) - Quantized to 8-bit
**Note**: Some components are included unquantized due to SDNQ library limitations:
- 📦 **text_encoder** - Included unquantized (SDNQ bug workaround)
## Quantization Details
- **Original model**: [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo)
- **Quantization**: 8-bit
- **Quantizer**: SDNQ
- **Date**: 2026-01-18 13:41:50
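
To give a rough idea of what "8-bit" means for a weight tensor, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is a generic textbook scheme shown for illustration only; it is an assumption that SDNQ works along broadly similar lines, and its actual algorithm (grouping, scaling, rounding) may differ. The names `quantize_int8` and `dequantize_int8` are hypothetical and not part of the `sdnq` API.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integer codes in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any positive scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


# Toy "weights" (illustrative values, not taken from the model)
weights = [0.12, -0.5, 0.33, 1.27, -1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# The round-trip error per value is bounded by about half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("codes:", codes)
print("max round-trip error:", max_error)
```

Each value is stored as one signed byte plus a shared scale, which is where the size reduction comes from; the rounding step is the source of the (small) quality loss.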
## Size Reduction
- Original: ~30 GB (estimated)
- Quantized: See individual component sizes
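
Since the quantized sizes are not listed here, one way to check them is to sum the file sizes of each component folder after downloading. The following is a small sketch using only the standard library; it assumes the model was downloaded to the same local path used in the Usage section, and the helper names `component_sizes` and `format_gib` are made up for this example.

```python
from pathlib import Path


def component_sizes(model_dir):
    """Return {subfolder name: total bytes} for each immediate subfolder of model_dir."""
    sizes = {}
    for sub in sorted(Path(model_dir).iterdir()):
        if sub.is_dir():
            # Sum every regular file below the component folder.
            sizes[sub.name] = sum(f.stat().st_size for f in sub.rglob("*") if f.is_file())
    return sizes


def format_gib(num_bytes):
    """Format a byte count as GiB with two decimals."""
    return f"{num_bytes / 1024**3:.2f} GiB"


# Assumed local path (same as in the Usage section); skipped if it does not exist.
model_path = Path("Tongyi-MAI_Z-Image-Turbo-int8")
if model_path.is_dir():
    for name, size in component_sizes(model_path).items():
        print(f"{name}: {format_gib(size)}")
```

Running the same function on a local copy of the original model gives a like-for-like comparison per component.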
## Notes
- All pipeline components are included, so this folder is a drop-in replacement for the original model directory
- Requires the `sdnq` library: `pip install sdnq`
- 8-bit SDNQ quantization reduces model size with minimal expected quality loss
- The text encoder is included unquantized due to SDNQ library limitations (see Components)
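
Before running the Usage example, it can help to confirm that the required packages are importable. This is a minimal sketch using the standard library; the package list is taken from the imports in the Usage section, and `missing_packages` is a hypothetical helper name. Note that it only checks for presence, not for a `diffusers` version recent enough to ship `ZImagePipeline`.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level packages imported by the Usage example
required = ["torch", "diffusers", "transformers", "sdnq"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```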
---
Quantized with [BugQuant](https://github.com/yourusername/BugQuant)