---
license: other
library_name: diffusers
tags:
- text-to-image
- z-image
- diffusers
- quantized
- int8
- sdnq
- safetensors
pipeline_tag: text-to-image
---

# Tongyi-MAI/Z-Image-Turbo - Quantized (8-bit)

## Overview

This is a **quantized version** of [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo).

The transformer and VAE have been quantized to 8-bit using SDNQ, while the original folder structure is preserved for drop-in use. The text encoder is included unquantized (see Components below).


## Architecture

- **Pipeline**: ZImagePipeline
- **Main component**: ZImageTransformer2DModel
- **Quantization**: 8-bit

## Usage

```python
import torch
from diffusers import ZImagePipeline, ZImageTransformer2DModel, AutoencoderKL, FlowMatchEulerDiscreteScheduler
from transformers import Qwen3Model, AutoTokenizer
from sdnq import load_sdnq_model

model_path = "Tongyi-MAI_Z-Image-Turbo-int8"

# Load the transformer with SDNQ (quantized to 8-bit)
transformer = load_sdnq_model(
    f"{model_path}/transformer",
    model_cls=ZImageTransformer2DModel,
    device="cpu",
)

# Load the remaining components from this repository (all included)
vae = AutoencoderKL.from_pretrained(f"{model_path}/vae", torch_dtype=torch.float16)
text_encoder = Qwen3Model.from_pretrained(f"{model_path}/text_encoder", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(f"{model_path}/tokenizer")
scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(f"{model_path}/scheduler")

# Assemble the pipeline
pipe = ZImagePipeline(
    transformer=transformer,
    vae=vae,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    scheduler=scheduler,
)
pipe.to("cuda")

# Generate an image
image = pipe(
    prompt="A serene mountain landscape at sunrise",
    num_inference_steps=20,
).images[0]
image.save("output.png")
```

## Components

- ✅ **transformer** (ZImageTransformer2DModel) - Quantized to 8-bit
- ✅ **vae** (AutoencoderKL) - Quantized to 8-bit

**Note**: One component is included unquantized due to an SDNQ library limitation:

- 📦 **text_encoder** - Included unquantized (SDNQ bug workaround)

## Quantization Details

- **Original model**: [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo)
- **Quantization**: 8-bit
- **Quantizer**: SDNQ
- **Date**: 2026-01-18 13:41:50

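SDNQ's internal scheme is not documented here, but as a rough illustration of what 8-bit weight quantization does, the sketch below runs a symmetric per-tensor int8 round trip in NumPy. This is a generic illustration, not SDNQ's actual implementation:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ≈ q * scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct an approximation of the original float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 stores 1 byte per weight vs. 4 bytes for float32 (plus one scale),
# and the per-weight error is bounded by half a quantization step.
print("bytes fp32:", w.nbytes, "bytes int8:", q.nbytes)
print("max abs error:", np.abs(w - w_hat).max())
```

The 4x byte reduction per weight is why the transformer shrinks so much; the bounded rounding error is the intuition behind the "minimal quality loss" claim below.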
## Size Reduction

- Original: ~30 GB (estimated)
- Quantized: see individual component sizes in this repository

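To check the component sizes yourself after downloading, a small helper like the following can sum the files under each component folder (the folder names are assumed from the structure above; adjust `model_path` to your local checkout):

```python
from pathlib import Path

def dir_size_gb(path) -> float:
    """Total size of all files under `path`, in gigabytes."""
    return sum(f.stat().st_size for f in Path(path).rglob("*") if f.is_file()) / 1e9

model_path = "Tongyi-MAI_Z-Image-Turbo-int8"  # hypothetical local path
for component in ("transformer", "text_encoder", "vae"):
    folder = Path(model_path) / component
    if folder.is_dir():
        print(f"{component}: {dir_size_gb(folder):.2f} GB")
```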
## Notes

- This is a complete drop-in replacement: all components are included.
- Requires the `sdnq` library: `pip install sdnq`
- 8-bit SDNQ quantization substantially reduces model size with minimal quality loss.
- Some components may be included unquantized due to library limitations (see Components above).

---

Quantized with [BugQuant](https://github.com/yourusername/BugQuant)