.gitattributes CHANGED
@@ -43,4 +43,3 @@ assets/showcase_editing.png filter=lfs diff=lfs merge=lfs -text
  assets/showcase_realistic.png filter=lfs diff=lfs merge=lfs -text
  assets/showcase_rendering.png filter=lfs diff=lfs merge=lfs -text
  assets/Z-Image-Gallery.pdf filter=lfs diff=lfs merge=lfs -text
- assets/leaderboard.png filter=lfs diff=lfs merge=lfs -text
 
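The net effect of this hunk is that `assets/leaderboard.png` no longer has a Git LFS rule. The following is a minimal sketch of how one might confirm that by comparing the two versions of the file. It assumes simple whitespace-separated attribute lines with no quoted patterns, and the helper name `lfs_patterns` is hypothetical, not part of any Git tooling.

```python
def lfs_patterns(gitattributes_text: str) -> list[str]:
    """Return the path patterns whose attribute list includes filter=lfs.

    Simplification: patterns containing spaces (which .gitattributes
    allows via quoting) are not handled here.
    """
    patterns = []
    for raw in gitattributes_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.append(pattern)
    return patterns


# The four lines of the old side of the hunk, and the three lines of the new side.
OLD = """\
assets/showcase_realistic.png filter=lfs diff=lfs merge=lfs -text
assets/showcase_rendering.png filter=lfs diff=lfs merge=lfs -text
assets/Z-Image-Gallery.pdf filter=lfs diff=lfs merge=lfs -text
assets/leaderboard.png filter=lfs diff=lfs merge=lfs -text
"""
NEW = """\
assets/showcase_realistic.png filter=lfs diff=lfs merge=lfs -text
assets/showcase_rendering.png filter=lfs diff=lfs merge=lfs -text
assets/Z-Image-Gallery.pdf filter=lfs diff=lfs merge=lfs -text
"""

new_patterns = set(lfs_patterns(NEW))
removed = [p for p in lfs_patterns(OLD) if p not in new_patterns]
print(removed)  # prints ['assets/leaderboard.png']
```

Only the lines visible in this hunk are modelled; the rest of `.gitattributes` is outside the diff and is not assumed here.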
README.md CHANGED
@@ -11,16 +11,15 @@ library_name: diffusers
11
 
12
  <div align="center">
13
 
14
- [![Official Site](https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage)](https://tongyi-mai.github.io/Z-Image-blog/)&#160;
15
  [![GitHub](https://img.shields.io/badge/GitHub-Z--Image-181717?logo=github&logoColor=white)](https://github.com/Tongyi-MAI/Z-Image)&#160;
16
  [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Checkpoint-Z--Image--Turbo-yellow)](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo)&#160;
17
  [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Online_Demo-Z--Image--Turbo-blue)](https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo)&#160;
18
- [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Mobile_Demo-Z--Image--Turbo-red)](https://huggingface.co/spaces/akhaliq/Z-Image-Turbo)&#160;
19
  [![ModelScope Model](https://img.shields.io/badge/🤖%20Checkpoint-Z--Image--Turbo-624aff)](https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo)&#160;
20
- [![ModelScope Space](https://img.shields.io/badge/🤖%20Online_Demo-Z--Image--Turbo-17c7a7)](https://www.modelscope.cn/aigc/imageGeneration?tab=advanced&versionId=469191&modelType=Checkpoint&sdVersion=Z_IMAGE_TURBO&modelUrl=modelscope%3A%2F%2FTongyi-MAI%2FZ-Image-Turbo%3Frevision%3Dmaster)&#160;
21
  [![Art Gallery PDF](https://img.shields.io/badge/%F0%9F%96%BC%20Art_Gallery-PDF-ff69b4)](assets/Z-Image-Gallery.pdf)&#160;
22
  [![Web Art Gallery](https://img.shields.io/badge/%F0%9F%8C%90%20Web_Art_Gallery-online-00bfff)](https://modelscope.cn/studios/Tongyi-MAI/Z-Image-Gallery/summary)&#160;
23
- <a href="https://arxiv.org/abs/2511.22699" target="_blank"><img src="https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv" height="21px"></a>
24
 
25
 
26
  Welcome to the official repository for the Z-Image(造相)project!
@@ -31,24 +30,21 @@ Welcome to the official repository for the Z-Image(造相)project!
31
 
32
  ## ✨ Z-Image
33
 
34
- Z-Image is a powerful and highly efficient image generation model family with **6B** parameters. Currently there are four variants:
35
 
36
  - 🚀 **Z-Image-Turbo** – A distilled version of Z-Image that matches or exceeds leading competitors with only **8 NFEs** (Number of Function Evaluations). It offers **⚡️sub-second inference latency⚡️** on enterprise-grade H800 GPUs and fits comfortably within **16G VRAM consumer devices**. It excels in photorealistic image generation, bilingual text rendering (English & Chinese), and robust instruction adherence.
37
 
38
- - 🎨 **Z-Image** – The foundation model behind Z-Image-Turbo. Z-Image focuses on **high-quality generation**, **rich aesthetics**, **strong diversity**, and **controllability**, well-suited for creative generation, **fine-tuning**, and downstream development. It supports a wide range of artistic styles, effective negative prompting, and high diversity across identities, poses, compositions, and layouts.
39
-
40
- - 🧱 **Z-Image-Omni-Base** – The versatile foundation model capable of both **generation and editing tasks**. By releasing this checkpoint, we aim to unlock the full potential for community-driven fine-tuning and custom development, providing the most "raw" and diverse starting point for the open-source community.
41
 
42
  - ✍️ **Z-Image-Edit** – A variant fine-tuned on Z-Image specifically for image editing tasks. It supports creative image-to-image generation with impressive instruction-following capabilities, allowing for precise edits based on natural language prompts.
43
 
44
  ### 📥 Model Zoo
45
 
46
- | Model | Pre-Training | SFT | RL | Step | CFG | Task | Visual Quality | Diversity | Fine-Tunability | Hugging Face | ModelScope |
47
- | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
48
- | **Z-Image-Omni-Base** | | | | 50 | | Gen. / Editing | Medium | High | Easy | *To be released* | *To be released* |
49
- | **Z-Image** | | | | 50 | ✅ | Gen. | High | Medium | Easy | [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Checkpoint%20-Z--Image-yellow)](https://huggingface.co/Tongyi-MAI/Z-Image) <br> [![Hugging Face Space](https://img.shields.io/badge/%F0%9F%A4%97%20Demo-Z--Image-blue)](https://huggingface.co/spaces/Tongyi-MAI/Z-Image) | [![ModelScope Model](https://img.shields.io/badge/🤖%20%20Checkpoint-Z--Image-624aff)](https://www.modelscope.cn/models/Tongyi-MAI/Z-Image) <br> [![ModelScope Space](https://img.shields.io/badge/%F0%9F%A4%96%20Demo-Z--Image-17c7a7)](https://www.modelscope.cn/aigc/imageGeneration?tab=advanced&versionId=569345&modelType=Checkpoint&sdVersion=Z_IMAGE&modelUrl=modelscope%3A%2F%2FTongyi-MAI%2FZ-Image%3Frevision%3Dmaster) |
50
- | **Z-Image-Turbo** | | | | 8 | ❌ | Gen. | Very High | Low | N/A | [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Checkpoint%20-Z--Image--Turbo-yellow)](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) <br> [![Hugging Face Space](https://img.shields.io/badge/%F0%9F%A4%97%20Demo-Z--Image--Turbo-blue)](https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo) | [![ModelScope Model](https://img.shields.io/badge/🤖%20%20Checkpoint-Z--Image--Turbo-624aff)](https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo) <br> [![ModelScope Space](https://img.shields.io/badge/%F0%9F%A4%96%20Demo-Z--Image--Turbo-17c7a7)](https://www.modelscope.cn/aigc/imageGeneration?tab=advanced&versionId=469191&modelType=Checkpoint&sdVersion=Z_IMAGE_TURBO&modelUrl=modelscope%3A%2F%2FTongyi-MAI%2FZ-Image-Turbo%3Frevision%3Dmaster) |
51
- | **Z-Image-Edit** | ✅ | ✅ | ❌ | 50 | ✅ | Editing | High | Medium | Easy | *To be released* | *To be released* | | *To be released* |
52
 
53
  ### 🖼️ Showcase
54
 
@@ -74,11 +70,11 @@ We adopt a **Scalable Single-Stream DiT** (S3-DiT) architecture. In this setup,
74
  ![Architecture of Z-Image and Z-Image-Edit](assets/architecture.webp)
75
 
76
  ### 📈 Performance
77
- According to the Elo-based Human Preference Evaluation (on [*Alibaba AI Arena*](https://aiarena.alibaba-inc.com/corpora/arena/leaderboard?arenaType=T2I)), Z-Image-Turbo shows highly competitive performance against other leading models, while achieving state-of-the-art results among open-source models.
78
 
79
  <p align="center">
80
  <a href="https://aiarena.alibaba-inc.com/corpora/arena/leaderboard?arenaType=T2I">
81
- <img src="assets/leaderboard.png" alt="Z-Image Elo Rating on AI Arena"/><br />
82
  <span style="font-size:1.05em; cursor:pointer; text-decoration:underline;"> Click to view the full leaderboard</span>
83
  </a>
84
  </p>
@@ -88,7 +84,7 @@ Install the latest version of diffusers, use the following command:
88
  <details>
89
  <summary><sup>Click here for details for why you need to install diffusers from source</sup></summary>
90
 
91
- We have submitted two pull requests ([#12703](https://github.com/huggingface/diffusers/pull/12703) and [#12715](https://github.com/huggingface/diffusers/pull/12715)) to the 🤗 diffusers repository to add support for Z-Image. Both PRs have been merged into the latest official diffusers release.
92
  Therefore, you need to install diffusers from source for the latest features and Z-Image support.
93
 
94
  </details>
@@ -99,7 +95,7 @@ pip install git+https://github.com/huggingface/diffusers
99
 
100
  ```python
101
  import torch
102
- from diffusers import ZImagePipeline
103
 
104
  # 1. Load the pipeline
105
  # Use bfloat16 for optimal performance on supported GPUs
@@ -140,8 +136,6 @@ image.save("example.png")
140
 
141
  ## 🔬 Decoupled-DMD: The Acceleration Magic Behind Z-Image
142
 
143
- [![arXiv](https://img.shields.io/badge/arXiv-2511.22677-b31b1b.svg)](https://arxiv.org/abs/2511.22677)
144
-
145
  Decoupled-DMD is the core few-step distillation algorithm that empowers the 8-step Z-Image model.
146
 
147
  Our core insight in Decoupled-DMD is that the success of existing DMD (Distribution Matching Distillation) methods is the result of two independent, collaborating mechanisms:
@@ -177,24 +171,12 @@ HF_XET_HIGH_PERFORMANCE=1 hf download Tongyi-MAI/Z-Image-Turbo
177
  If you find our work useful in your research, please consider citing:
178
 
179
  ```bibtex
180
- @article{team2025zimage,
181
  title={Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer},
182
- author={Z-Image Team},
183
- journal={arXiv preprint arXiv:2511.22699},
184
- year={2025}
185
- }
186
-
187
- @article{liu2025decoupled,
188
- title={Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield},
189
- author={Dongyang Liu and Peng Gao and David Liu and Ruoyi Du and Zhen Li and Qilong Wu and Xin Jin and Sihan Cao and Shifeng Zhang and Hongsheng Li and Steven Hoi},
190
- journal={arXiv preprint arXiv:2511.22677},
191
- year={2025}
192
- }
193
-
194
- @article{jiang2025distribution,
195
- title={Distribution Matching Distillation Meets Reinforcement Learning},
196
- author={Jiang, Dengyang and Liu, Dongyang and Wang, Zanyi and Wu, Qilong and Jin, Xin and Liu, David and Li, Zhen and Wang, Mengmeng and Gao, Peng and Yang, Harry},
197
- journal={arXiv preprint arXiv:2511.13649},
198
- year={2025}
199
  }
200
  ```
 
11
 
12
  <div align="center">
13
 
14
+ [![Official Site](https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage)](https://tongyi-mai.github.io/Z-Image-homepage/)&#160;
15
  [![GitHub](https://img.shields.io/badge/GitHub-Z--Image-181717?logo=github&logoColor=white)](https://github.com/Tongyi-MAI/Z-Image)&#160;
16
  [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Checkpoint-Z--Image--Turbo-yellow)](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo)&#160;
17
  [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Online_Demo-Z--Image--Turbo-blue)](https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo)&#160;
 
18
  [![ModelScope Model](https://img.shields.io/badge/🤖%20Checkpoint-Z--Image--Turbo-624aff)](https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo)&#160;
19
+ [![ModelScope Space](https://img.shields.io/badge/🤖%20Online_Demo-Z--Image--Turbo-17c7a7)](https://www.modelscope.cn/aigc/imageGeneration?tab=advanced&versionId=469191&modelType=Checkpoint&sdVersion=Z_IMAGE_TURBO&modelUrl=modelscope%253A%252F%252FTongyi-MAI%252FZ-Image-Turbo%253Frevision%253Dmaster%7D%7BOnline)&#160;
20
  [![Art Gallery PDF](https://img.shields.io/badge/%F0%9F%96%BC%20Art_Gallery-PDF-ff69b4)](assets/Z-Image-Gallery.pdf)&#160;
21
  [![Web Art Gallery](https://img.shields.io/badge/%F0%9F%8C%90%20Web_Art_Gallery-online-00bfff)](https://modelscope.cn/studios/Tongyi-MAI/Z-Image-Gallery/summary)&#160;
22
+ <a href="http://github.com/Tongyi-MAI/Z-Image/blob/main/Z_Image_Report.pdf" target="_blank"><img src="https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv" height="21px"></a>
23
 
24
 
25
  Welcome to the official repository for the Z-Image(造相)project!
 
30
 
31
  ## ✨ Z-Image
32
 
33
+ Z-Image is a powerful and highly efficient image generation model with **6B** parameters. It currently has three variants:
34
 
35
  - 🚀 **Z-Image-Turbo** – A distilled version of Z-Image that matches or exceeds leading competitors with only **8 NFEs** (Number of Function Evaluations). It offers **⚡️sub-second inference latency⚡️** on enterprise-grade H800 GPUs and fits comfortably within **16G VRAM consumer devices**. It excels in photorealistic image generation, bilingual text rendering (English & Chinese), and robust instruction adherence.
36
 
37
+ - 🧱 **Z-Image-Base** – The non-distilled foundation model. By releasing this checkpoint, we aim to unlock the full potential for community-driven fine-tuning and custom development.
 
 
38
 
39
  - ✍️ **Z-Image-Edit** – A variant fine-tuned on Z-Image specifically for image editing tasks. It supports creative image-to-image generation with impressive instruction-following capabilities, allowing for precise edits based on natural language prompts.
40
 
41
  ### 📥 Model Zoo
42
 
43
+ | Model | Hugging Face | ModelScope |
44
+ | :--- |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
45
+ | **Z-Image-Turbo** | [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Checkpoint%20-Z--Image--Turbo-yellow)](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) <br> [![Hugging Face Space](https://img.shields.io/badge/%F0%9F%A4%97%20Online%20Demo-Z--Image--Turbo-blue)](https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo) | [![ModelScope Model](https://img.shields.io/badge/🤖%20%20Checkpoint-Z--Image--Turbo-624aff)](https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo) <br> [![ModelScope Space](https://img.shields.io/badge/%F0%9F%A4%96%20Online%20Demo-Z--Image--Turbo-17c7a7)](https://www.modelscope.cn/aigc/imageGeneration?tab=advanced&versionId=469191&modelType=Checkpoint&sdVersion=Z_IMAGE_TURBO&modelUrl=modelscope%3A%2F%2FTongyi-MAI%2FZ-Image-Turbo%3Frevision%3Dmaster) |
46
+ | **Z-Image-Base** | *To be released* | *To be released* |
47
+ | **Z-Image-Edit** | *To be released* | *To be released* |
 
48
 
49
  ### 🖼️ Showcase
50
 
 
70
  ![Architecture of Z-Image and Z-Image-Edit](assets/architecture.webp)
71
 
72
  ### 📈 Performance
73
+ According to the Elo-based Human Preference Evaluation (on [AI Arena](https://aiarena.alibaba-inc.com/corpora/arena/leaderboard?arenaType=T2I)), Z-Image-Turbo shows highly competitive performance against other leading models, while achieving state-of-the-art results among open-source models.
74
 
75
  <p align="center">
76
  <a href="https://aiarena.alibaba-inc.com/corpora/arena/leaderboard?arenaType=T2I">
77
+ <img src="assets/leaderboard.webp" alt="Z-Image Elo Rating on AI Arena"/><br />
78
  <span style="font-size:1.05em; cursor:pointer; text-decoration:underline;"> Click to view the full leaderboard</span>
79
  </a>
80
  </p>
 
84
  <details>
85
  <summary><sup>Click here for details for why you need to install diffusers from source</sup></summary>
86
 
87
+ We have submitted two pull requests ([#12703](https://github.com/huggingface/diffusers/pull/12703) and [#12715](https://github.com/huggingface/diffusers/pull/12715)) to the 🤗 diffusers repository to add support for Z-Image. Both PRs have been merged into the latest official diffusers release.
88
  Therefore, you need to install diffusers from source for the latest features and Z-Image support.
89
 
90
  </details>
 
95
 
96
  ```python
97
  import torch
98
+ from diffusers import ZImagePipeline
99
 
100
  # 1. Load the pipeline
101
  # Use bfloat16 for optimal performance on supported GPUs
 
136
 
137
  ## 🔬 Decoupled-DMD: The Acceleration Magic Behind Z-Image
138
 
 
 
139
  Decoupled-DMD is the core few-step distillation algorithm that empowers the 8-step Z-Image model.
140
 
141
  Our core insight in Decoupled-DMD is that the success of existing DMD (Distribution Matching Distillation) methods is the result of two independent, collaborating mechanisms:
 
171
  If you find our work useful in your research, please consider citing:
172
 
173
  ```bibtex
174
+ @misc{z-image-2025,
175
  title={Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer},
176
+ author={Tongyi Lab},
177
+ year={2025},
178
+ publisher={GitHub},
179
+ journal={GitHub repository},
180
+ howpublished={\url{https://github.com/Tongyi-MAI/Z-Image}}
181
  }
182
  ```
assets/leaderboard.png DELETED

Git LFS Details

  • SHA256: e9fd4aa185bb7bff2b5515f2001b4d80df330595e78d6a098142e5a232bb4e4e
  • Pointer size: 132 Bytes
  • Size of remote file: 2.03 MB
assets/showcase_realistic.png CHANGED

Git LFS Details

  • SHA256: 697e6f6857f619314173508df72a14314cbb43e67475de7494123bb8b4f4eb2c
  • Pointer size: 132 Bytes
  • Size of remote file: 6.26 MB

Git LFS Details

  • SHA256: 9a739bf5b0d1055e8fbe073b560fade2cc7bbcf4a0c8e90daf039cea051bb84b
  • Pointer size: 132 Bytes
  • Size of remote file: 8.3 MB
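
The "Pointer size: 132 Bytes" figure that appears for every file above follows from the Git LFS pointer format: a version line, an `oid sha256:` line, and a `size` line. A minimal sketch of that arithmetic is below. The exact byte size of the remote file is an assumption (the page only shows the rounded "2.03 MB"), and `lfs_pointer` is a hypothetical helper, not a Git LFS API.

```python
def lfs_pointer(sha256_hex: str, size_bytes: int) -> bytes:
    """Build the contents of a Git LFS pointer file (spec v1)."""
    if len(sha256_hex) != 64:
        raise ValueError("expected a 64-character hex SHA-256 digest")
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{sha256_hex}\n"
        f"size {size_bytes}\n"
    ).encode("ascii")


# SHA-256 of the deleted assets/leaderboard.png, copied from the details above.
oid = "e9fd4aa185bb7bff2b5515f2001b4d80df330595e78d6a098142e5a232bb4e4e"

# Assumed exact size in bytes; only the rounded value "2.03 MB" is known.
assumed_size = 2_030_000

pointer = lfs_pointer(oid, assumed_size)
print(len(pointer))  # prints 132
```

The fixed parts of the pointer take 125 bytes, so a pointer is 132 bytes whenever the size field has seven digits, that is, for any file between 1,000,000 and 9,999,999 bytes. That range covers all three sizes listed above (2.03 MB, 6.26 MB, 8.3 MB).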