---
license: apache-2.0
tags:
- diffusion-single-file
- comfyui
- distillation
- Z-Image-Turbo
base_model:
- Tongyi-MAI/Z-Image-Turbo
library_name: diffusers
pipeline_tag: text-to-image
---
# Z-Image-Turbo-Quantized

Quantized weights for [Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo), optimized for **8GB VRAM GPUs**.

## 📦 Available Models

- **`z_image_turbo_scaled_fp8_e4m3fn.safetensors`** (6.17 GB) - FP8 E4M3FN quantized weights
- **`z_image_turbo_int8.safetensors`** (6.17 GB) - INT8 quantized weights

## 🚀 Installation

```bash
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
pip install .
```

## 💻 Usage on 8GB VRAM GPUs

To run Z-Image-Turbo on an 8GB VRAM GPU, you need to:
1. Use quantized transformer weights (FP8 or INT8)
2. Use the int4-quantized Qwen3 text encoder
3. Enable CPU offloading
### Complete Example

```python
from lightx2v import LightX2VPipeline

# Initialize pipeline
pipe = LightX2VPipeline(
    model_path="Tongyi-MAI/Z-Image-Turbo",
    model_cls="z_image",
    task="t2i",
)

# Step 1: Enable quantization (FP8 transformer + INT4 text encoder)
pipe.enable_quantize(
    dit_quantized=True,
    dit_quantized_ckpt="lightx2v/Z-Image-Turbo-Quantized/z_image_turbo_scaled_fp8_e4m3fn.safetensors",
    quant_scheme="fp8-sgl",
    # IMPORTANT: Use the int4 Qwen3 text encoder for 8GB VRAM
    text_encoder_quantized=True,
    text_encoder_quantized_ckpt="JunHowie/Qwen3-4B-GPTQ-Int4",
    text_encoder_quant_scheme="int4",
)

# Step 2: Enable CPU offloading
pipe.enable_offload(
    cpu_offload=True,
    offload_granularity="model",  # "model" gives maximum memory savings
)

# Step 3: Create the generator
pipe.create_generator(
    attn_mode="flash_attn3",
    aspect_ratio="16:9",
    infer_steps=9,
    guidance_scale=1,
)

# Step 4: Generate an image
pipe.generate(
    seed=42,
    prompt="A beautiful landscape with mountains and lakes, ultra HD, 4K",
    negative_prompt="",
    save_result_path="output.png",
)
```
## ⚙️ Configuration Options

### Quantization Schemes

**FP8 (Recommended)** - Better quality and speed:
```python
dit_quantized_ckpt="lightx2v/Z-Image-Turbo-Quantized/z_image_turbo_scaled_fp8_e4m3fn.safetensors",
quant_scheme="fp8-sgl",
```

**INT8** - Alternative option:
```python
dit_quantized_ckpt="lightx2v/Z-Image-Turbo-Quantized/z_image_turbo_int8.safetensors",
quant_scheme="int8-sgl",
```

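Both checkpoints pair low-precision weights with scale factors that are applied at dequantization time. As a rough illustration of the idea behind the INT8 variant, here is a minimal pure-Python sketch of symmetric per-tensor quantization (illustrative only, not LightX2V's actual kernel):

```python
# Minimal sketch of symmetric per-tensor INT8 quantization.
# Illustrative only -- not LightX2V's actual quantization kernel.

def quantize_int8(weights):
    """Map floats to int8 values in [-127, 127] with one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights."""
    return [x * scale for x in q]

weights = [0.5, -1.2, 0.03, 0.98]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
# Rounding error is bounded by half a quantization step per weight
assert all(abs(a - w) <= scale / 2 + 1e-9 for a, w in zip(approx, weights))
```

FP8 E4M3FN works analogously but keeps a floating-point format (4 exponent bits, 3 mantissa bits), which is why it tends to preserve quality better across a wide dynamic range of weights.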
### Offload Granularity

- **`"model"`** (Recommended for 8GB): Offloads the entire model to CPU and loads it onto the GPU only during inference. Maximum memory savings.
- **`"block"`**: Offloads individual transformer blocks. More fine-grained control.

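A back-of-the-envelope budget shows why model-level offloading matters at 8GB. Only the 6.17 GB transformer size comes from this model card; the text-encoder and overhead figures below are rough assumptions for illustration, not measured numbers:

```python
# Rough VRAM budget sketch for an 8 GB GPU.
# Only the 6.17 GB transformer size comes from this model card;
# the other figures are illustrative assumptions.

fp8_transformer = 6.17     # quantized DiT weights (from this repo)
int4_text_encoder = 2.6    # ASSUMPTION: rough int4 Qwen3-4B footprint
working_overhead = 1.0     # ASSUMPTION: activations, CUDA context, etc.
budget = 8.0

# Keeping everything resident overflows the card:
resident_all = fp8_transformer + int4_text_encoder + working_overhead

# With offload_granularity="model", only one component is on the GPU
# at a time, so the peak is the largest single component plus overhead:
peak_offloaded = max(fp8_transformer, int4_text_encoder) + working_overhead

print(f"everything resident: {resident_all:.2f} GB vs {budget} GB budget")
print(f"model-offload peak:  {peak_offloaded:.2f} GB")
assert peak_offloaded < budget < resident_all
```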
## ⚠️ Important Notes

1. **Order matters**: All `enable_quantize()` and `enable_offload()` calls must be made **before** `create_generator()`, otherwise they will not take effect.
2. **Text encoder quantization**: Using the int4 Qwen3 text encoder is **highly recommended** on 8GB VRAM GPUs to ensure stable operation.
3. **Memory optimization**: The combination of an FP8/INT8 transformer, the int4 Qwen3 text encoder, and model-level offloading is tuned for 8GB VRAM.

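The ordering rule in note 1 follows the usual builder pattern: creating the generator effectively snapshots the pipeline configuration, so later `enable_*` calls have nothing to act on. A toy sketch of that behavior (a hypothetical class, not LightX2V's actual implementation):

```python
# Toy illustration of why configuration must precede create_generator().
# Hypothetical sketch -- not LightX2V's actual implementation.

class ToyPipeline:
    def __init__(self):
        self.config = {"quantized": False, "offload": False}
        self.generator = None

    def enable_quantize(self):
        self.config["quantized"] = True

    def enable_offload(self):
        self.config["offload"] = True

    def create_generator(self):
        # Snapshot: later config changes never reach the generator.
        self.generator = dict(self.config)

pipe = ToyPipeline()
pipe.enable_quantize()
pipe.create_generator()
pipe.enable_offload()  # too late -- generator was already snapshotted

assert pipe.generator["quantized"] is True
assert pipe.generator["offload"] is False  # the late call had no effect
```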
## 📚 References

- Original model: [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo)
- LightX2V: [GitHub](https://github.com/ModelTC/LightX2V)
- Qwen3-4B-GPTQ-Int4: [JunHowie/Qwen3-4B-GPTQ-Int4](https://huggingface.co/JunHowie/Qwen3-4B-GPTQ-Int4)

## 📄 License

Apache 2.0