Add SGMD paper info and improve model card metadata

#20
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +58 -101
README.md CHANGED
@@ -1,20 +1,22 @@
1
  ---
2
- license: apache-2.0
3
- tags:
4
- - diffusion-single-file
5
- - comfyui
6
- - distillation
7
- - LoRA
8
- - video
9
- - video genration
10
  base_model:
11
- - Wan-AI/Wan2.2-I2V-A14B
12
- pipeline_tags:
13
- - image-to-video
14
- - text-to-video
15
  library_name: diffusers
16
  ---
17
- # 🎬 Wan2.2 Distilled Models
18
 
19
  ### ⚡ High-Performance Video Generation with 4-Step Inference
20
 
@@ -34,6 +36,45 @@ library_name: diffusers
34
 
35
  - 2026.04.12: We are excited to release the [Wan2.2-I2V-A14B-4step-720p-high](https://huggingface.co/lightx2v/Wan2.2-Distill-Models/blob/main/wan2.2_i2v_A14b_high_noise_lightx2v_4step_720p_260412.safetensors) and [Wan2.2-I2V-A14B-4step-720p-low](https://huggingface.co/lightx2v/Wan2.2-Distill-Models/blob/main/wan2.2_i2v_A14b_low_noise_lightx2v_4step_720p_260412.safetensors) models. Compared to previous iterations, this version was trained on a high-quality 720p dataset and features an optimized low-noise training algorithm. These enhancements significantly boost the model's performance in fine-grained detail rendering and visual texture.
36
 
37
 
38
  ## 🌟 What's Special?
39
 
@@ -115,94 +156,13 @@ Generate videos from text descriptions
115
  | 🎯 **INT8** | `int8_lightx2v_4step` | ~15 GB | LightX2V | ⭐⭐⭐⭐ Fast & Efficient |
116
  | 🔷 **FP8 ComfyUI** | `scaled_fp8_e4m3_lightx2v_4step_comfyui` | ~15 GB | ComfyUI | ⭐⭐⭐ ComfyUI Ready |
117
 
118
- ### 📝 Naming Convention
119
-
120
- ```bash
121
- # Format: wan2.2_{task}_A14b_{noise_level}_{precision}_lightx2v_4step.safetensors
122
-
123
- # I2V Examples:
124
- wan2.2_i2v_A14b_high_noise_lightx2v_4step.safetensors # I2V High Noise - BF16
125
- wan2.2_i2v_A14b_high_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors # I2V High Noise - FP8
126
- wan2.2_i2v_A14b_low_noise_int8_lightx2v_4step.safetensors # I2V Low Noise - INT8
127
- wan2.2_i2v_A14b_low_noise_scaled_fp8_e4m3_lightx2v_4step_comfyui.safetensors # I2V Low Noise - FP8 ComfyUI
128
-
129
- ```
130
- > 💡 **Browse All Models**: [View Full Model Collection →](https://huggingface.co/lightx2v/Wan2.2-Distill-Models/tree/main)
131
-
132
  ---
133
 
134
- ## 🚀 Usage
135
-
136
- ### Method 1: LightX2V (Recommended ⭐)
137
-
138
- **LightX2V is a high-performance inference framework optimized for these models, approximately 2x faster than ComfyUI with better quantization accuracy. Highly recommended!**
139
-
140
- #### Quick Start
141
-
142
- 1. Download model (using I2V FP8 as example)
143
- ```bash
144
- huggingface-cli download lightx2v/Wan2.2-Distill-Models \
145
- --local-dir ./models/wan2.2_i2v \
146
- --include "wan2.2_i2v_A14b_high_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors"
147
- ```
148
-
149
- ```bash
150
- huggingface-cli download lightx2v/Wan2.2-Distill-Models \
151
- --local-dir ./models/wan2.2_i2v \
152
- --include "wan2.2_i2v_A14b_low_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors"
153
- ```
154
-
155
- > 💡 **Tip**: For T2V models, follow the same steps but replace `i2v` with `t2v` in the filenames
156
-
157
- 2. Clone LightX2V repository
158
-
159
- ```bash
160
- git clone https://github.com/ModelTC/LightX2V.git
161
- cd LightX2V
162
- ```
163
-
164
- 3. Install dependencies
165
-
166
- ```bash
167
- pip install -r requirements.txt
168
- ```
169
- Or refer to [Quick Start Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/quickstart.html) to use docker
170
-
171
- 4. Select and modify configuration file
172
-
173
- Choose appropriate configuration based on your GPU memory:
174
-
175
- **80GB+ GPUs (A100/H100)**
176
- - I2V: [wan_moe_i2v_distill.json](https://github.com/ModelTC/LightX2V/blob/main/configs/wan22/wan_moe_i2v_distill.json)
177
-
178
- **24GB+ GPUs (RTX 4090)**
179
- - I2V: [wan_moe_i2v_distill_4090.json](https://github.com/ModelTC/LightX2V/blob/main/configs/wan22/wan_moe_i2v_distill_4090.json)
180
-
181
-
182
- 5. Run inference (using [I2V]((https://github.com/ModelTC/LightX2V/blob/main/scripts/wan22/run_wan22_moe_i2v_distill.sh)) as example)
183
- ```bash
184
- cd scripts
185
- bash wan22/run_wan22_moe_i2v_distill.sh
186
- ```
187
-
188
- > 📝 **Note**: Update model paths in the script to point to your Wan2.2 model. Also refer to [LightX2V Model Structure Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/model_structure.html)
189
-
190
-
191
- #### LightX2V Documentation
192
- - **Quick Start Guide**: [LightX2V Quick Start](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/quickstart.html)
193
- - **Complete Usage Guide**: [LightX2V Model Structure Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/model_structure.html)
194
- - **Configuration File Instructions**: [Configuration Files](https://github.com/ModelTC/LightX2V/tree/main/configs/distill)
195
- - **Quantized Model Usage**: [Quantization Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/method_tutorials/quantization.html)
196
- - **Parameter Offloading**: [Offload Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/method_tutorials/offload.html)
197
-
198
- ---
199
-
200
- ### Method 2: ComfyUI
201
 
202
  Please refer to [workflow](https://huggingface.co/lightx2v/Wan2.2-Distill-Models/blob/main/wan2.2_moe_i2v_scale_fp8_comfyui.json)
203
 
204
-
205
-
206
  ## ⚠️ Important Notes
207
 
208
  **Other Components**: These models only contain DIT weights. Additional components needed at runtime:
@@ -211,14 +171,11 @@ Please refer to [workflow](https://huggingface.co/lightx2v/Wan2.2-Distill-Models
211
  - VAE encoder/decoder
212
  - Tokenizer
213
 
214
- Please refer to [LightX2V Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/model_structure.html) for instructions on organizing the complete model directory.
215
-
216
 
217
  ## 🤝 Community
218
 
219
  - **GitHub Issues**: https://github.com/ModelTC/LightX2V/issues
220
  - **HuggingFace**: https://huggingface.co/lightx2v/Wan2.2-Distill-Models
221
 
222
- If you find this project helpful, please give us a ⭐ on [GitHub](https://github.com/ModelTC/LightX2V)
223
-
224
- </div>
 
1
  ---
2
  base_model:
3
+ - Wan-AI/Wan2.2-I2V-A14B
4
  library_name: diffusers
5
+ license: apache-2.0
6
+ tags:
7
+ - diffusion-single-file
8
+ - comfyui
9
+ - distillation
10
+ - LoRA
11
+ - video
12
+ - video generation
13
+ - SGMD
14
+ pipeline_tag: image-to-video
15
  ---
16
+
17
+ # 🎬 Wan2.2 Distilled Models (SGMD)
18
+
19
+ This repository contains distilled versions of the Wan2.2 models using **SGMD (Score Gradient Matching Distillation)**, as presented in the paper [SGMD: Score Gradient Matching Distillation for Few-Step Video Diffusion Distillation](https://huggingface.co/papers/2605.30116).
20
 
21
  ### ⚡ High-Performance Video Generation with 4-Step Inference
22
 
 
36
 
37
  - 2026.04.12: We are excited to release the [Wan2.2-I2V-A14B-4step-720p-high](https://huggingface.co/lightx2v/Wan2.2-Distill-Models/blob/main/wan2.2_i2v_A14b_high_noise_lightx2v_4step_720p_260412.safetensors) and [Wan2.2-I2V-A14B-4step-720p-low](https://huggingface.co/lightx2v/Wan2.2-Distill-Models/blob/main/wan2.2_i2v_A14b_low_noise_lightx2v_4step_720p_260412.safetensors) models. Compared to previous iterations, this version was trained on a high-quality 720p dataset and features an optimized low-noise training algorithm. These enhancements significantly boost the model's performance in fine-grained detail rendering and visual texture.
38
 
39
+ ## 🚀 Quick Usage (Python)
40
+
41
+ To use these models with the [LightX2V](https://github.com/ModelTC/LightX2V) framework for 4-step inference:
42
+
43
+ ```python
44
+ from lightx2v import LightX2VPipeline
45
+
46
+ # Initialize pipeline for Wan2.2 I2V task
47
+ pipe = LightX2VPipeline(
48
+ model_path="lightx2v/Wan2.2-Distill-Models",
49
+ model_cls="wan2.2_moe",
50
+ task="i2v",
51
+ )
52
+
53
+ # Enable offloading to reduce VRAM usage
54
+ pipe.enable_offload(
55
+ cpu_offload=True,
56
+ offload_granularity="block",
57
+ text_encoder_offload=True,
58
+ )
59
+
60
+ # Create generator for 4-step inference
61
+ pipe.create_generator(
62
+ attn_mode="sage_attn2",
63
+ infer_steps=4,
64
+ height=480,
65
+ width=832,
66
+ num_frames=81,
67
+ guidance_scale=[1.0, 1.0],
68
+ )
69
+
70
+ # Generate video
71
+ pipe.generate(
72
+ seed=42,
73
+ image_path="path/to/your/image.jpg",
74
+ prompt="A cinematic shot of a sunset over the ocean",
75
+ save_result_path="output.mp4",
76
+ )
77
+ ```
78
 
79
  ## 🌟 What's Special?
80
 
 
156
  | 🎯 **INT8** | `int8_lightx2v_4step` | ~15 GB | LightX2V | ⭐⭐⭐⭐ Fast & Efficient |
157
  | 🔷 **FP8 ComfyUI** | `scaled_fp8_e4m3_lightx2v_4step_comfyui` | ~15 GB | ComfyUI | ⭐⭐⭐ ComfyUI Ready |
158
 
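
The name patterns in the table above combine with the task and noise level to give a full checkpoint filename. Below is a minimal sketch, assuming the pattern `wan2.2_{task}_A14b_{noise_level}_{precision}_lightx2v_4step.safetensors` from the README's naming convention. `build_filename` and its arguments are hypothetical helpers for illustration and are not part of LightX2V or ComfyUI.

```python
# Hypothetical helper: assembles a checkpoint filename from the README's
# naming convention. Not part of LightX2V; argument names are assumptions.
def build_filename(task: str, noise_level: str, precision: str = "",
                   comfyui: bool = False) -> str:
    """Return a filename such as wan2.2_i2v_A14b_high_noise_lightx2v_4step.safetensors.

    task:        "i2v" or "t2v"
    noise_level: "high" or "low"
    precision:   "" for BF16, "int8", or "scaled_fp8_e4m3"
    comfyui:     append the "_comfyui" suffix used by the ComfyUI FP8 variant
    """
    if noise_level not in ("high", "low"):
        raise ValueError(f"unknown noise level: {noise_level!r}")

    parts = ["wan2.2", task, "A14b", f"{noise_level}_noise"]
    if precision:
        # BF16 files carry no precision segment, so it is skipped when empty.
        parts.append(precision)
    parts += ["lightx2v", "4step"]
    if comfyui:
        parts.append("comfyui")
    return "_".join(parts) + ".safetensors"


if __name__ == "__main__":
    print(build_filename("i2v", "high"))
    print(build_filename("i2v", "low", "scaled_fp8_e4m3", comfyui=True))
```

The sketch only reproduces names that follow this pattern. Dated releases such as the `720p_260412` files carry extra segments, so check the repository file list before downloading.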
159
  ---
160
 
161
+ ## 🚀 Alternative Usage Methods
162
 
163
+ ### Method 1: ComfyUI
164
  Please refer to [workflow](https://huggingface.co/lightx2v/Wan2.2-Distill-Models/blob/main/wan2.2_moe_i2v_scale_fp8_comfyui.json)
165
 
166
  ## ⚠️ Important Notes
167
 
168
  **Other Components**: These models only contain DIT weights. Additional components needed at runtime:
 
171
  - VAE encoder/decoder
172
  - Tokenizer
173
 
174
+ Please refer to [LightX2V Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/model_structure.html) for instructions on organizing the complete model directory.
 
175
 
176
  ## 🤝 Community
177
 
178
  - **GitHub Issues**: https://github.com/ModelTC/LightX2V/issues
179
  - **HuggingFace**: https://huggingface.co/lightx2v/Wan2.2-Distill-Models
180
 
181
+ If you find this project helpful, please give us a ⭐ on [GitHub](https://github.com/ModelTC/LightX2V)