---
license: apache-2.0
tags:
- diffusion-single-file
- comfyui
- distillation
- LoRA
- video
- video generation
base_model:
- Wan-AI/Wan2.2-I2V-A14B
pipeline_tags:
- image-to-video
- text-to-video
library_name: diffusers
---
# 🎬 Wan2.2 Distilled Models
### ⚡ High-Performance Video Generation with 4-Step Inference
*A distillation-accelerated version of Wan2.2: dramatically faster inference with excellent quality*

---
[Hugging Face Model](https://huggingface.co/lightx2v/Wan2.2-Distill-Models)
[GitHub Repository](https://github.com/ModelTC/LightX2V)
[License: Apache-2.0](LICENSE)
---
## 🌟 What's Special?
<table>
<tr>
<td width="50%">
### ⚡ Ultra-Fast Generation
- **4-step inference** (vs. traditional 50+ steps)
- Approximately **2× faster** with LightX2V than with ComfyUI
- Near real-time video generation capability
</td>
<td width="50%">
### 🎯 Flexible Options
- **Dual noise control**: High/Low noise variants
- Multiple precision formats (BF16/FP8/INT8)
- Full 14B parameter models
</td>
</tr>
<tr>
<td width="50%">
### 💾 Memory Efficient
- FP8/INT8: **~50% size reduction**
- CPU offload support
- Optimized for consumer GPUs
</td>
<td width="50%">
### 🔧 Easy Integration
- Compatible with LightX2V framework
- ComfyUI support
- Simple configuration files
</td>
</tr>
</table>
---
## 📦 Model Catalog
### 🎥 Model Types
<table>
<tr>
<td align="center" width="50%">
#### 🖼️ **Image-to-Video (I2V) - 14B Parameters**
Transform static images into dynamic videos with advanced quality control
- 🎨 **High Noise**: More creative, diverse outputs
- 🎯 **Low Noise**: More faithful to input, stable outputs
</td>
<td align="center" width="50%">
#### 📝 **Text-to-Video (T2V) - 14B Parameters**
Generate videos from text descriptions
- 🎨 **High Noise**: More creative, diverse outputs
- 🎯 **Low Noise**: More stable and controllable outputs
- 📊 Full 14B parameter model
</td>
</tr>
</table>
### 🎯 Precision Versions
| Precision | Model Identifier | Model Size | Framework | Quality vs Speed |
|:---------:|:-----------------|:----------:|:---------:|:-----------------|
| 🏆 **BF16** | `lightx2v_4step` | ~28.6 GB | LightX2V | ⭐⭐⭐⭐⭐ Highest Quality |
| ⚡ **FP8** | `scaled_fp8_e4m3_lightx2v_4step` | ~15 GB | LightX2V | ⭐⭐⭐⭐ Excellent Balance |
| 🎯 **INT8** | `int8_lightx2v_4step` | ~15 GB | LightX2V | ⭐⭐⭐⭐ Fast & Efficient |
| 🔷 **FP8 ComfyUI** | `scaled_fp8_e4m3_lightx2v_4step_comfyui` | ~15 GB | ComfyUI | ⭐⭐⭐ ComfyUI Ready |
### 📝 Naming Convention
```bash
# Format: wan2.2_{task}_A14b_{noise_level}[_{precision}]_lightx2v_4step.safetensors (precision token omitted for BF16)
# I2V Examples:
wan2.2_i2v_A14b_high_noise_lightx2v_4step.safetensors # I2V High Noise - BF16
wan2.2_i2v_A14b_high_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors # I2V High Noise - FP8
wan2.2_i2v_A14b_low_noise_int8_lightx2v_4step.safetensors # I2V Low Noise - INT8
wan2.2_i2v_A14b_low_noise_scaled_fp8_e4m3_lightx2v_4step_comfyui.safetensors # I2V Low Noise - FP8 ComfyUI
```
> 💡 **Browse All Models**: [View Full Model Collection →](https://huggingface.co/lightx2v/Wan2.2-Distill-Models/tree/main)
---
## 🚀 Usage
### Method 1: LightX2V (Recommended ⭐)
**LightX2V is a high-performance inference framework optimized for these models. It is approximately 2× faster than ComfyUI and offers better quantization accuracy. Highly recommended!**
#### Quick Start
1. Download the model weights (using the I2V FP8 high/low-noise pair as an example)
```bash
huggingface-cli download lightx2v/Wan2.2-Distill-Models \
  --local-dir ./models/wan2.2_i2v \
  --include "wan2.2_i2v_A14b_high_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors"
```
```bash
huggingface-cli download lightx2v/Wan2.2-Distill-Models \
  --local-dir ./models/wan2.2_i2v \
  --include "wan2.2_i2v_A14b_low_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors"
```
> 💡 **Tip**: For T2V models, follow the same steps but replace `i2v` with `t2v` in the filenames, as in the sketch below
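As a concrete sketch of that substitution, the command below derives the T2V filenames from the naming convention above; verify the exact names against the [repository file list](https://huggingface.co/lightx2v/Wan2.2-Distill-Models/tree/main) before downloading.
```bash
# T2V FP8 high/low-noise pair; filenames assumed from the documented pattern
huggingface-cli download lightx2v/Wan2.2-Distill-Models \
  --local-dir ./models/wan2.2_t2v \
  --include "wan2.2_t2v_A14b_high_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors" \
            "wan2.2_t2v_A14b_low_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors"
```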
2. Clone LightX2V repository
```bash
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
```
3. Install dependencies
```bash
pip install -r requirements.txt
```
Or refer to the [Quick Start Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/quickstart.html) for a Docker-based setup
4. Select and modify configuration file
Choose the appropriate configuration for your GPU memory:
**80GB+ GPUs (A100/H100)**
- I2V: [wan_moe_i2v_distill.json](https://github.com/ModelTC/LightX2V/blob/main/configs/wan22/wan_moe_i2v_distill.json)
**24GB+ GPUs (RTX 4090)**
- I2V: [wan_moe_i2v_distill_4090.json](https://github.com/ModelTC/LightX2V/blob/main/configs/wan22/wan_moe_i2v_distill_4090.json)
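If you want to inspect or edit a config locally first, you can fetch it directly; this sketch assumes GitHub's standard raw-URL layout for the `main` branch.
```bash
# Fetch the RTX 4090 I2V config for local inspection/editing
curl -LO https://raw.githubusercontent.com/ModelTC/LightX2V/main/configs/wan22/wan_moe_i2v_distill_4090.json
```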
5. Run inference (using the [I2V script](https://github.com/ModelTC/LightX2V/blob/main/scripts/wan22/run_wan22_moe_i2v_distill.sh) as an example)
```bash
cd scripts
bash wan22/run_wan22_moe_i2v_distill.sh
```
> 📖 **Note**: Update the model paths in the script to point to your Wan2.2 model directory (see the sketch below), and refer to the [LightX2V Model Structure Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/model_structure.html)
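As a minimal sketch of that path update: the run scripts typically define their paths as shell variables near the top. The variable names below are assumptions; open the script to see what it actually defines.
```bash
# Hypothetical variables near the top of run_wan22_moe_i2v_distill.sh;
# adjust whatever path variables the script actually uses.
lightx2v_path=/absolute/path/to/LightX2V
model_path=/absolute/path/to/models/wan2.2_i2v   # directory from the download step
```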
#### LightX2V Documentation
- **Quick Start Guide**: [LightX2V Quick Start](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/quickstart.html)
- **Complete Usage Guide**: [LightX2V Model Structure Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/model_structure.html)
- **Configuration File Instructions**: [Configuration Files](https://github.com/ModelTC/LightX2V/tree/main/configs/distill)
- **Quantized Model Usage**: [Quantization Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/method_tutorials/quantization.html)
- **Parameter Offloading**: [Offload Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/method_tutorials/offload.html)
---
### Method 2: ComfyUI
Please refer to the example ComfyUI [workflow](https://huggingface.co/lightx2v/Wan2.2-Distill-Models/blob/main/wan2.2_moe_i2v_scale_fp8_comfyui.json)
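To fetch the workflow JSON from the command line, the standard Hugging Face `resolve` form of the link above can be used:
```bash
# Download the example ComfyUI workflow
wget https://huggingface.co/lightx2v/Wan2.2-Distill-Models/resolve/main/wan2.2_moe_i2v_scale_fp8_comfyui.json
```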
## ⚠️ Important Notes
**Other Components**: These models contain only the DiT weights. The following components are also required at runtime:
- T5 text encoder
- CLIP vision encoder
- VAE encoder/decoder
- Tokenizer
Please refer to [LightX2V Documentation](https://lightx2v-zhcn.readthedocs.io/zh-cn/latest/getting_started/model_structure.html) for instructions on organizing the complete model directory.
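As a rough illustration, a complete directory might look like the sketch below. The component filenames follow the public Wan2.x releases and are assumptions here; confirm them against the model structure documentation.
```bash
models/wan2.2_i2v/                                  # illustrative layout only
├── wan2.2_i2v_A14b_high_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors
├── wan2.2_i2v_A14b_low_noise_scaled_fp8_e4m3_lightx2v_4step.safetensors
├── models_t5_umt5-xxl-enc-bf16.pth                 # T5 text encoder (assumed name)
├── models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth  # CLIP vision encoder (assumed name)
├── Wan2.1_VAE.pth                                  # VAE (assumed name)
└── google/umt5-xxl/                                # tokenizer files (assumed)
```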
## 🤝 Community
- **GitHub Issues**: https://github.com/ModelTC/LightX2V/issues
- **HuggingFace**: https://huggingface.co/lightx2v/Wan2.2-Distill-Models
If you find this project helpful, please give us a ⭐ on [GitHub](https://github.com/ModelTC/LightX2V)!