---
license: apache-2.0
tags:
- diffusion-single-file
- comfyui
- distillation
- lora
- video
- video generation
base_model:
- Wan-AI/Wan2.1-T2V-14B
- Wan-AI/Wan2.1-I2V-14B-480P
- Wan-AI/Wan2.1-I2V-14B-720P
library_name: diffusers
---
<div align="center">
# Wan2.1 Distilled Models
### High-Performance Video Generation with 4-Step Inference
*Distillation-accelerated versions of Wan2.1: dramatically faster while maintaining exceptional quality*

---
[Model Collection](https://huggingface.co/lightx2v/Wan2.1-Distill-Models)
[GitHub](https://github.com/ModelTC/LightX2V)
[License](LICENSE)
</div>
---
## What's Special?
<table>
<tr>
<td width="50%">
### Ultra-Fast Generation
- **4-step inference** (vs traditional 50+ steps)
- Up to **2x faster** than ComfyUI
- Real-time video generation capability
</td>
<td width="50%">
### Flexible Options
- Multiple resolutions (480P/720P)
- Various precision formats (BF16/FP8/INT8)
- I2V and T2V support
</td>
</tr>
<tr>
<td width="50%">
### Memory Efficient
- FP8/INT8: **~50% size reduction**
- CPU offload support
- Optimized for consumer GPUs
</td>
<td width="50%">
### Easy Integration
- Compatible with LightX2V framework
- ComfyUI support available
- Simple configuration files
</td>
</tr>
</table>
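As a rough sanity check on the step-count claim above, the schedule reduction alone (4 steps vs. a typical 50-step sampler) can be sketched as below. This is illustrative only: real wall-clock speedup also depends on resolution, offloading, and framework overhead.

```python
# Rough denoising-step reduction implied by the distilled sampler.
baseline_steps = 50   # typical diffusion sampling schedule
distilled_steps = 4   # this release
print(f"{baseline_steps / distilled_steps:.1f}x fewer denoising steps")  # 12.5x fewer denoising steps
```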
---
## Model Catalog
### Model Types
<table>
<tr>
<td align="center" width="50%">
#### **Image-to-Video (I2V)**
Transform still images into dynamic videos
- 480P Resolution
- 720P Resolution
</td>
<td align="center" width="50%">
#### **Text-to-Video (T2V)**
Generate videos from text descriptions
- 14B Parameters
- High-quality synthesis
</td>
</tr>
</table>
### Precision Variants
| Precision | Model Identifier | Model Size | Framework | Quality vs Speed |
|:---------:|:-----------------|:----------:|:---------:|:-----------------|
| **BF16** | `lightx2v_4step` | ~28-32 GB | LightX2V | ★★★★★ Highest quality |
| **FP8** | `scaled_fp8_e4m3_lightx2v_4step` | ~15-17 GB | LightX2V | ★★★★ Excellent balance |
| **INT8** | `int8_lightx2v_4step` | ~15-17 GB | LightX2V | ★★★★ Fast & efficient |
| **FP8 ComfyUI** | `scaled_fp8_e4m3_lightx2v_4step_comfyui` | ~15-17 GB | ComfyUI | ★★★ ComfyUI ready |
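The sizes in the table are consistent with simple bytes-per-parameter arithmetic, as the sketch below shows (using 1 GB = 1e9 bytes). The listed FP8/INT8 ranges sit a bit above the raw estimate, plausibly because quantization scales and some layers remain in higher precision; treat this as a back-of-envelope check, not an exact accounting.

```python
# Back-of-envelope checkpoint sizes for a 14B-parameter model.
PARAMS = 14e9  # parameter count

def approx_size_gb(bytes_per_param: float) -> float:
    """Approximate checkpoint size in GB (1 GB = 1e9 bytes)."""
    return PARAMS * bytes_per_param / 1e9

print(f"BF16: ~{approx_size_gb(2):.0f} GB")  # 2 bytes/param -> ~28 GB
print(f"FP8:  ~{approx_size_gb(1):.0f} GB")  # 1 byte/param -> ~14 GB
print(f"INT8: ~{approx_size_gb(1):.0f} GB")  # 1 byte/param -> ~14 GB
```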
### Naming Convention
```bash
# Pattern: wan2.1_{task}_{resolution}_{precision}.safetensors
# Examples:
wan2.1_i2v_720p_lightx2v_4step.safetensors # 720P I2V - BF16
wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors # 720P I2V - FP8
wan2.1_i2v_480p_int8_lightx2v_4step.safetensors # 480P I2V - INT8
wan2.1_t2v_14b_scaled_fp8_e4m3_lightx2v_4step_comfyui.safetensors # T2V - FP8 ComfyUI
```
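The pattern above is regular enough to parse mechanically. Here is a small, hypothetical helper that does so (the `parse_name` function and its regex are ours for illustration, not part of LightX2V; note the T2V checkpoints carry `14b` in the resolution slot):

```python
import re

# Documented pattern: wan2.1_{task}_{resolution}_{precision}.safetensors
# (T2V files use "14b" where I2V files carry a resolution).
PATTERN = re.compile(
    r"wan2\.1_(?P<task>i2v|t2v)_(?P<res>480p|720p|14b)_(?P<precision>.+)\.safetensors"
)

def parse_name(filename: str) -> dict:
    """Split a checkpoint filename into task, resolution, and precision."""
    m = PATTERN.fullmatch(filename)
    if m is None:
        raise ValueError(f"unrecognized filename: {filename}")
    return m.groupdict()

info = parse_name("wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors")
print(info)  # {'task': 'i2v', 'res': '720p', 'precision': 'scaled_fp8_e4m3_lightx2v_4step'}
```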
> **Explore all models**: [Browse the Full Model Collection →](https://huggingface.co/lightx2v/Wan2.1-Distill-Models/tree/main)
## Usage
**LightX2V is a high-performance inference framework optimized for these models. It is approximately 2x faster than ComfyUI and offers better quantization accuracy. Highly recommended!**
#### Quick Start
1. Download the model (720P I2V FP8 example)
```bash
huggingface-cli download lightx2v/Wan2.1-Distill-Models \
--local-dir ./models/wan2.1_i2v_720p \
--include "wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors"
```
2. Clone the LightX2V repository
```bash
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
```
3. Install dependencies
```bash
pip install -r requirements.txt
```
Or refer to the [Quick Start Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/getting_started/quickstart.md) to use Docker instead
4. Select and modify a configuration file
Choose the appropriate configuration based on your GPU memory:
**For 80GB+ GPU (A100/H100)**
- I2V: [wan_i2v_distill_4step_cfg.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_i2v_distill_4step_cfg.json)
- T2V: [wan_t2v_distill_4step_cfg.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_t2v_distill_4step_cfg.json)
**For 24GB+ GPU (RTX 4090)**
- I2V: [wan_i2v_distill_4step_cfg_4090.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_i2v_distill_4step_cfg_4090.json)
- T2V: [wan_t2v_distill_4step_cfg_4090.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_t2v_distill_4step_cfg_4090.json)
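The choice above can be made explicit in code. A minimal sketch follows: the config filenames come from the list above, the 80 GB threshold mirrors the headings, and `pick_config` is a hypothetical helper of ours, not a LightX2V API.

```python
# Map available GPU memory to the distill config files listed above.
CONFIGS = {
    "i2v": {"large": "wan_i2v_distill_4step_cfg.json",
            "small": "wan_i2v_distill_4step_cfg_4090.json"},
    "t2v": {"large": "wan_t2v_distill_4step_cfg.json",
            "small": "wan_t2v_distill_4step_cfg_4090.json"},
}

def pick_config(task: str, gpu_mem_gb: int) -> str:
    """Return the distill config name for the given task and GPU memory."""
    tier = "large" if gpu_mem_gb >= 80 else "small"
    return CONFIGS[task][tier]

print(pick_config("i2v", 24))  # wan_i2v_distill_4step_cfg_4090.json
print(pick_config("t2v", 80))  # wan_t2v_distill_4step_cfg.json
```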
5. Run inference
```bash
cd scripts
bash wan/run_wan_i2v_distill_4step_cfg.sh
```
#### Documentation
- **Quick Start Guide**: [LightX2V Quick Start](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/getting_started/quickstart.md)
- **Complete Usage Guide**: [LightX2V Model Structure Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/getting_started/model_structure.md)
- **Configuration Guide**: [Configuration Files](https://github.com/ModelTC/LightX2V/tree/main/configs/distill)
- **Quantization Usage**: [Quantization Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/method_tutorials/quantization.md)
- **Parameter Offload**: [Offload Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/method_tutorials/offload.md)
#### Performance Advantages
- **Fast**: approximately **2x faster** than ComfyUI
- **Optimized**: deeply optimized for distilled models
- **Memory Efficient**: supports CPU offload and other memory optimization techniques
- **Flexible**: supports multiple quantization formats and configuration options
### Community
- **Issues**: https://github.com/ModelTC/LightX2V/issues
## Important Notes
1. **Additional Components**: these models contain only the DiT weights. You also need:
   - T5 text encoder
   - CLIP vision encoder
   - VAE encoder/decoder
   - Tokenizers

   Refer to the [LightX2V Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/getting_started/model_structure.md) for how to organize the complete model directory.
If you find this project helpful, please give us a ⭐ on [GitHub](https://github.com/ModelTC/LightX2V)!