base_model:
- Wan-AI/Wan2.1-I2V-14B-720P
library_name: diffusers
---
# Wan2.1 Distilled Models
This is a collection of distilled and accelerated versions of Wan2.1 video generation models, offering multiple precision and format options. All models are optimized for **4-step inference**, dramatically improving generation speed while maintaining high-quality outputs.

## 📦 Model Overview

This repository provides multiple distilled versions of the Wan2.1 models, covering different tasks, resolutions, and precisions.

### Model Types

- **Image-to-Video (I2V)**: 480P / 720P resolutions
- **Text-to-Video (T2V)**: 14B parameter version

### Precision Variants

Each model is available in the following precision options:

| Precision | Suffix Identifier | Size | Framework | Description |
|-----------|-------------------|------|-----------|-------------|
| **BF16** | `lightx2v_4step` | ~28-32 GB | LightX2V | Original precision, highest quality |
| **FP8** | `scaled_fp8_e4m3_lightx2v_4step` | ~15-17 GB | LightX2V | FP8 quantization, roughly half the size |
| **INT8** | `int8_lightx2v_4step` | ~15-17 GB | LightX2V | INT8 quantization, roughly half the size |
| **FP8 ComfyUI** | `scaled_fp8_e4m3_lightx2v_4step_comfyui` | ~15-17 GB | ComfyUI | ComfyUI-compatible format |
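If you are unsure which precision variant a local file is, the tensor dtypes inside the `.safetensors` file give it away. A minimal sketch using the `safetensors` and `torch` packages (neither ships with this repo; the file name is one of the variants listed above, so adjust the path to wherever you saved it):

```python
# Sketch: count tensor dtypes in a checkpoint to confirm its precision variant.
from collections import Counter

from safetensors import safe_open

# Adjust to wherever the checkpoint was downloaded.
path = "wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors"

counts = Counter()
with safe_open(path, framework="pt", device="cpu") as f:
    for name in f.keys():
        # Loads each tensor once on CPU just to read its dtype.
        counts[str(f.get_tensor(name).dtype)] += 1

print(counts)  # expect mostly float8_e4m3fn here, bfloat16 for the BF16 variant
```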
### Naming Convention Examples

```
wan2.1_{task}_{resolution}_{precision}.safetensors

Examples:
- wan2.1_i2v_720p_lightx2v_4step.safetensors                        # 720P I2V, original precision
- wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors        # 720P I2V, FP8 quantization
- wan2.1_i2v_480p_int8_lightx2v_4step.safetensors                   # 480P I2V, INT8 quantization
- wan2.1_t2v_14b_scaled_fp8_e4m3_lightx2v_4step_comfyui.safetensors # T2V, ComfyUI scaled_fp8 format
```

> 💡 **Tip**: Browse [Files](https://huggingface.co/lightx2v/Wan2.1-Distill-Models/tree/main) to see all available models.
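If you would rather discover files programmatically than browse, here is a small `huggingface_hub` sketch (the repository id comes from the link above; filtering on the `wan2.1_i2v_720p` prefix is just one example of the naming convention):

```python
# Sketch: list checkpoint files in the repo and filter by the naming convention.
from huggingface_hub import list_repo_files

files = list_repo_files("lightx2v/Wan2.1-Distill-Models")

# Keep only the 720P I2V variants as an example.
for name in sorted(f for f in files if f.startswith("wan2.1_i2v_720p")):
    print(name)
```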
## 🚀 Usage

**LightX2V is a high-performance inference framework optimized for these models: approximately 2x faster than ComfyUI, with better quantization accuracy. Highly recommended!**

### Quick Start

1. Download the model (720P I2V FP8 example)

```bash
huggingface-cli download lightx2v/Wan2.1-Distill-Models \
  --local-dir ./models/wan2.1_i2v_720p \
  --include "wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors"
```
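If you prefer Python to the CLI, the same file can be fetched with `huggingface_hub`; this is a minimal sketch mirroring the command above:

```python
# Sketch: download the same 720P I2V FP8 checkpoint via the Python API.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="lightx2v/Wan2.1-Distill-Models",
    filename="wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors",
    local_dir="./models/wan2.1_i2v_720p",
)
print(local_path)  # path of the downloaded .safetensors file
```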
2. Clone the LightX2V repository

```bash
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
```

3. Install dependencies

```bash
pip install -r requirements.txt
```

Or refer to the [Quick Start Documentation](https://lightx2v.readthedocs.io/en/latest/getting_started/quickstart.html) to use Docker instead.
4. Select and modify a configuration file

Choose the appropriate configuration based on your GPU memory (a short sketch after the lists below shows one way to inspect a config before running):

**For 80GB+ GPUs (A100/H100)**
- I2V: [wan_i2v_distill_4step_cfg.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_i2v_distill_4step_cfg.json)
- T2V: [wan_t2v_distill_4step_cfg.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_t2v_distill_4step_cfg.json)

**For 24GB+ GPUs (RTX 4090/3090)**
- I2V: [wan_i2v_distill_4step_cfg_4090.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_i2v_distill_4step_cfg_4090.json)
- T2V: [wan_t2v_distill_4step_cfg_4090.json](https://github.com/ModelTC/LightX2V/blob/main/configs/distill/wan_t2v_distill_4step_cfg_4090.json)
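The configs are plain JSON and their keys vary, so it helps to open the one you picked and review it before launching. This sketch assumes you are inside the cloned LightX2V directory; the output file name is just an example:

```python
# Sketch: inspect a LightX2V distill config and save an edited local copy.
import json
from pathlib import Path

cfg_path = Path("configs/distill/wan_i2v_distill_4step_cfg_4090.json")
cfg = json.loads(cfg_path.read_text())

# Review the fields you may want to adjust for your setup.
print(json.dumps(cfg, indent=2))

# After editing the dict in Python (or the file directly), save your own copy.
Path("configs/distill/my_wan_i2v_4step.json").write_text(json.dumps(cfg, indent=2))
```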
5. Run inference

```bash
cd scripts
bash wan/run_wan_i2v_distill_4step_cfg.sh
```
### Documentation

- **Quick Start Guide**: [LightX2V Quick Start](https://lightx2v.readthedocs.io/en/latest/getting_started/quickstart.html)
- **Complete Usage Guide**: [LightX2V Model Structure Documentation](https://lightx2v.readthedocs.io/en/latest/getting_started/model_structure.html)
- **Configuration Guide**: [Configuration Files](https://github.com/ModelTC/LightX2V/tree/main/configs/distill)
- **Quantization Usage**: [Quantization Documentation](https://lightx2v.readthedocs.io/en/latest/method_tutorials/quantization.html)
- **Parameter Offload**: [Offload Documentation](https://lightx2v.readthedocs.io/en/latest/method_tutorials/offload.html)

### Performance Advantages

- ⚡ **Fast**: Approximately **2x faster** than ComfyUI
- 🎯 **Optimized**: Deeply optimized for distilled models
- 💾 **Memory Efficient**: Supports CPU offload and other memory-optimization techniques
- 🛠️ **Flexible**: Supports multiple quantization formats and configuration options
### Community

- **Issues**: https://github.com/ModelTC/LightX2V/issues
- **Discussions**: https://github.com/ModelTC/LightX2V/discussions

## ⚠️ Important Notes

1. **Additional Components**: These models contain only the DiT weights. You also need:
   - T5 text encoder
   - CLIP vision encoder
   - VAE encoder/decoder
   - Tokenizers

   Refer to the [LightX2V Documentation](https://github.com/ModelTC/LightX2V/blob/main/docs/EN/source/deploy_guides/model_structure.md) for how to organize the complete model directory (a download sketch for these components follows below).
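One way to obtain those components is to pull them from the base model listed in this card's metadata, `Wan-AI/Wan2.1-I2V-14B-720P`. Treat the sketch below as an illustration only: the `ignore_patterns` glob is an assumption about that repository's file names, and the final directory layout should follow the linked model-structure documentation.

```python
# Sketch: fetch the auxiliary components (T5, CLIP, VAE, tokenizers) from the base repo.
# The ignore_patterns glob is an assumption about the shard names in Wan-AI/Wan2.1-I2V-14B-720P;
# check that repo's file list and the LightX2V docs before relying on it.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Wan-AI/Wan2.1-I2V-14B-720P",
    local_dir="./models/wan2.1_i2v_720p",
    ignore_patterns=["diffusion_pytorch_model*"],  # skip the original, non-distilled DiT weights
)
```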