Update modelcard: add I2V 4-step lora V1 & T2V 4-step lora V1.1
README.md
CHANGED
|
@@ -1,6 +1,10 @@
|
|
| 1 |
---
|
| 2 |
language: en
|
| 3 |
license: apache-2.0
| 4 |
---
|
| 5 |
|
| 6 |
# Wan2.2-Lightning
|
|
@@ -13,9 +17,54 @@ We are excited to release the distilled version of <a href="https://wan.video"><
|
|
| 13 |
- **High-quality**: The distilled model delivers visuals on par with the base model in most scenarios, sometimes even better.
|
| 14 |
- **Complex Motion Generation**: Despite the reduction to just 4 steps, the model retains excellent motion dynamics in the generated scenes.
|
| 15 |
|
| 16 |
|
| 17 |
## Video Demos
|
| 18 |
-
### Wan2.2-
|
| 19 |
|
| 20 |
The videos below can be reproduced using [examples/prompt_list.txt](examples/prompt_list.txt).
|
| 21 |
|
|
@@ -74,16 +123,12 @@ In some results, the direction of the vehicles may be reversed.
|
|
| 74 |
</tr>
|
| 75 |
</table>
|
| 76 |
|
| 77 |
-
## 🔥 Latest News!!
|
| 78 |
-
|
| 79 |
-
* Aug 04, 2025: We have released [Wan2.2-T2V-A14B-NFE4](https://hf-mirror.com/lightx2v/Wan2.2-Lightning). Enjoy!
|
| 80 |
-
- [Kijai's ComfyUI WanVideoWrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper) is an implementation of Wan models for ComfyUI. Thanks to its Wan-only focus, it is among the first to pick up cutting-edge optimizations and new research features.
|
| 81 |
|
| 82 |
|
| 83 |
## Todo List
|
| 84 |
- [x] Wan2.2-T2V-A14B-4steps
|
|
|
|
| 85 |
- [ ] Wan2.2-TI2V-5B-4steps
|
| 86 |
-
- [ ] Wan2.2-I2V-A14B-4steps
|
| 87 |
|
| 88 |
## Run Wan2.2-Lightning
|
| 89 |
|
|
@@ -160,7 +205,7 @@ DASH_API_KEY=your_key torchrun --nproc_per_node=8 generate.py --task t2v-A14B -
|
|
| 160 |
torchrun --nproc_per_node=8 generate.py --task t2v-A14B --size 1280*720 --ckpt_dir ./Wan2.2-T2V-A14B --lora_dir ./Wan2.2-Lightning/Wan2.2-T2V-A14B-4steps-lora-rank64-Seko-V1 --dit_fsdp --t5_fsdp --ulysses_size 8 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage" --use_prompt_extend --prompt_extend_method 'local_qwen' --prompt_extend_target_lang 'zh'
|
| 161 |
```
|
| 162 |
|
| 163 |
-
|
| 164 |
#### Run Image-to-Video Generation
|
| 165 |
|
| 166 |
This repository supports the `Wan2.2-I2V-A14B` Image-to-Video model, which can generate video at both 480P and 720P resolutions.
|
|
@@ -168,7 +213,7 @@ This repository supports the `Wan2.2-I2V-A14B` Image-to-Video model and can simu
|
|
| 168 |
|
| 169 |
- Single-GPU inference
|
| 170 |
```sh
|
| 171 |
-
python generate.py
|
| 172 |
```
|
| 173 |
|
| 174 |
> This command can run on a GPU with at least 80GB VRAM.
|
|
@@ -179,9 +224,10 @@ python generate.py --task i2v-A14B --size 1280*720 --ckpt_dir ./Wan2.2-I2V-A14B
|
|
| 179 |
- Multi-GPU inference using FSDP + DeepSpeed Ulysses
|
| 180 |
|
| 181 |
```sh
|
| 182 |
-
torchrun --nproc_per_node=8 generate.py --task i2v-A14B --size 1280*720 --ckpt_dir ./Wan2.2-I2V-A14B --
|
| 183 |
```
|
| 184 |
|
|
|
|
| 185 |
- Image-to-Video Generation without prompt
|
| 186 |
|
| 187 |
```sh
|
|
@@ -225,7 +271,8 @@ python generate.py --task ti2v-5B --size 1280*704 --ckpt_dir ./Wan2.2-TI2V-5B --
|
|
| 225 |
torchrun --nproc_per_node=8 generate.py --task ti2v-5B --size 1280*704 --ckpt_dir ./Wan2.2-TI2V-5B --dit_fsdp --t5_fsdp --ulysses_size 8 --image examples/i2v_input.JPG --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
|
| 226 |
```
|
| 227 |
|
| 228 |
-
> The process of prompt extension can be referenced [here](#2-using-prompt-
|
|
|
|
| 229 |
|
| 230 |
|
| 231 |
|
|
@@ -239,4 +286,5 @@ We built upon and reused code from the following projects: [Wan2.1](https://gith
|
|
| 239 |
|
| 240 |
We also adopt the evaluation text prompts from [Movie Gen Bench](https://github.com/facebookresearch/MovieGenBench), which is licensed under the Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) License. The original license can be found [here](https://github.com/facebookresearch/MovieGenBench/blob/main/LICENSE).
|
| 241 |
|
| 242 |
-
The selected prompts are further enhanced using the `Qwen/Qwen2.5-14B-Instruct` model from [Qwen](https://huggingface.co/Qwen).
|
|
|
|
|
|
| 1 |
---
|
| 2 |
language: en
|
| 3 |
license: apache-2.0
|
| 4 |
+
base_model:
|
| 5 |
+
- Wan-AI/Wan2.2-T2V-A14B
|
| 6 |
+
- Wan-AI/Wan2.2-I2V-A14B
|
| 7 |
+
- Wan-AI/Wan2.2-TI2V-5B
|
| 8 |
---
|
| 9 |
|
| 10 |
# Wan2.2-Lightning
|
|
|
|
| 17 |
- **High-quality**: The distilled model delivers visuals on par with the base model in most scenarios, sometimes even better.
|
| 18 |
- **Complex Motion Generation**: Despite the reduction to just 4 steps, the model retains excellent motion dynamics in the generated scenes.
|
| 19 |
|
| 20 |
+
## 🔥 Latest News!!
|
| 21 |
+
* Aug 07, 2025: We have released [Wan2.2-I2V-A14B-NFE4-V1](https://hf-mirror.com/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-I2V-A14B-4steps-lora-rank64-Seko-V1). The linked folder includes a [workflow](https://hf-mirror.com/lightx2v/Wan2.2-Lightning/blob/main/Wan2.2-I2V-A14B-4steps-lora-rank64-Seko-V1/Wan2.2-I2V-A14B-4steps-lora-rank64-Seko-V1-forKJ.json) compatible with [Kijai's ComfyUI WanVideoWrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper). Enjoy!
|
| 22 |
+
* Aug 07, 2025: We have released [Wan2.2-T2V-A14B-NFE4-V1.1](https://hf-mirror.com/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-rank64-Seko-V1.1). The linked folder includes a [workflow](https://hf-mirror.com/lightx2v/Wan2.2-Lightning/blob/main/Wan2.2-T2V-A14B-4steps-lora-rank64-Seko-V1.1/Wan2.2-T2V-A14B-4steps-lora-rank64-Seko-V1.1-forKJ.json) compatible with [Kijai's ComfyUI WanVideoWrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper). The generation quality of V1.1 is slightly better than that of V1. Enjoy!
|
| 23 |
+
* Aug 04, 2025: We have released [Wan2.2-T2V-A14B-NFE4-V1](https://hf-mirror.com/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-rank64-Seko-V1). Enjoy!
|
| 24 |
+
- [Kijai's ComfyUI WanVideoWrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper) is an implementation of Wan models for ComfyUI. Thanks to its Wan-only focus, it is among the first to pick up cutting-edge optimizations and new research features.
|
| 25 |
|
| 26 |
## Video Demos
|
| 27 |
+
### Wan2.2-I2V-A14B-NFE4-V1 Demo
|
| 28 |
+
|
| 29 |
+
The videos below can be reproduced using [examples/i2v_prompt_list.txt](examples/i2v_prompt_list.txt) and [examples/i2v_image_path_list.txt](examples/i2v_image_path_list.txt).
|
| 30 |
+
|
| 31 |
+
<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
|
| 32 |
+
<tr>
|
| 33 |
+
<td>
|
| 34 |
+
<video src="https://github.com/user-attachments/assets/4f6bb1e0-9e2b-4eb2-8b9f-0678ccd5b4ec" width="100%" controls loop></video>
|
| 35 |
+
</td>
|
| 36 |
+
<td>
|
| 37 |
+
<video src="https://github.com/user-attachments/assets/bb249553-3f52-40b3-88f9-6e3bca1a8358" width="100%" controls loop></video>
|
| 38 |
+
</td>
|
| 39 |
+
<td>
|
| 40 |
+
<video src="https://github.com/user-attachments/assets/17a6d26a-dd63-47ef-9a98-1502f503dfba" width="100%" controls loop></video>
|
| 41 |
+
</td>
|
| 42 |
+
</tr>
|
| 43 |
+
<tr>
|
| 44 |
+
<td>
|
| 45 |
+
<video src="https://github.com/user-attachments/assets/6ccc69cf-e129-456f-8b93-6dc709cb0ede" width="100%" controls loop></video>
|
| 46 |
+
</td>
|
| 47 |
+
<td>
|
| 48 |
+
<video src="https://github.com/user-attachments/assets/6cf9c586-f37a-47ed-ab5b-e106c3877fa8" width="100%" controls loop></video>
|
| 49 |
+
</td>
|
| 50 |
+
<td>
|
| 51 |
+
<video src="https://github.com/user-attachments/assets/27e82fdf-88af-44ac-b987-b48aa3f9f793" width="100%" controls loop></video>
|
| 52 |
+
</td>
|
| 53 |
+
</tr>
|
| 54 |
+
<tr>
|
| 55 |
+
<td>
|
| 56 |
+
<video src="https://github.com/user-attachments/assets/36a76f1d-2b64-4b16-a862-210d0ffd6d55" width="100%" controls loop></video>
|
| 57 |
+
</td>
|
| 58 |
+
<td>
|
| 59 |
+
<video src="https://github.com/user-attachments/assets/4bc36c70-931e-4539-be8c-432d832819d3" width="100%" controls loop></video>
|
| 60 |
+
</td>
|
| 61 |
+
<td>
|
| 62 |
+
<video src="https://github.com/user-attachments/assets/488b9179-741b-4b9d-8f23-895981f054cb" width="100%" controls loop></video>
|
| 63 |
+
</td>
|
| 64 |
+
</tr>
|
| 65 |
+
</table>
|
| 66 |
+
|
| 67 |
+
### Wan2.2-T2V-A14B-NFE4-V1 Demo
|
| 68 |
|
| 69 |
The videos below can be reproduced using [examples/prompt_list.txt](examples/prompt_list.txt).
|
| 70 |
|
|
|
|
| 123 |
</tr>
|
| 124 |
</table>
|
| 125 |
|
| 126 |
|
| 127 |
|
| 128 |
## Todo List
|
| 129 |
- [x] Wan2.2-T2V-A14B-4steps
|
| 130 |
+
- [x] Wan2.2-I2V-A14B-4steps
|
| 131 |
- [ ] Wan2.2-TI2V-5B-4steps
|
|
|
|
| 132 |
|
| 133 |
## Run Wan2.2-Lightning
|
| 134 |
|
|
|
|
| 205 |
torchrun --nproc_per_node=8 generate.py --task t2v-A14B --size 1280*720 --ckpt_dir ./Wan2.2-T2V-A14B --lora_dir ./Wan2.2-Lightning/Wan2.2-T2V-A14B-4steps-lora-rank64-Seko-V1 --dit_fsdp --t5_fsdp --ulysses_size 8 --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage" --use_prompt_extend --prompt_extend_method 'local_qwen' --prompt_extend_target_lang 'zh'
|
| 206 |
```
|
| 207 |
|
| 208 |
+
|
| 209 |
#### Run Image-to-Video Generation
|
| 210 |
|
| 211 |
This repository supports the `Wan2.2-I2V-A14B` Image-to-Video model, which can generate video at both 480P and 720P resolutions.
|
|
|
|
| 213 |
|
| 214 |
- Single-GPU inference
|
| 215 |
```sh
|
| 216 |
+
python generate.py --task i2v-A14B --size "1280*720" --ckpt_dir ./Wan2.2-I2V-A14B --lora_dir ./Wan2.2-Lightning/Wan2.2-I2V-A14B-4steps-lora-rank64-Seko-V1 --offload_model True --base_seed 42 --prompt_file examples/i2v_prompt_list.txt --image_path_file examples/i2v_image_path_list.txt
|
| 217 |
```
|
| 218 |
|
| 219 |
> This command can run on a GPU with at least 80GB VRAM.
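
A mismatch between the two list files is easy to miss before a long run. The sketch below is a hypothetical pre-flight check, not part of this repository. It assumes that `generate.py` pairs line N of the prompt file with line N of the image path file, which the README does not state explicitly; `check_pairs` is an illustrative name.

```sh
# Hypothetical helper (not shipped with Wan2.2-Lightning).
# Assumption: prompts and image paths are paired line by line.
check_pairs() {
    prompt_file=$1
    image_file=$2

    # Both list files must exist before counting entries.
    for f in "$prompt_file" "$image_file"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
    done

    # grep -c . counts lines that contain at least one character,
    # so blank lines are ignored.
    prompts=$(grep -c . "$prompt_file" || true)
    images=$(grep -c . "$image_file" || true)

    if [ "$prompts" -ne "$images" ]; then
        echo "mismatch: $prompts prompts vs $images image paths" >&2
        return 1
    fi
    echo "ok: $prompts prompt/image pairs"
}
```

Under that assumption, running `check_pairs examples/i2v_prompt_list.txt examples/i2v_image_path_list.txt` before the `python generate.py` command above would flag a length mismatch before any model weights are loaded.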
|
|
|
|
| 224 |
- Multi-GPU inference using FSDP + DeepSpeed Ulysses
|
| 225 |
|
| 226 |
```sh
|
| 227 |
+
torchrun --nproc_per_node=8 generate.py --task i2v-A14B --size 1280*720 --ckpt_dir ./Wan2.2-I2V-A14B --lora_dir ./Wan2.2-Lightning/Wan2.2-I2V-A14B-4steps-lora-rank64-Seko-V1 --dit_fsdp --t5_fsdp --ulysses_size 8 --base_seed 42 --prompt_file examples/i2v_prompt_list.txt --image_path_file examples/i2v_image_path_list.txt
|
| 228 |
```
|
| 229 |
|
| 230 |
+
<!--
|
| 231 |
- Image-to-Video Generation without prompt
|
| 232 |
|
| 233 |
```sh
|
|
|
|
| 271 |
torchrun --nproc_per_node=8 generate.py --task ti2v-5B --size 1280*704 --ckpt_dir ./Wan2.2-TI2V-5B --dit_fsdp --t5_fsdp --ulysses_size 8 --image examples/i2v_input.JPG --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
|
| 272 |
```
|
| 273 |
|
| 274 |
+
> The process of prompt extension can be referenced [here](#2-using-prompt-extension).
|
| 275 |
+
-->
|
| 276 |
|
| 277 |
|
| 278 |
|
|
|
|
| 286 |
|
| 287 |
We also adopt the evaluation text prompts from [Movie Gen Bench](https://github.com/facebookresearch/MovieGenBench), which is licensed under the Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) License. The original license can be found [here](https://github.com/facebookresearch/MovieGenBench/blob/main/LICENSE).
|
| 288 |
|
| 289 |
+
The selected prompts are further enhanced using the `Qwen/Qwen2.5-14B-Instruct` model from [Qwen](https://huggingface.co/Qwen).
|
| 290 |
+
|