We look forward to the upcoming Gradio launch, which will let everyone freely create their own videos.
# Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

[Paper (arXiv)](https://arxiv.org/abs/2512.08765)
[Code (GitHub)](https://github.com/ali-vilab/Wan-Move)
[Model (Hugging Face)](https://huggingface.co/Ruihang/Wan-Move-14B-480P)
[Model (ModelScope)](https://modelscope.cn/models/churuihang/Wan-Move-14B-480P)
[MoveBench Dataset](https://huggingface.co/datasets/Ruihang/MoveBench)
[Demo Video (YouTube)](https://www.youtube.com/watch?v=_5Cy7Z2NQJQ)
[Project Page](https://wan-move.github.io/)

<div align="center">

[Watch the demo video on YouTube](https://www.youtube.com/watch?v=_5Cy7Z2NQJQ)

</div>

## 💡 TLDR: Bring Wan I2V to SOTA fine-grained, point-level motion control!

**Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance [[Paper](https://arxiv.org/abs/2512.08765)]** <br />
[Ruihang Chu](https://scholar.google.com/citations?hl=zh-CN&user=62zPPxkAAAAJ), [Yefei He](https://hexy.tech/), [Zhekai Chen](https://scholar.google.com/citations?user=_eZWcIMAAAAJ), [Shiwei Zhang](https://scholar.google.com/citations?user=ZO3OQ-8AAAAJ), [Xiaogang Xu](https://xuxiaogang.com/), [Bin Xia](https://zj-binxia.github.io/), [Dingdong Wang](https://scholar.google.com/citations?user=hRWxWiEAAAAJ), [Hongwei Yi](https://scholar.google.com/citations?user=ocMf7fQAAAAJ), [Xihui Liu](https://xh-liu.github.io/), [Hengshuang Zhao](https://hszhao.github.io/), [Yu Liu](https://scholar.google.com/citations?user=8zksQb4AAAAJ), [Yingya Zhang](https://scholar.google.com/citations?user=16RDSEUAAAAJ), [Yujiu Yang](https://sites.google.com/view/iigroup-thu/about) <br />

We present Wan-Move, our NeurIPS 2025 paper: a simple and scalable motion-control framework for video generation. Wan-Move offers the following key features:

- 🎯 **High-Quality 5s 480p Motion Control**: Through scaled training, Wan-Move generates 5-second, 480p videos with SOTA motion controllability on par with commercial systems such as Kling 1.5 Pro's Motion Brush, as verified by user studies.
- 🧩 **Novel Latent Trajectory Guidance**: Our core idea is to represent the motion condition by propagating the first frame's features along the trajectory. This guidance integrates seamlessly into off-the-shelf image-to-video models (e.g., Wan-I2V-14B) without any architecture changes or extra motion modules.
- 🕹️ **Fine-grained Point-level Control**: Object motions are represented with dense point trajectories, enabling precise, region-level control over how each element in the scene moves.
- 📊 **Dedicated Motion-control Benchmark, MoveBench**: MoveBench is a carefully curated benchmark with larger-scale samples, diverse content categories, longer video durations, and high-quality trajectory annotations.

🎉 We're glad to see Wan-Move being tested on real-world videos by many creators and users.
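As a rough illustration of the guidance idea (this is a sketch under assumed shapes and names, not the repository's actual implementation), latent trajectory guidance can be pictured as sampling each tracked point's feature once in the first frame and scattering it to that point's position in every later frame:

```python
import numpy as np

def propagate_features(first_frame_feat, tracks, visibility):
    """Sketch: build a per-frame motion-condition volume by propagating
    first-frame features along point trajectories.

    first_frame_feat: (C, H, W) latent features of frame 0 (assumed layout)
    tracks:           (T, N, 2) integer (x, y) positions of N tracked points
    visibility:       (T, N) bool, whether each point is visible per frame
    """
    C, H, W = first_frame_feat.shape
    T, N, _ = tracks.shape
    cond = np.zeros((T, C, H, W), dtype=first_frame_feat.dtype)
    # Sample each point's feature once, at its frame-0 location.
    x0, y0 = tracks[0, :, 0], tracks[0, :, 1]
    point_feats = first_frame_feat[:, y0, x0]            # (C, N)
    for t in range(T):
        for n in range(N):
            if not visibility[t, n]:
                continue  # occluded points contribute no condition
            x, y = tracks[t, n]
            cond[t, :, y, x] = point_feats[:, n]         # scatter along the track
    return cond
```

Because the condition lives in the same feature space as the first frame, it can be fed to an image-to-video backbone without adding new motion modules, which is the property the paper emphasizes.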

## 🔥 Latest News!!

* Dec 15, 2025: 🚀 We've released a [local Gradio demo](#gradio-demo) for interactive trajectory drawing and video generation.
* Dec 10, 2025: 🚀 We've released the [inference code](#quickstart), [model weights](https://huggingface.co/Ruihang/Wan-Move-14B-480P), and [MoveBench](https://huggingface.co/datasets/Ruihang/MoveBench) of Wan-Move.
* Sep 18, 2025: 🎉 Wan-Move has been accepted to NeurIPS 2025! 🎉🎉🎉

## Community Works

* **[ComfyUI]** Thanks to Kijai for integrating Wan-Move into the ComfyUI wrapper: [https://huggingface.co/Kijai/WanVideo_comfy_fp8_scaled/tree/main/WanMove](https://huggingface.co/Kijai/WanVideo_comfy_fp8_scaled/tree/main/WanMove)
* Thanks to deepbeepmeep for supporting Wan-Move in [Wan2GP](https://github.com/deepbeepmeep/Wan2GP), which enables video generation with low VRAM.

## 📋 Todo List

- Wan-Move-480P
    - [x] Multi-GPU inference code of the 14B models
    - [x] Checkpoints of the 14B models
    - [x] Data and evaluation code of MoveBench
    - [x] Gradio demo

## Introduction of Wan-Move

<p align="center" style="border-radius: 10px">
    <img src="assets/overview.png" width="100%" alt="overview"/>
    <strong>Wan-Move supports diverse motion-control applications in image-to-video generation. The generated samples (832×480p, 5s) exhibit high visual fidelity and accurate motion.</strong>
</p>

<p align="center" style="border-radius: 10px">
    <img src="assets/framework.png" width="100%" alt="framework"/>
    <strong>The framework of Wan-Move: (a) how motion guidance is injected; (b) the training pipeline.</strong>
</p>

<p align="center" style="border-radius: 10px">
    <img src="assets/movebench.png" width="100%" alt="movebench"/>
    <strong>The construction pipeline and statistics of MoveBench. We welcome everyone to use it!</strong>
</p>

<p align="center" style="border-radius: 10px">
    <img src="assets/main-comparison.png" width="100%" alt="comparison"/>
    <strong>Qualitative comparisons between Wan-Move and both academic methods and commercial solutions.</strong>
</p>

## Quickstart

#### Installation

> 💡Note: Wan-Move is implemented as a minimal extension on top of the [Wan2.1](https://github.com/Wan-Video/Wan2.1) codebase. If you have tried Wan2.1, you can reuse most of your existing setup at very low migration cost.

Clone the repo:
```sh
git clone https://github.com/ali-vilab/Wan-Move.git
cd Wan-Move
```

Install dependencies:
```sh
# Ensure torch >= 2.4.0
pip install -r requirements.txt
```

#### Model Download

| Models | Download Link | Notes |
|--------|---------------|-------|
| Wan-Move-14B-480P | 🤗 [Hugging Face](https://huggingface.co/Ruihang/Wan-Move-14B-480P) 🤖 [ModelScope](https://modelscope.cn/models/churuihang/Wan-Move-14B-480P) | 5s 480P video generation |

Download models using huggingface-cli:
```sh
pip install "huggingface_hub[cli]"
huggingface-cli download Ruihang/Wan-Move-14B-480P --local-dir ./Wan-Move-14B-480P
```

Download models using modelscope-cli:
```sh
pip install modelscope
modelscope download churuihang/Wan-Move-14B-480P --local_dir ./Wan-Move-14B-480P
```

#### Evaluation on MoveBench

Download MoveBench from Hugging Face:
```sh
huggingface-cli download Ruihang/MoveBench --local-dir ./MoveBench --repo-type dataset
```

> 💡Note:
> * MoveBench provides video captions. For a fair evaluation, turn off the [prompt extension](https://github.com/Wan-Video/Wan2.1?tab=readme-ov-file#2-using-prompt-extension-1) function developed in Wan2.1.
> * MoveBench provides data in both English and Chinese. You can select the language via the `--language` flag: use `en` for English and `zh` for Chinese.

- Single-GPU inference

```sh
# For the single-object motion test, run:
python generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode single --language en --save_path results/en --eval_bench

# For the multi-object motion test, run:
python generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode multi --language en --save_path results/en --eval_bench
```

> 💡Note:
> * To visualize the trajectory motion effect shown in our video demo, add the `--vis_track` flag. We also provide a separate visualization script, `scripts/visualize.py`, that supports different visualization settings, such as enabling mouse-button effects!
> * If you encounter OOM (out-of-memory) issues, use the `--offload_model True` and `--t5_cpu` options to reduce GPU memory usage.
> * The 14B model can run on a **single 40GB** GPU with `--t5_cpu --offload_model True --dtype bf16`! 🤗🤗🤗

- Multi-GPU inference

Following Wan2.1, Wan-Move also supports FSDP and [xDiT](https://github.com/xdit-project/xDiT) USP to accelerate inference. When running multi-GPU batch evaluation (e.g., evaluating MoveBench or a file containing multiple test cases), you should **disable** the [`Ulysses`](https://arxiv.org/abs/2309.14509) strategy by setting `--ulysses_size 1`; Ulysses is only supported when generating a single video with multi-GPU inference.

```sh
# For the single-object motion test, run:
torchrun --nproc_per_node=8 generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode single --language en --save_path results/en --eval_bench --dit_fsdp --t5_fsdp

# For the multi-object motion test, run:
torchrun --nproc_per_node=8 generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode multi --language en --save_path results/en --eval_bench --dit_fsdp --t5_fsdp
```
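For intuition, multi-GPU batch evaluation amounts to statically assigning the benchmark cases to ranks so every GPU generates a disjoint subset. A hypothetical round-robin sharding (the actual partitioning inside `generate.py` may differ) looks like:

```python
def shard_cases(cases, rank, world_size):
    """Round-robin assignment of benchmark cases to one GPU rank (illustrative)."""
    return [case for i, case in enumerate(cases) if i % world_size == rank]

# Example: 10 MoveBench-style case IDs spread over 4 ranks.
cases = [f"case_{i:03d}" for i in range(10)]
shards = [shard_cases(cases, r, 4) for r in range(4)]
```

Each case lands on exactly one rank, which is why per-case parallelism strategies like Ulysses (which split a single video across GPUs) do not mix with batch evaluation.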

After all results are generated, set the results storage path inside `MoveBench/bench.py`, then run:

```sh
python MoveBench/bench.py
```

#### Run the Default Example

For single-video generation (rather than MoveBench evaluation), we also provide a sample case in the `examples` folder. You can directly run:

```sh
python generate.py \
    --task wan-move-i2v \
    --size 480*832 \
    --ckpt_dir ./Wan-Move-14B-480P \
    --image examples/example.jpg \
    --track examples/example_tracks.npy \
    --track_visibility examples/example_visibility.npy \
    --prompt "A laptop is placed on a wooden table. The silver laptop is connected to a small grey external hard drive and transfers data through a white USB-C cable. The video is shot with a downward close-up lens." \
    --save_file example.mp4
```
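To try your own motion instead of the bundled example, you can write the two `.npy` files with NumPy. The layout below — tracks as a float array of shape `(num_frames, num_points, 2)` in `(x, y)` pixel coordinates, plus a boolean `(num_frames, num_points)` visibility mask, with 81 frames for a 5s clip — is an assumption inferred from the example filenames; inspect the arrays in `examples/` for the authoritative format.

```python
import numpy as np

num_frames, num_points = 81, 1     # frame count is an assumption for a 5 s clip
h, w = 480, 832                    # matches --size 480*832

# One point drifting horizontally from the left third to the right third.
xs = np.linspace(w / 3, 2 * w / 3, num_frames)
ys = np.full(num_frames, h / 2)
tracks = np.stack([xs, ys], axis=-1)[:, None, :].astype(np.float32)  # (T, N, 2)
visibility = np.ones((num_frames, num_points), dtype=bool)           # always visible

np.save("my_tracks.npy", tracks)
np.save("my_visibility.npy", visibility)
```

Then pass the files via `--track my_tracks.npy --track_visibility my_visibility.npy` in the command above.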

#### Gradio Demo

We provide a local Gradio demo for interactive trajectory drawing and video generation.

1. **Launch the Demo**:
```bash
python gradio_app.py \
    --task wan-move-i2v \
    --size 480*832 \
    --ckpt_dir ./Wan-Move-14B-480P \
    --t5_cpu \
    --offload_model True \
    --dtype bf16 \
    --port 7860 \
    --share
```

2. **Features**:
    * **Multi-Trajectory Control**: Draw multiple trajectories with distinct colors.
    * **Speed Control**: Adjust the speed curve of each trajectory independently.
    * **Real-time Preview**: Visualize your drawn trajectories on the input image and as a GIF.
    * **Lazy Loading**: The model loads only when you start generation, ensuring fast startup.
    * **History Gallery**: View your previously generated videos.

3. **Usage**:
    * Upload an image.
    * Click on the image to add trajectory points.
    * (Optional) Adjust the speed curve in the editor.
    * Select "Create New..." in the dropdown to add more trajectories.
    * Click "Generate Video".
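Under the hood, the handful of points you click must become one position per frame, and the speed curve re-times that path. A hypothetical version of this resampling step (not the demo's actual code; names and shapes are illustrative) could look like:

```python
import numpy as np

def resample_trajectory(points, num_frames, speed_curve=None):
    """Turn sparse clicked (x, y) points into one point per frame.

    points:      list of (x, y) clicks along the desired path
    speed_curve: optional monotone array in [0, 1] of length num_frames,
                 mapping each frame to the fraction of path length travelled.
    """
    pts = np.asarray(points, dtype=np.float32)
    # Arc-length parameterization of the clicked polyline, normalized to [0, 1].
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)]) / max(float(seg.sum()), 1e-8)
    t = speed_curve if speed_curve is not None else np.linspace(0.0, 1.0, num_frames)
    x = np.interp(t, s, pts[:, 0])
    y = np.interp(t, s, pts[:, 1])
    return np.stack([x, y], axis=-1)   # (num_frames, 2)
```

A constant speed curve (`np.linspace`) moves at uniform speed; an ease-in curve such as `np.linspace(0, 1, n) ** 2` keeps the point near the start early and accelerates later, which is the effect the speed editor exposes.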

## Citation

If you find our work helpful, please cite us.

```
@article{chu2025wan,
    title={Wan-move: Motion-controllable video generation via latent trajectory guidance},
    author={Chu, Ruihang and He, Yefei and Chen, Zhekai and Zhang, Shiwei and Xu, Xiaogang and Xia, Bin and Wang, Dingdong and Yi, Hongwei and Liu, Xihui and Zhao, Hengshuang and others},
    journal={arXiv preprint arXiv:2512.08765},
    year={2025}
}
```