<p align="center">
  <h2 align="center">Animate-X++: Universal Character Image Animation with Dynamic Backgrounds</h2>

  <p align="center">
    <a href=""><strong>Shuai Tan</strong></a>
    ·
    <a href="https://scholar.google.com/citations?user=BwdpTiQAAAAJ"><strong>Biao Gong</strong></a>
    ·
    <a href=""><strong>Zhuoxin Liu</strong></a>
    ·
    <a href="https://scholar.google.com/citations?user=f6FgQ_bXEb4C&hl=en"><strong>Yan Wang</strong></a>
    ·
    <a href="https://scholar.google.com/citations?user=WntYF-sAAAAJ&hl=en&oi=ao"><strong>Yifan Feng</strong></a>
    ·
    <a href="https://xavierchen34.github.io/"><strong>Xi Chen</strong></a>
    ·
    <a href="https://hszhao.github.io/"><strong>Hengshuang Zhao</strong></a><sup>†</sup>
    <br>
    <br>
    <a href="https://arxiv.org/abs/2508.09454"><img src='https://img.shields.io/badge/arXiv-Animate--X++-red' alt='Paper PDF'></a>
    <a href='https://lucaria-academy.github.io/Animate-X++/'><img src='https://img.shields.io/badge/Project_Page-Animate--X++-blue' alt='Project Page'></a>
    <a href='https://mp.weixin.qq.com/s/vDR4kPLqnCUwfPiBNKKV9A'><img src='https://badges.aleen42.com/src/wechat.svg'></a>
    <a href='https://huggingface.co/Shuaishuai0219/Animate-X-plusplus'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Model-yellow'></a>
    <br>
    <b>HKU&nbsp; | &nbsp;Ant Group</b>
    <br>
  </p>
</p>

This repository is the official implementation of the paper "Animate-X++: Universal Character Image Animation with Dynamic Backgrounds". Animate-X++ is a universal animation framework based on latent diffusion models that handles various character types (collectively named X), including anthropomorphic characters.
<table align="center">
  <tr>
    <td>
      <img src="assets/images/teaser.png">
    </td>
  </tr>
</table>

## &#x1F4CC; Updates
* [2025.09.17] 🔥 We released the [Animate-X++](https://github.com/Lucaria-Academy/Animate-X-plusplus) inference code.
* [2025.09.17] 🔥 We released the [Animate-X++ checkpoints](https://huggingface.co/Shuaishuai0219/Animate-X-plusplus).
* [2025.08.12] 🔥 Our [paper](https://arxiv.org/abs/2508.09454) is now public on arXiv.

<!-- <video controls loop src="https://cloud.video.taobao.com/vod/vs4L24EAm6IQ5zM3SbN5AyHCSqZIXwmuobrzqNztMRM.mp4" muted="false"></video> -->

## &#x1F304; Gallery
<!-- ### Introduction
<table class="center">
  <tr>
    <td width=47% style="border: none">
      <video controls loop src="https://github.com/user-attachments/assets/085b70c4-cb68-4ac1-b45f-ed7f1c75bd5c" muted="false"></video>
    </td>
    <td width=53% style="border: none">
      <video controls loop src="https://github.com/user-attachments/assets/f6275c0d-fbca-43b4-b6d6-cf095723729e" muted="false"></video>
    </td>
  </tr>
</table> -->

### Animations produced by Animate-X++
<table class="center">
  <tr>
    <td width=50% style="border: none">
      <video controls loop src="https://cloud.video.taobao.com/vod/i18qjxKlFXgdcVfNC5XsQy3hHVlt5w2QJbK7UyobGEQ.mp4" muted="false"></video>
    </td>
    <td width=50% style="border: none">
      <video controls loop src="https://cloud.video.taobao.com/vod/b_C5y51HxQ9zZfABcT0WpS81_xl1HLWdemEz5QEBl14.mp4" muted="false"></video>
    </td>
  </tr>
</table>

## &#x1F680; Installation
Install with `conda`:
```shell
conda create -n Animate-X++ python=3.9.21
# or: conda create -n Animate-X++ python=3.10.16  # Python >= 3.10 is required for Unified Sequence Parallel (USP)
conda activate Animate-X++

# Pick the index URL matching your CUDA version:
# CUDA 11.8
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu118
# CUDA 12.1
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu121
# CUDA 12.4
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu124

git clone https://github.com/Lucaria-Academy/Animate-X++.git
cd Animate-X++
pip install -e .
```
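
After installing, it can help to confirm that the pinned versions above are what actually got installed. The snippet below is a small optional sanity check (not part of the repository); it uses only the standard library, so it runs even if the CUDA runtime is misconfigured:

```python
# Optional post-install sanity check: compare the pinned package versions
# from the install commands above against what pip actually installed.
from importlib import metadata

PINNED = {"torch": "2.5.0", "torchvision": "0.20.0", "torchaudio": "2.5.0"}


def check_pins(pins: dict) -> dict:
    """Map each package to 'ok', 'missing', or the mismatched version found."""
    report = {}
    for pkg, want in pins.items():
        try:
            got = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            report[pkg] = "missing"
            continue
        report[pkg] = "ok" if got.startswith(want) else got
    return report


if __name__ == "__main__":
    for pkg, status in check_pins(PINNED).items():
        print(f"{pkg}: {status}")
```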

UniAnimate-DiT (on which this repository is built) supports multiple attention implementations. If any of the following is installed, the highest-priority one will be used:

* [Flash Attention 3](https://github.com/Dao-AILab/flash-attention)
* [Flash Attention 2](https://github.com/Dao-AILab/flash-attention)
* [Sage Attention](https://github.com/thu-ml/SageAttention)
* [torch SDPA](https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html) (default; `torch>=2.5.0` is recommended)
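
As a rough sketch of how such a priority order typically works (this is an illustrative assumption, not the repository's actual dispatch code, and the module names for the optional packages are assumptions too): try each optional attention package in order, and fall back to PyTorch's built-in SDPA when none is available.

```python
# Illustrative sketch of priority-ordered attention backend selection.
import importlib.util

# Highest priority first; package names are assumptions mirroring the list above.
PRIORITY = [
    ("flash_attn_interface", "flash_attn_3"),  # Flash Attention 3
    ("flash_attn", "flash_attn_2"),            # Flash Attention 2
    ("sageattention", "sage_attn"),            # Sage Attention
]


def pick_attention_backend() -> str:
    """Return the name of the highest-priority backend that is importable."""
    for module_name, backend in PRIORITY:
        if importlib.util.find_spec(module_name) is not None:
            return backend
    # torch.nn.functional.scaled_dot_product_attention is the default path.
    return "torch_sdpa"


if __name__ == "__main__":
    print(pick_attention_backend())
```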
## &#x1F680; Download Checkpoints

(i) Download the Wan2.1-I2V-14B-720P model with `huggingface-cli`:
```shell
pip install "huggingface_hub[cli]"
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-720P --local-dir ./Wan2.1-I2V-14B-720P
```

Or download it with `modelscope`:
```shell
pip install modelscope
modelscope download Wan-AI/Wan2.1-I2V-14B-720P --local_dir ./Wan2.1-I2V-14B-720P
```

(ii) Download the [Animate-X++ checkpoints](https://huggingface.co/Shuaishuai0219/Animate-X-plusplus) and the [DWPose and CLIP checkpoints](https://huggingface.co/Shuaishuai0219/Animate-X), and put all files in the `checkpoints` directory.

(iii) The model weights should then be organized in `./checkpoints/` as follows:
```
./checkpoints/
|---- animate-x++.ckpt
|---- animate-x++_simple.ckpt
|---- dw-ll_ucoco_384.onnx
|---- open_clip_pytorch_model.bin
└---- yolox_l.onnx
```
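
A quick way to verify this layout before running inference is a small script like the one below (an optional helper sketch, not part of the repository; the file names mirror the tree above):

```python
# Sanity-check that every expected checkpoint file is present.
from pathlib import Path

# Expected files, taken from the checkpoint tree shown above.
EXPECTED = [
    "animate-x++.ckpt",
    "animate-x++_simple.ckpt",
    "dw-ll_ucoco_384.onnx",
    "open_clip_pytorch_model.bin",
    "yolox_l.onnx",
]


def missing_checkpoints(root: str = "./checkpoints") -> list:
    """Return the expected checkpoint files that are absent under `root`."""
    base = Path(root)
    return [name for name in EXPECTED if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoint files:", ", ".join(missing))
    else:
        print("All checkpoint files found.")
```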

## &#x1F4A1; Inference

The default inputs are an image (.jpg/.png/.jpeg) and a dance video (.mp4/.mov). The default output is an 81-frame video (.mp4) at 832x480 resolution, saved in the `./outputs` directory. We provide a set of example data in [Animate-X++ example data](https://huggingface.co/Shuaishuai0219/Animate-X-plusplus); please put it in `./data`.

1. Pre-process the video:
```bash
python process_data.py \
    --source_video_paths data/videos \
    --saved_pose_dir data/saved_pkl \
    --saved_pose data/saved_pose \
    --saved_frame_dir data/saved_frames
```
2. Run Animate-X++. We provide a simple version (recommended):
- If you have multiple GPUs for inference, we also support Unified Sequence Parallel (USP); note that Python >= 3.10 is required for USP:
```bash
pip install xfuser
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node=4 examples/inference_480p_usp.py
# or
CUDA_VISIBLE_DEVICES=0 torchrun --standalone --nproc_per_node=1 examples/inference_480p_usp.py
```
- Full model of Animate-X++:
```bash
CUDA_VISIBLE_DEVICES=0 torchrun --standalone --nproc_per_node=1 examples/inference_480p.py
```

**&#10004; Some tips**:

> Although Animate-X++ does not rely on strict pose alignment, and we did not perform any manual alignment for the results in the paper, we cannot guarantee that every case is perfect. Users can therefore align poses by hand, e.g., applying an overall x/y translation and scaling to the pose skeleton of each frame so that it matches the position of the subject in the reference image (put the results in `data/saved_pose`).
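
The manual alignment described in the tip can be sketched as a global scale about the skeleton's centroid followed by a translation. The snippet below is only an illustration: the `(N, 2)` keypoint array format is an assumption for the example, not the actual layout of the saved `.pkl` pose files.

```python
# Illustrative pose-alignment sketch: uniformly scale 2D keypoints about
# their centroid, then translate by (dx, dy) so the skeleton lines up with
# the subject in the reference image.
import numpy as np


def align_pose(keypoints: np.ndarray, scale: float, dx: float, dy: float) -> np.ndarray:
    """Scale keypoints about their centroid, then translate by (dx, dy)."""
    center = keypoints.mean(axis=0, keepdims=True)
    return (keypoints - center) * scale + center + np.array([dx, dy])


if __name__ == "__main__":
    # Toy 3-joint skeleton in pixel coordinates.
    pose = np.array([[100.0, 200.0], [120.0, 240.0], [80.0, 240.0]])
    print(align_pose(pose, scale=1.2, dx=15.0, dy=-10.0))
```

The same transform would be applied per frame before writing the adjusted poses back into `data/saved_pose`.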

## &#x1F4E7; Acknowledgement
Our implementation is based on [UniAnimate-DiT](https://github.com/ali-vilab/UniAnimate-DiT), [MimicMotion](https://github.com/Tencent/MimicMotion), and [MusePose](https://github.com/TMElyralab/MusePose). Thanks for their remarkable contributions and released code! If we have missed any open-source projects or related articles, we will gladly add an acknowledgement for that specific work immediately.

## &#x2696; License
This repository is released under the Apache-2.0 license as found in the [LICENSE](LICENSE) file.

## &#x1F4DA; Citation
If you find this codebase useful for your research, please use the following entries.
```BibTeX
@article{AnimateX2025,
  title={Animate-X: Universal Character Image Animation with Enhanced Motion Representation},
  author={Tan, Shuai and Gong, Biao and Wang, Xiang and Zhang, Shiwei and Zheng, Dandan and Zheng, Ruobing and Zheng, Kecheng and Chen, Jingdong and Yang, Ming},
  journal={ICLR 2025},
  year={2025}
}

@article{Mimir2025,
  title={Mimir: Improving Video Diffusion Models for Precise Text Understanding},
  author={Tan, Shuai and Gong, Biao and Feng, Yutong and Zheng, Kecheng and Zheng, Dandan and Shi, Shuwei and Shen, Yujun and Chen, Jingdong and Yang, Ming},
  journal={arXiv preprint arXiv:2412.03085},
  year={2025}
}
```