| license: mit | |
| task_categories: | |
| - image-to-video | |
| language: | |
| - en | |
| tags: | |
| - diffusion | |
| - generation | |
| - human | |
| - animation | |
| size_categories: | |
| - 10K<n<100K | |
| ## Data for Human Image Animation | |
| <img src="https://github.com/user-attachments/assets/01174ec4-c076-4947-966c-01d511d0383e"> | |
| ## Introduction | |
| <b>TL;DR:</b> With the rapid development of generative models, both diffusion-based and flow-based, human-centric tasks such as pose-driven human image animation, audio-driven action generation, diffusion-based pose estimation, and human optical estimation have attracted considerable attention. | |
| We focus on the quality of the human training data for these tasks. High-quality datasets are scarce, especially for human image animation, and we find it hard to collect videos from existing public datasets that have the following characteristics: | |
| 1. High-resolution: each vertical video has a resolution larger than 1080 * 576. | |
| 2. High-dynamic: the videos are vivid and suitable for learning human motion. | |
| 3. Dancing-style: at this stage, we focus on the human animation task and mainly collect TikTok-style videos. | |
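The resolution criterion above can be sketched as a simple check. This is a minimal illustration, not the filtering code used to build the dataset: the function name `is_high_res_vertical`, the reading of "1080 * 576" as height by width for a portrait video, and the inclusive bound are all assumptions, and the sample sizes are hypothetical.

```python
def is_high_res_vertical(width: int, height: int,
                         min_height: int = 1080, min_width: int = 576) -> bool:
    """Return True for a portrait video of at least min_height x min_width pixels.

    Assumption: "1080 * 576" is read as height x width, and the bound is
    treated as inclusive; the dataset card does not state either detail.
    """
    is_vertical = height > width
    return is_vertical and height >= min_height and width >= min_width


# Hypothetical (width, height) pairs, not taken from the dataset.
candidates = {
    "a.mp4": (1080, 1920),  # portrait, large enough
    "b.mp4": (576, 1024),   # portrait, too short
    "c.mp4": (1920, 1080),  # landscape
}
kept = [name for name, (w, h) in candidates.items() if is_high_res_vertical(w, h)]
print(kept)
```

The motion and style criteria are qualitative and are not covered by this sketch.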
| ## What we do | |
| We collected a large number of videos from the internet. After filtering out low-quality videos, videos with limited motion, and bad frames, 25,000 videos remain in this repo. We provide visualizations of these videos together with the corresponding pose data, so you can inspect each training video used in our work. | |
| Notice: commercial use of these videos is not allowed, and you must delete them within 24 hours of downloading. | |
| Tip: If you believe this dataset infringes on your rights, please contact us immediately to request removal of your data. | |
| ## Citation | |
| If you find this work helpful, please consider citing: | |
| ``` | |
| @article{zhao2025dynamictrl, | |
| title={DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation}, | |
| author={Zhao, Haoyu and Qi, Zhongang and Wang, Cong and Zheng, Qingping and Lu, Guansong and Chen, Fei and Xu, Hang and Wu, Zuxuan}, | |
| year={2025}, | |
| journal={arXiv:2503.21246}, | |
| } | |
| ``` | |