Update README.md
README.md
CHANGED
|
@@ -18,6 +18,21 @@ license: apache-2.0
|
|
| 18 |
<img src="teaser.png" width=95%>
|
| 19 |
<p>
|
| 20 |
|
| 21 |
## ⚡️ Quickstart
|
| 22 |
|
| 23 |
### Model Preparation
|
|
@@ -45,7 +60,7 @@ python generate_dreamidv.py \
|
|
| 45 |
--size 832*480 \
|
| 46 |
--ckpt_dir wan2.1-1.3B path \
|
| 47 |
--dreamidv_ckpt dreamidv.pth path \
|
| 48 |
-
--sample_steps
|
| 49 |
--base_seed 42
|
| 50 |
```
|
| 51 |
|
|
@@ -57,21 +72,62 @@ torchrun --nproc_per_node=2 generate_dreamidv.py \
|
|
| 57 |
--size 832*480 \
|
| 58 |
--ckpt_dir wan2.1-1.3B path \
|
| 59 |
--dreamidv_ckpt dreamidv.pth path \
|
| 60 |
-
--sample_steps
|
| 61 |
--dit_fsdp \
|
| 62 |
--t5_fsdp \
|
| 63 |
--ulysses_size 2 \
|
| 64 |
--ring_size 1 \
|
| 65 |
--base_seed 42
|
| 66 |
```
|
| 67 |
|
| 68 |
## Acknowledgements
|
| 69 |
-
Our work builds upon and is greatly inspired by several outstanding open-source projects, including [Wan2.1](https://github.com/Wan-Video/Wan2.1), [Phantom](https://github.com/Phantom-video/Phantom), [OpenHumanVid](https://github.com/fudan-generative-vision/OpenHumanVid), and [Follow-Your-Emoji](https://github.com/mayuelala/FollowYourEmoji). We sincerely thank the authors and contributors of these projects for generously sharing their excellent code and ideas.
|
| 70 |
|
| 71 |
|
| 72 |
## 📧 Contact
|
| 73 |
-
If you have any comments or questions regarding this open-source project, please open a new issue or contact [Xu Guo](https://github.com/Guoxu1233/).
|
| 74 |
|
| 75 |
|
| 76 |
## Citation
|
| 77 |
|
|
@@ -87,4 +143,4 @@ If you find our work helpful, please consider citing our paper and leaving valua
|
|
| 87 |
primaryClass={cs.CV},
|
| 88 |
url={https://arxiv.org/abs/2601.01425},
|
| 89 |
}
|
| 90 |
-
```
|
|
|
|
| 18 |
<img src="teaser.png" width=95%>
|
| 19 |
<p>
|
| 20 |
|
| 21 |
+
## 🔥 News
|
| 22 |
+
- [01/08/2026] 🔥 Thanks to HM-RunningHub for adding [ComfyUI](https://github.com/HM-RunningHub/ComfyUI_RH_DreamID-V) support!
|
| 23 |
+
- [01/06/2026] 🔥 Our [paper](https://arxiv.org/abs/2601.01425) is released!
|
| 24 |
+
- [01/05/2026] 🔥 Our code is released!
|
| 25 |
+
- [12/17/2025] 🔥 Our [project](https://guoxu1233.github.io/DreamID-V/) is released!
|
| 26 |
+
- [08/11/2025] Our image version, [DreamID](https://superhero-7.github.io/DreamID/), was accepted to SIGGRAPH Asia 2025!
|
| 27 |
+
|
| 28 |
+
|
| 29 |
+
## 💡 Usage Tips
|
| 30 |
+
- **Reference Image Preparation**: Please upload **cropped face images** (recommended resolution: 512x512) as the reference. Avoid full-body photos, which make it harder to preserve identity.
|
| 31 |
+
- **Inference Steps**: For simple scenes, you can reduce the sampling steps to **20** to significantly decrease inference time.
|
| 32 |
+
> *Note*: Our internal model based on Seedance1.0 achieves high quality in under 8 steps. Feel free to try it at [CapCut](https://www.capcut.cn/).
|
| 33 |
+
- **Best Quality**: For the highest fidelity results, we recommend using a resolution of **1280x720**.
|
| 34 |
+
- **Enhanced Pose Detection**: We have resolved the previous pose detection issue by introducing [**DreamID-V-Wan-1.3B-DWPose**](https://github.com/bytedance/DreamID-V/tree/main?tab=readme-ov-file#dreamid-v-wan-13b-dwpose). This significantly improves stability and robustness in pose extraction.
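As a rough illustration of the reference-image tip, the sketch below computes the largest centered square crop box for an image of a given size, which can then be resized to 512x512. The function name `center_square_box` is hypothetical and not part of DreamID-V, and the sketch assumes the face is already roughly centered; a face detector would be needed otherwise.

```python
from typing import Tuple


def center_square_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square.

    Hypothetical helper for preparing a square reference crop; it does not
    detect faces, it only centers the crop.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


# Example: a 1280x720 frame yields a 720x720 box centered horizontally.
box = center_square_box(1280, 720)
print(box)
```

With an image library such as Pillow, the returned box could be passed to `Image.crop` and the result resized to `(512, 512)`.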
|
| 35 |
+
|
| 36 |
## ⚡️ Quickstart
|
| 37 |
|
| 38 |
### Model Preparation
|
|
|
|
| 60 |
--size 832*480 \
|
| 61 |
--ckpt_dir wan2.1-1.3B path \
|
| 62 |
--dreamidv_ckpt dreamidv.pth path \
|
| 63 |
+
--sample_steps 20 \
|
| 64 |
--base_seed 42
|
| 65 |
```
|
| 66 |
|
|
|
|
| 72 |
--size 832*480 \
|
| 73 |
--ckpt_dir wan2.1-1.3B path \
|
| 74 |
--dreamidv_ckpt dreamidv.pth path \
|
| 75 |
+
--sample_steps 20 \
|
| 76 |
--dit_fsdp \
|
| 77 |
--t5_fsdp \
|
| 78 |
--ulysses_size 2 \
|
| 79 |
--ring_size 1 \
|
| 80 |
--base_seed 42
|
| 81 |
```
|
| 82 |
+
#### DreamID-V-Wan-1.3B-DWPose
|
| 83 |
+
Please ensure the pose estimation models are placed in the correct directory as follows:
|
| 84 |
+
```text
|
| 85 |
+
DreamID-V/
|
| 86 |
+
├── pose/
|
| 87 |
+
    └── models/
|
| 88 |
+
        ├── dw-ll_ucoco_384.onnx
|
| 89 |
+
        └── yolox_l.onnx
|
| 90 |
+
```
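Before running the DWPose script, it may help to confirm that both model files are in place. The following is a minimal sketch, assuming the two `.onnx` files sit under `pose/models/` inside the repository root as the listing above suggests; the helper name `missing_pose_models` is hypothetical and not part of DreamID-V.

```python
from pathlib import Path
from typing import List

# File names come from the directory listing; the pose/models location is an assumption.
REQUIRED_POSE_MODELS = ("dw-ll_ucoco_384.onnx", "yolox_l.onnx")


def missing_pose_models(repo_root: str) -> List[str]:
    """Return the names of required pose model files not found under pose/models."""
    models_dir = Path(repo_root) / "pose" / "models"
    return [name for name in REQUIRED_POSE_MODELS
            if not (models_dir / name).is_file()]


if __name__ == "__main__":
    # Check the current directory, assumed here to be the DreamID-V checkout.
    missing = missing_pose_models(".")
    if missing:
        print("Missing pose models:", ", ".join(missing))
    else:
        print("All pose models found.")
```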
|
| 91 |
+
- Single-GPU inference
|
| 92 |
+
|
| 93 |
+
``` sh
|
| 94 |
+
python generate_dreamidv_dwpose.py \
|
| 95 |
+
--size 832*480 \
|
| 96 |
+
--ckpt_dir wan2.1-1.3B path \
|
| 97 |
+
--dreamidv_ckpt dreamidv.pth path \
|
| 98 |
+
--sample_steps 20 \
|
| 99 |
+
--base_seed 42
|
| 100 |
+
```
|
| 101 |
+
- Multi-GPU inference using FSDP + xDiT USP
|
| 102 |
+
|
| 103 |
+
``` sh
|
| 104 |
+
pip install "xfuser>=0.4.1"
|
| 105 |
+
torchrun --nproc_per_node=2 generate_dreamidv_dwpose.py \
|
| 106 |
+
--size 832*480 \
|
| 107 |
+
--ckpt_dir wan2.1-1.3B path \
|
| 108 |
+
--dreamidv_ckpt dreamidv.pth path \
|
| 109 |
+
--sample_steps 20 \
|
| 110 |
+
--dit_fsdp \
|
| 111 |
+
--t5_fsdp \
|
| 112 |
+
--ulysses_size 2 \
|
| 113 |
+
--ring_size 1 \
|
| 114 |
+
--base_seed 42
|
| 115 |
+
```
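In the multi-GPU commands, `--nproc_per_node=2` is paired with `--ulysses_size 2` and `--ring_size 1`. Wan2.1, which this project builds on, requires the product of the two sizes to equal the number of processes; whether the DreamID-V scripts enforce the same check is an assumption here. The sketch below builds the command as an argument list and validates that product first. The helper `build_torchrun_cmd` is hypothetical.

```python
from typing import List, Sequence


def build_torchrun_cmd(script: str, nproc: int, ulysses_size: int,
                       ring_size: int,
                       extra_args: Sequence[str] = ()) -> List[str]:
    """Build the multi-GPU command as an argument list, checking the split first."""
    # Assumption: as in Wan2.1, ulysses_size * ring_size must equal the process count.
    if ulysses_size * ring_size != nproc:
        raise ValueError(
            f"ulysses_size ({ulysses_size}) * ring_size ({ring_size}) "
            f"must equal nproc_per_node ({nproc})"
        )
    return [
        "torchrun", f"--nproc_per_node={nproc}", script,
        *extra_args,
        "--dit_fsdp", "--t5_fsdp",
        "--ulysses_size", str(ulysses_size),
        "--ring_size", str(ring_size),
    ]


cmd = build_torchrun_cmd(
    "generate_dreamidv_dwpose.py", nproc=2, ulysses_size=2, ring_size=1,
    extra_args=["--size", "832*480", "--sample_steps", "20", "--base_seed", "42"],
)
print(" ".join(cmd))
```

Passing the arguments as a list (for example to `subprocess.run`) also keeps `832*480` from being glob-expanded by a shell.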
|
| 116 |
+
|
| 117 |
|
| 118 |
## Acknowledgements
|
| 119 |
+
Our work builds upon and is greatly inspired by several outstanding open-source projects, including [Wan2.1](https://github.com/Wan-Video/Wan2.1), [Phantom](https://github.com/Phantom-video/Phantom), [OpenHumanVid](https://github.com/fudan-generative-vision/OpenHumanVid), [Follow-Your-Emoji](https://github.com/mayuelala/FollowYourEmoji), and [DWPose](https://github.com/IDEA-Research/DWPose). We sincerely thank the authors and contributors of these projects for generously sharing their excellent code and ideas.
|
| 120 |
|
| 121 |
|
| 122 |
## 📧 Contact
|
| 123 |
+
If you have any comments or questions regarding this open-source project, please open a new issue or contact [Xu Guo](https://github.com/Guoxu1233/) and [Fulong Ye](https://github.com/superhero-7).
|
| 124 |
|
| 125 |
+
## ⚠️ Ethics Statement
|
| 126 |
+
This project, **DreamID-V**, is intended for **academic research and technical demonstration purposes only**.
|
| 127 |
+
- **Prohibited Use**: Users are strictly prohibited from using this codebase to generate content that is illegal, defamatory, pornographic, harmful, or infringes upon the privacy and rights of others.
|
| 128 |
+
- **Responsibility**: Users bear full responsibility for the content they generate. The authors and contributors of this project assume no liability for any misuse or consequences arising from the use of this software.
|
| 129 |
+
- **AI Labeling**: We strongly recommend marking generated videos as "AI-Generated" to prevent misinformation.
|
| 130 |
+
By using this software, you agree to adhere to these guidelines and applicable local laws.
|
| 131 |
|
| 132 |
## Citation
|
| 133 |
|
|
|
|
| 143 |
primaryClass={cs.CV},
|
| 144 |
url={https://arxiv.org/abs/2601.01425},
|
| 145 |
}
|
| 146 |
+
```
|