---

## 🔥 News

* **[2025.08.13]** Special thanks to @kijai for integrating Stand-In into the custom ComfyUI node **WanVideoWrapper**. However, that implementation differs from the official version, which may affect Stand-In's performance. To partially mitigate this issue, we have urgently released the official Stand-In preprocessing ComfyUI node: 👉 https://github.com/WeChatCV/Stand-In_Preprocessor_ComfyUI. If you wish to experience Stand-In within ComfyUI, please use **our official preprocessing node** in place of the one implemented by kijai. For the best results, we recommend waiting for the release of our full **official Stand-In ComfyUI**.

* **[2025.08.12]** Released Stand-In v1.0 (153M parameters). The Wan2.1-14B-T2V-adapted weights and inference code are now open-sourced.

---
### 1. Environment Setup

```bash
# Clone the project repository
git clone https://github.com/WeChatCV/Stand-In.git
cd Stand-In

# Create and activate Conda environment
```
Use the `infer.py` script for standard identity-preserving text-to-video generation.

```bash
python infer.py \
    --prompt "A man sits comfortably at a desk, facing the camera as if talking to a friend or family member on the screen. His gaze is focused and gentle, with a natural smile. The background is his carefully decorated personal space, with photos and a world map on the wall, conveying a sense of intimate and modern communication." \
    --ip_image "test/input/lecun.jpg" \
    --output "test/output/lecun.mp4"
```

**Prompt Writing Tip:** If you do not wish to alter the subject's facial features, simply use *"a man"* or *"a woman"* without adding extra descriptions of their appearance. Prompts support both Chinese and English input and are best suited to generating frontal, medium-to-close-up videos.

**Input Image Recommendation:** For best results, use a high-resolution frontal face image. There are no restrictions on resolution or file extension; the built-in preprocessing pipeline handles them automatically.
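Stand-In's preprocessing accepts arbitrary input images, so no manual preparation is needed. If you still prefer to pre-crop a face photo yourself before passing it to `--ip_image`, a minimal Pillow sketch follows; the helper name and file paths are illustrative and not part of Stand-In:

```python
# Optional, illustrative helper: Stand-In's built-in preprocessing already
# handles arbitrary resolutions, so this pre-crop step is not required.
from PIL import Image


def center_square_crop(img: Image.Image, size: int = 512) -> Image.Image:
    """Center-crop an image to a square, then resize it to size x size."""
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    square = img.crop((left, top, left + side, top + side))
    return square.resize((size, size), Image.LANCZOS)


# Example with a synthetic 800x600 image; in practice you would use
# Image.open("test/input/lecun.jpg") or your own photo.
face = center_square_crop(Image.new("RGB", (800, 600)), size=512)
print(face.size)  # (512, 512)
```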
### Inference with Community LoRA
If you find our work helpful for your research, please consider citing our paper:

```bibtex
@article{xue2025standin,
  title={Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation},
  author={Bowen Xue and Qixin Yan and Wenjing Wang and Hao Liu and Chen Li},
  journal={arXiv preprint arXiv:2508.07901},
  year={2025},
}
```

## 📬 Contact Us

If you have any questions or suggestions, feel free to reach out via [GitHub Issues](https://github.com/WeChatCV/Stand-In/issues). We look forward to your feedback!