---

## 🔥 News

* **[2025.08.13]** Special thanks to @kijai for integrating Stand-In into the custom ComfyUI node **WanVideoWrapper**. However, that implementation differs from the official version, which may affect Stand-In's performance. To partially mitigate this issue, we have urgently released the official Stand-In preprocessing ComfyUI node: 👉 https://github.com/WeChatCV/Stand-In_Preprocessor_ComfyUI. If you wish to experience Stand-In within ComfyUI, please use **our official preprocessing node** in place of the one implemented by kijai. For the best results, we recommend waiting for the release of our full **official Stand-In ComfyUI**.

* **[2025.08.12]** Released Stand-In v1.0 (153M parameters). The Wan2.1-14B-T2V-adapted weights and inference code are now open-sourced.

---
### 1. Environment Setup

```bash
# Clone the project repository
git clone https://github.com/WeChatCV/Stand-In.git
cd Stand-In

# Create and activate Conda environment
```
Use the `infer.py` script for standard identity-preserving text-to-video generation.

```bash
python infer.py \
    --prompt "A man sits comfortably at a desk, facing the camera as if talking to a friend or family member on the screen. His gaze is focused and gentle, with a natural smile. The background is his carefully decorated personal space, with photos and a world map on the wall, conveying a sense of intimate and modern communication." \
    --ip_image "test/input/lecun.jpg" \
    --output "test/output/lecun.mp4"
```

**Prompt Writing Tip:** If you do not wish to alter the subject's facial features, simply use *"a man"* or *"a woman"* without adding extra descriptions of their appearance. Prompts support both Chinese and English input and are best suited to generating frontal, medium-to-close-up videos.

**Input Image Recommendation:** For best results, use a high-resolution frontal face image. There are no restrictions on resolution or file extension; the built-in preprocessing pipeline handles them automatically.
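Stand-In's preprocessing accepts arbitrary input images, so no manual preparation is needed. If you still prefer to pre-crop a face photo yourself before passing it to `--ip_image`, a minimal Pillow sketch follows; the helper name and file paths are illustrative and not part of Stand-In:

```python
# Optional, illustrative helper: Stand-In's built-in preprocessing already
# handles arbitrary resolutions, so this pre-crop step is not required.
from PIL import Image


def center_square_crop(img: Image.Image, size: int = 512) -> Image.Image:
    """Center-crop an image to a square, then resize it to size x size."""
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    square = img.crop((left, top, left + side, top + side))
    return square.resize((size, size), Image.LANCZOS)


# Example with a synthetic 800x600 image; in practice you would use
# Image.open("test/input/lecun.jpg") or your own photo.
face = center_square_crop(Image.new("RGB", (800, 600)), size=512)
print(face.size)  # (512, 512)
```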
### Inference with Community LoRA
If you find our work helpful for your research, please consider citing our paper:

```bibtex
@article{xue2025standin,
  title={Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation},
  author={Bowen Xue and Qixin Yan and Wenjing Wang and Hao Liu and Chen Li},
  journal={arXiv preprint arXiv:2508.07901},
  year={2025},
}
```

## 📬 Contact Us

If you have any questions or suggestions, feel free to reach out via [GitHub Issues](https://github.com/WeChatCV/Stand-In/issues). We look forward to your feedback!