| --- |
| license: mit |
| --- |
| |
|
|
| <div align="center"> |
| |
| <h1>ShotDirector: Directorially Controllable Multi-Shot Video Generation with Cinematographic Transitions</h1> |
|
|
| [GitHub](https://github.com/UknowSth/ShotDirector) · [arXiv](https://arxiv.org/abs/2512.10286) |
|
|
| |
| <p><a href="https://scholar.google.com/citations?hl=zh-CN&user=TbZZSVgAAAAJ">Xiaoxue Wu</a><sup>1,2*</sup>, |
| <a href="https://scholar.google.com/citations?user=3fWSC8YAAAAJ">Xinyuan Chen</a><sup>2†</sup>, |
| <a href="https://wyhsirius.github.io/">Yaohui Wang</a><sup>2†</sup>, |
| <a href="https://scholar.google.com.hk/citations?user=gFtI-8QAAAAJ&hl=zh-CN">Yu Qiao</a><sup>2†</sup> |
| </p> |
| |
|
|
| <span class="author-block"><sup>1</sup>Fudan University</span> |
| <span class="author-block"><sup>2</sup>Shanghai Artificial Intelligence Laboratory</span> |
|
|
|
|
| <span class="author-block"><sup>*</sup>Work done during internship at Shanghai AI Laboratory</span> <span class="author-block"><sup>†</sup>Corresponding author</span> |
| |
| </div> |
| |
| ## Installation |
| 1. Clone the Repository |
| ``` |
| git clone https://github.com/UknowSth/ShotDirector.git |
| cd ShotDirector |
| ``` |
| 2. Set up Environment |
| ``` |
| conda create -n shotdirector python==3.11.9 |
| conda activate shotdirector |
| |
| pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu118 |
| pip install -r requirements.txt |
| ``` |
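Once the environment is set up, you may want to confirm that the pinned versions were actually installed. The snippet below is a minimal sketch and is not part of the repository: `check_pins` is a hypothetical helper, and it assumes that wheels from the cu118 index may report a local version tag (for example `+cu118`), which it strips before comparing.

```python
from importlib import metadata

# Pins copied from the pip command above.
PINS = {"torch": "2.5.1", "torchvision": "0.20.1"}


def check_pins(pins):
    """Return {package: problem} for every pin that is missing or mismatched."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        # Drop a local build tag such as "+cu118" before comparing (assumption).
        base = installed.split("+", 1)[0]
        if base != wanted:
            problems[name] = f"found {installed}, expected {wanted}"
    return problems


if __name__ == "__main__":
    issues = check_pins(PINS)
    for name, problem in issues.items():
        print(f"{name}: {problem}")
    if not issues:
        print("All pinned packages match.")
```

Run it inside the activated `shotdirector` environment; an empty result means both pins match.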
| |
| ## Checkpoint |
| |
| ### CineTrans-DiT |
| Download the weights of [Wan2.1-T2V-1.3B](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B) and the weights required for ShotDirector, then place them in the `ckpt/` folder with the following layout. |
| |
| ``` |
| ckpt/ |
| ├── Wan2.1/Wan2.1-T2V-1.3B/ |
| │   ├── config.json |
| │   ├── diffusion_pytorch_model.safetensors |
| │   ├── google/ |
| │   ├── models_t5_umt5-xxl-enc-bf16.pth |
| │   └── Wan2.1_VAE.pth |
| ├── encoder.pt |
| ├── model.pt |
| ├── trans.pt |
| ``` |
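Before running inference, it may help to check that every file from the tree above is in place. The following is a small sketch under stated assumptions, not code from the repository: `missing_checkpoints` is a hypothetical helper, the checkpoint folder is assumed to be named `ckpt` relative to the current directory, and the nesting of files under `Wan2.1/Wan2.1-T2V-1.3B/` is read directly from the tree.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED = [
    "Wan2.1/Wan2.1-T2V-1.3B/config.json",
    "Wan2.1/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
    "Wan2.1/Wan2.1-T2V-1.3B/google",
    "Wan2.1/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
    "Wan2.1/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
    "encoder.pt",
    "model.pt",
    "trans.pt",
]


def missing_checkpoints(root="ckpt"):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoint entries:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("Checkpoint layout looks complete.")
```

The check only tests for existence; it does not validate file sizes or hashes.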
| |
| For more inference details, please refer to our [GitHub repository](https://github.com/UknowSth/ShotDirector). |
| |
| ## BibTeX |
| If you find [ShotDirector](https://github.com/UknowSth/ShotDirector.git) useful for your research and applications, please cite using this BibTeX: |
| ``` |
| @misc{wu2025shotdirectordirectoriallycontrollablemultishot, |
| title={ShotDirector: Directorially Controllable Multi-Shot Video Generation with Cinematographic Transitions}, |
| author={Xiaoxue Wu and Xinyuan Chen and Yaohui Wang and Yu Qiao}, |
| year={2025}, |
| eprint={2512.10286}, |
| archivePrefix={arXiv}, |
| primaryClass={cs.CV}, |
| url={https://arxiv.org/abs/2512.10286}, |
| } |
| ``` |