+---
+license: mit
+datasets:
+- NumlockUknowSth/Cine250K
+language:
+- en
+pipeline_tag: text-to-video
+tags:
+- multi-shot
+---
+<div align="center">
+<h1>CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models</h1>
+[![](https://img.shields.io/static/v1?label=CineTrans&message=Project&color=purple)](https://uknowsth.github.io/CineTrans/)   [![](https://img.shields.io/static/v1?label=Paper&message=Arxiv&color=red&logo=arxiv)](https://arxiv.org/abs/2508.11484)   [![](https://img.shields.io/static/v1?label=Code&message=Github&color=blue&logo=github)](https://github.com/Vchitect/CineTrans)   [![](https://img.shields.io/static/v1?label=Dataset&message=HuggingFace&color=yellow&logo=huggingface)](https://huggingface.co/datasets/NumlockUknowSth/Cine250K)
+<p><a href="https://scholar.google.com/citations?hl=zh-CN&user=TbZZSVgAAAAJ">Xiaoxue Wu</a><sup>1,2*</sup>,
+<a href="https://scholar.google.com/citations?user=0gY2o7MAAAAJ&amp;hl=zh-CN" target="_blank">Bingjie Gao</a><sup>2,3</sup>,
+<a href="https://scholar.google.com.hk/citations?user=gFtI-8QAAAAJ&amp;hl=zh-CN">Yu Qiao</a><sup>2&dagger;</sup>,
+<a href="https://wyhsirius.github.io/">Yaohui Wang</a><sup>2&dagger;</sup>,
+<a href="https://scholar.google.com/citations?user=3fWSC8YAAAAJ">Xinyuan Chen</a><sup>2&dagger;</sup></p>
+<span class="author-block"><sup>1</sup>Fudan University</span>
+<span class="author-block"><sup>2</sup>Shanghai Artificial Intelligence Laboratory</span>
+<span class="author-block"><sup>3</sup>Shanghai Jiao Tong University</span>
+<span class="author-block"><sup>*</sup>Work done during internship at Shanghai AI Laboratory</span> <span class="author-block"><sup>&dagger;</sup>Corresponding author</span>
+</div>
+## 📥 Installation
+1. Clone the Repository
+```
+git clone https://github.com/Vchitect/CineTrans.git
+cd CineTrans
+```
+2. Set up Environment
+```
+conda create -n cinetrans python==3.11.9
+conda activate cinetrans
+pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu118
+pip install -r requirements.txt
+```
+## 🤗 Checkpoint
+### CineTrans-Unet
+Download the required [model weights](https://huggingface.co/NumlockUknowSth/CineTrans-Unet/tree/main) and place them in the `ckpt/` directory.
+```
+ckpt/
+├── stable-diffusion-v1-4/
+│   ├── scheduler/
+│   ├── text_encoder/
+│   ├── tokenizer/
+│   ├── unet/
+│   └── vae_temporal_decoder/
+├── checkpoint.pt
+└── longclip-L.pt
+```
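Before running inference, it can help to confirm that the downloaded files match the tree above. The following is a minimal sketch, not part of the CineTrans codebase: the helper name `missing_ckpt_entries` and the constants are hypothetical, and the expected paths are copied directly from the listing above.

```python
from pathlib import Path

# Expected entries, copied from the ckpt/ directory tree in this README.
EXPECTED_DIRS = [
    "stable-diffusion-v1-4/scheduler",
    "stable-diffusion-v1-4/text_encoder",
    "stable-diffusion-v1-4/tokenizer",
    "stable-diffusion-v1-4/unet",
    "stable-diffusion-v1-4/vae_temporal_decoder",
]
EXPECTED_FILES = ["checkpoint.pt", "longclip-L.pt"]


def missing_ckpt_entries(root="ckpt"):
    """Return the expected paths (relative to root) that are not present.

    Hypothetical helper for illustration; it only checks that the
    directories and files exist, not that their contents are valid.
    """
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    missing = missing_ckpt_entries()
    if missing:
        print("Missing from ckpt/:")
        for entry in missing:
            print("  " + entry)
    else:
        print("ckpt/ layout looks complete.")
```

Run it from the repository root (the directory that contains `ckpt/`). An empty result only means the layout is in place; it does not verify file integrity.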
+For more inference details, please refer to the [GitHub repository](https://github.com/Vchitect/CineTrans).
+---
+## 📑 BibTeX
+If you find [CineTrans](https://github.com/Vchitect/CineTrans) useful for your research or applications, please cite it using the following BibTeX entry:
+```
+@misc{wu2025cinetranslearninggeneratevideos,
+      title={CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models},
+      author={Xiaoxue Wu and Bingjie Gao and Yu Qiao and Yaohui Wang and Xinyuan Chen},
+      year={2025},
+      eprint={2508.11484},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2508.11484},
+}
+```