---
license: mit
datasets:
- NumlockUknowSth/Cine250K
language:
- en
base_model:
- Wan-AI/Wan2.1-T2V-1.3B
pipeline_tag: text-to-video
tags:
- multi-shot
---

# CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models

[![](https://img.shields.io/static/v1?label=CineTrans&message=Project&color=purple)](https://uknowsth.github.io/CineTrans/)   [![](https://img.shields.io/static/v1?label=Paper&message=Arxiv&color=red&logo=arxiv)](https://arxiv.org/abs/2508.11484)   [![](https://img.shields.io/static/v1?label=Code&message=Github&color=blue&logo=github)](https://github.com/Vchitect/CineTrans)   [![](https://img.shields.io/static/v1?label=Dataset&message=HuggingFace&color=yellow&logo=huggingface)](https://huggingface.co/datasets/NumlockUknowSth/Cine250K)   

Xiaoxue Wu<sup>1,2*</sup>, Bingjie Gao<sup>2,3</sup>, Yu Qiao<sup>2†</sup>, Yaohui Wang<sup>2†</sup>, Xinyuan Chen<sup>2†</sup>

<sup>1</sup>Fudan University <sup>2</sup>Shanghai Artificial Intelligence Laboratory <sup>3</sup>Shanghai Jiao Tong University

<sup>*</sup>Work done during internship at Shanghai AI Laboratory <sup>†</sup>Corresponding author

## 📥 Installation

1. Clone the Repository

```
git clone https://github.com/UknowSth/CineTrans.git
cd CineTrans
```

2. Set up Environment

```
conda create -n cinetrans python==3.11.9
conda activate cinetrans

pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
```

## 🤗 Checkpoint

### CineTrans-DiT

Download the weights of [Wan2.1-T2V-1.3B](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B/tree/main) and the [LoRA weights](https://huggingface.co/NumlockUknowSth/CineTrans-DiT/tree/main). Place them as follows:

```
Wan2.1-T2V-1.3B/ # original weights
│── google/
│   └── umt5-xxl/
│── config.json
│── diffusion_pytorch_model.safetensors
│── models_t5_umt5-xxl-enc-bf16.pth
│── Wan2.1_VAE.pth
ckpt/
└── weights.pt # lora weights
```

For more inference details, please refer to our [GitHub repository](https://github.com/Vchitect/CineTrans).

## 📑 BibTeX

If you find [CineTrans](https://github.com/Vchitect/CineTrans.git) useful for your research and applications, please cite using this BibTeX:

```
@misc{wu2025cinetranslearninggeneratevideos,
      title={CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models},
      author={Xiaoxue Wu and Bingjie Gao and Yu Qiao and Yaohui Wang and Xinyuan Chen},
      year={2025},
      eprint={2508.11484},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2508.11484},
}
```
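
## 🔎 Optional: Verify Checkpoint Layout

Before running inference, it can help to confirm that the downloaded files match the layout listed in the Checkpoint section. The script below is a minimal sketch, not part of the CineTrans codebase. The helper name `find_missing` and the constant `EXPECTED_PATHS` are hypothetical, and it assumes that `Wan2.1-T2V-1.3B/` and `ckpt/` sit under one common root directory (for example the cloned repository). It only checks that each path exists, not that the files are complete or valid.

```python
from pathlib import Path

# Paths copied from the layout shown in the Checkpoint section.
# Assumption: both folders live under the same root directory.
EXPECTED_PATHS = [
    "Wan2.1-T2V-1.3B/google/umt5-xxl",
    "Wan2.1-T2V-1.3B/config.json",
    "Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
    "Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
    "Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
    "ckpt/weights.pt",
]


def find_missing(root="."):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    base = Path(root)
    return [rel for rel in EXPECTED_PATHS if not (base / rel).exists()]


if __name__ == "__main__":
    missing = find_missing(".")
    if missing:
        print("Missing files or folders:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected checkpoint files are in place.")
```

Run it from the directory that contains both folders. An empty result from `find_missing` means every listed path was found.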