
	---
	license: mit
	datasets:
	- NumlockUknowSth/Cine250K
	language:
	- en
	base_model:
	- Wan-AI/Wan2.1-T2V-1.3B
	pipeline_tag: text-to-video
	tags:
	- multi-shot
	---

	<div align="center">

	<h1>CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models</h1>

	[![](https://img.shields.io/static/v1?label=CineTrans&message=Project&color=purple)](https://uknowsth.github.io/CineTrans/) [![](https://img.shields.io/static/v1?label=Paper&message=Arxiv&color=red&logo=arxiv)](https://arxiv.org/abs/2508.11484) [![](https://img.shields.io/static/v1?label=Code&message=Github&color=blue&logo=github)](https://github.com/Vchitect/CineTrans) [![](https://img.shields.io/static/v1?label=Dataset&message=HuggingFace&color=yellow&logo=huggingface)](https://huggingface.co/datasets/NumlockUknowSth/Cine250K)


	<p><a href="https://scholar.google.com/citations?hl=zh-CN&user=TbZZSVgAAAAJ">Xiaoxue Wu</a><sup>1,2*</sup>,
	<a href="https://scholar.google.com/citations?user=0gY2o7MAAAAJ&hl=zh-CN" target="_blank">Bingjie Gao</a><sup>2,3</sup>,
	<a href="https://scholar.google.com.hk/citations?user=gFtI-8QAAAAJ&hl=zh-CN">Yu Qiao</a><sup>2&dagger;</sup>,
	<a href="https://wyhsirius.github.io/">Yaohui Wang</a><sup>2&dagger;</sup>,
	<a href="https://scholar.google.com/citations?user=3fWSC8YAAAAJ">Xinyuan Chen</a><sup>2&dagger;</sup></p>


	<span class="author-block"><sup>1</sup>Fudan University</span>
	<span class="author-block"><sup>2</sup>Shanghai Artificial Intelligence Laboratory</span>
	<span class="author-block"><sup>3</sup>Shanghai Jiao Tong University</span>


	<span class="author-block"><sup>*</sup>Work done during internship at Shanghai AI Laboratory</span> <span class="author-block"><sup>&dagger;</sup>Corresponding author</span>

	</div>

	## 📥 Installation
	1. Clone the Repository
	```
	git clone https://github.com/UknowSth/CineTrans.git
	cd CineTrans
	```
	2. Set up Environment
	```
	conda create -n cinetrans python==3.11.9
	conda activate cinetrans

	pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu118
	pip install -r requirements.txt
	```
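
After the install finishes, it can be worth confirming that the pinned versions actually landed in the environment. The snippet below is a minimal sketch and is not part of the CineTrans repository: the helper name `version_mismatches` is hypothetical, and the pins are simply copied from the commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip command above.
EXPECTED = {"torch": "2.5.1", "torchvision": "0.20.1"}


def version_mismatches(expected):
    """Return {package: installed_version_or_None} for every pin that is not satisfied.

    Hypothetical helper for illustration; not shipped with CineTrans.
    """
    problems = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = None  # package is not installed at all
            continue
        # CUDA wheels carry a local suffix such as "2.5.1+cu118";
        # compare only the public part of the version string.
        if installed.split("+")[0] != wanted:
            problems[name] = installed
    return problems


if __name__ == "__main__":
    issues = version_mismatches(EXPECTED)
    print("environment OK" if not issues else f"mismatched packages: {issues}")
```

Run it inside the activated `cinetrans` environment. An empty result means both packages match the pinned versions.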

	## 🤗 Checkpoint

	### CineTrans-DiT
	Download the base weights of [Wan2.1-T2V-1.3B](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B/tree/main) and the CineTrans [LoRA weights](https://huggingface.co/NumlockUknowSth/CineTrans-DiT/tree/main), then arrange them as follows:
	```
	Wan2.1-T2V-1.3B/ # original weights
	│── google/
	│ └── umt5-xxl/
	│── config.json
	│── diffusion_pytorch_model.safetensors
	│── models_t5_umt5-xxl-enc-bf16.pth
	│── Wan2.1_VAE.pth
	ckpt/
	└── weights.pt # LoRA weights
	```
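
A quick way to catch a misplaced file before running inference is to check the layout programmatically. This is a hedged sketch rather than a script shipped with CineTrans: `missing_paths` is a hypothetical helper, and it assumes `Wan2.1-T2V-1.3B/` and `ckpt/` sit side by side under one root directory, as the tree above suggests.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED_PATHS = [
    "Wan2.1-T2V-1.3B/google/umt5-xxl",
    "Wan2.1-T2V-1.3B/config.json",
    "Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
    "Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
    "Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
    "ckpt/weights.pt",
]


def missing_paths(root="."):
    """Return the expected files or directories that are absent under `root`.

    Hypothetical helper for illustration; only existence is checked,
    not file size or integrity.
    """
    base = Path(root)
    return [p for p in EXPECTED_PATHS if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected weight files are in place.")
```

Run it from the directory that contains both folders, or pass another root to `missing_paths`.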

	For more inference details, please refer to our [GitHub repository](https://github.com/Vchitect/CineTrans).

	## 📑 BibTeX
	If you find [CineTrans](https://github.com/Vchitect/CineTrans.git) useful for your research and applications, please cite using this BibTeX:
	```
	@misc{wu2025cinetranslearninggeneratevideos,
	title={CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models},
	author={Xiaoxue Wu and Bingjie Gao and Yu Qiao and Yaohui Wang and Xinyuan Chen},
	year={2025},
	eprint={2508.11484},
	archivePrefix={arXiv},
	primaryClass={cs.CV},
	url={https://arxiv.org/abs/2508.11484},
	}
	```