Add model card for LongVie 2 (#2), opened by nielsr (HF Staff)

README.md ADDED
---
pipeline_tag: text-to-video
license: unknown
---

# LongVie 2: Multimodal Controllable Ultra-Long Video World Model

LongVie 2 is a multimodal controllable world model for generating ultra-long videos with depth and pointmap control signals, as presented in the paper [LongVie 2: Multimodal Controllable Ultra-Long Video World Model](https://huggingface.co/papers/2512.13604). It is an end-to-end autoregressive framework trained to enhance controllability, long-term visual quality, and temporal consistency.

- 📄 [Paper on Hugging Face](https://huggingface.co/papers/2512.13604)
- 🌐 [Project Page](https://vchitect.github.io/LongVie2-project/)
- 💻 [GitHub Repository](https://github.com/Vchitect/LongVie)
- 🤗 [HF Demo](https://huggingface.co/spaces/Vision-CAIR/LongVU)

<div align="center">
<img src="https://longvu.s3.amazonaws.com/assets/demo.gif" alt="LongVie 2 Demo GIF" style="width: 100%; max-width: 650px;">
</div>

## 🚀 Quick Start

### Installation
To get started with LongVie 2, clone the GitHub repository and follow its installation steps:

```bash
conda create -n longvie python=3.10 -y
conda activate longvie
conda install psutil
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
python -m pip install ninja
python -m pip install git+https://github.com/Dao-AILab/flash-attention.git@v2.7.2.post1
git clone https://github.com/Vchitect/LongVie.git
cd LongVie
pip install -e .
```

### Download Weights
1. Download the base model `Wan2.1-I2V-14B-480P`:
```bash
python download_wan2.1.py
```

2. Download the [LongVie2 weights](https://huggingface.co/Vchitect/LongVie2) and place them in `./model/LongVie/`
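
Before launching inference, it can help to confirm both checkpoints landed where the steps above expect. The helper below is a sketch, not part of the official LongVie scripts: the `model/LongVie` path comes from step 2, while the base-model directory name is an assumption based on step 1.

```python
from pathlib import Path

# Hypothetical sanity check, not part of the official LongVie scripts.
# "model/LongVie" is the path named in step 2; the base-model directory
# name is assumed from the model id in step 1.
def missing_checkpoints(root: str = ".") -> list[str]:
    expected = [
        "Wan2.1-I2V-14B-480P",  # base model fetched by download_wan2.1.py
        "model/LongVie",        # LongVie2 weights placed manually
    ]
    return [p for p in expected if not (Path(root) / p).is_dir()]

# An empty list means both checkpoint directories were found.
print(missing_checkpoints())
```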

### Inference
Generate a 5-second video clip (roughly 8-9 minutes on a single A100 GPU) with:
```bash
bash sample_longvideo.sh
```
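
Since ultra-long videos are generated clip by clip, the per-clip figure above gives a rough runtime budget. The 8-9 minute number is the card's; the linear extrapolation and the function itself are a back-of-the-envelope sketch, not a measured benchmark:

```python
# Linear runtime estimate from the ~8-9 min per 5 s clip quoted above.
# Assumes per-clip cost stays constant over the whole video, which is
# an approximation, not a measured result.
def estimated_minutes(video_seconds: float,
                      mins_per_clip: float = 8.5,
                      clip_seconds: float = 5.0) -> float:
    return video_seconds / clip_seconds * mins_per_clip

print(estimated_minutes(60))  # a 1-minute video: about 102 minutes on one A100
```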

## 📖 Citation

If you find this work useful, please consider citing:
```bibtex
@misc{gao2025longvie2,
      title={LongVie 2: Multimodal Controllable Ultra-Long Video World Model},
      author={Jianxiong Gao and Zhaoxi Chen and Xian Liu and Junhao Zhuang and Chengming Xu and Jianfeng Feng and Yu Qiao and Yanwei Fu and Chenyang Si and Ziwei Liu},
      year={2025},
      eprint={2512.13604},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.13604},
}
```