---
pipeline_tag: text-to-video
license: unknown
---

# LongVie 2: Multimodal Controllable Ultra-Long Video World Model

LongVie 2 is a multimodal controllable world model for generating ultra-long videos with depth and pointmap control signals, as presented in the paper [LongVie 2: Multimodal Controllable Ultra-Long Video World Model](https://huggingface.co/papers/2512.13604). It is an end-to-end autoregressive framework trained to enhance controllability, long-term visual quality, and temporal consistency.

- 📝 [Paper on Hugging Face](https://huggingface.co/papers/2512.13604)
- 🌐 [Project Page](https://vchitect.github.io/LongVie2-project/)
- 💻 [GitHub Repository](https://github.com/Vchitect/LongVie)
- 🚀 [HF Demo](https://huggingface.co/spaces/Vision-CAIR/LongVU)

## 🚀 Quick Start

### Installation

To get started with LongVie 2, follow the installation steps from the GitHub repository:

```bash
# Create and activate the conda environment
conda create -n longvie python=3.10 -y
conda activate longvie
conda install psutil

# Install PyTorch (CUDA 12.1 wheels), then build FlashAttention
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
python -m pip install ninja
python -m pip install git+https://github.com/Dao-AILab/flash-attention.git@v2.7.2.post1

# Clone the repository (skip this if you already have it), then install it in editable mode
git clone https://github.com/Vchitect/LongVie
cd LongVie
pip install -e .
```

### Download Weights

1. Download the base model `Wan2.1-I2V-14B-480P`:

   ```bash
   python download_wan2.1.py
   ```

2. Download the [LongVie2 weights](https://huggingface.co/Vchitect/LongVie2) and place them in `./model/LongVie/`.

### Inference

Generate a 5-second video clip (roughly 8 to 9 minutes on a single A100 GPU) with the following command:

```bash
bash sample_longvideo.sh
```

## 📄 Citation

If you find this work useful, please consider citing:

```bibtex
@misc{gao2025longvie2,
  title={LongVie 2: Multimodal Controllable Ultra-Long Video World Model},
  author={Jianxiong Gao and Zhaoxi Chen and Xian Liu and Junhao Zhuang and Chengming Xu and Jianfeng Feng and Yu Qiao and Yanwei Fu and Chenyang Si and Ziwei Liu},
  year={2025},
  eprint={2512.13604},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.13604},
}
```
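
## 🧰 Optional: Scripted Weight Download

The second item of the Download Weights step (fetching the LongVie2 weights into `./model/LongVie/`) can also be scripted. The following is a minimal sketch, not part of the LongVie repository: the helper names `weights_present` and `fetch_longvie_weights` are hypothetical, and it assumes that the `huggingface_hub` package is installed and that the inference script accepts the `Vchitect/LongVie2` repository's file layout as downloaded. If the repository expects a different layout, arrange the files manually as described above.

```python
from pathlib import Path

# Target directory and repo id come from the Download Weights step.
# Everything else here is an illustrative assumption.
LONGVIE_WEIGHTS_DIR = Path("./model/LongVie")
LONGVIE_REPO_ID = "Vchitect/LongVie2"


def weights_present(weights_dir: Path = LONGVIE_WEIGHTS_DIR) -> bool:
    """Return True if the directory exists and contains at least one entry."""
    return weights_dir.is_dir() and any(weights_dir.iterdir())


def fetch_longvie_weights(weights_dir: Path = LONGVIE_WEIGHTS_DIR) -> Path:
    """Download the LongVie2 weights unless the directory is already populated."""
    if weights_present(weights_dir):
        return weights_dir

    # Imported lazily so the presence check works without huggingface_hub.
    from huggingface_hub import snapshot_download

    weights_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the repo layout matches what sample_longvideo.sh expects.
    snapshot_download(repo_id=LONGVIE_REPO_ID, local_dir=str(weights_dir))
    return weights_dir
```

Calling `fetch_longvie_weights()` from the repository root before running `bash sample_longvideo.sh` skips the download when the directory already contains files. The check only tests that the directory is non-empty; it does not verify that the files are complete or uncorrupted.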