# Proteus-ID: ID-Consistent and Motion-Coherent Video Customization

[![Project Website](https://img.shields.io/badge/Project-Website-blue)](https://grenoble-zhang.github.io/Proteus-ID/)  [![arXiv](https://img.shields.io/badge/arXiv-2506.23729-b31b1b.svg)](https://arxiv.org/abs/2506.23729) 
Authors: [Guiyu Zhang](https://grenoble-zhang.github.io/)<sup>1</sup>, [Chen Shi](https://scholar.google.com.hk/citations?user=o-K_AoYAAAAJ&hl=en)<sup>1</sup>, Zijian Jiang<sup>1</sup>, Xunzhi Xiang<sup>2</sup>, Jingjing Qian<sup>1</sup>, [Shaoshuai Shi](https://shishaoshuai.com/)<sup>3</sup>, [Li Jiang†](https://llijiang.github.io/)<sup>1</sup>

<sup>1</sup> The Chinese University of Hong Kong, Shenzhen, <sup>2</sup> Nanjing University, <sup>3</sup> Voyager Research, Didi Chuxing

## TODO

- [x] Release arXiv technical report
- [x] Release full code
- [ ] Release dataset (coming soon)

## 🛠️ Requirements and Installation

### Environment

```bash
# 0. Clone the repo and enter it
git clone --depth=1 https://github.com/grenoble-zhang/Proteus-ID.git
cd Proteus-ID

# 1. Create and activate the conda environment
conda create -n proteusid python=3.11.0
conda activate proteusid

# 2. Install PyTorch (CUDA 12.6 build)
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126

# 3. Install the remaining pip dependencies
pip install -r requirements.txt
```

### Download Model

```bash
cd util
python download_weights.py
python down_raft.py
```

Once the downloads finish, the weights are organized as follows:

```
🔦 ckpts/
├── 📂 face_encoder/
├── 📂 scheduler/
├── 📂 text_encoder/
├── 📂 tokenizer/
├── 📂 transformer/
├── 📂 vae/
├── 📄 configuration.json
└── 📄 model_index.json
```

## 🏋️ Training

```bash
# For single rank
bash train_single_rank.sh

# For multi rank
bash train_multi_rank.sh
```

## 🏄️ Inference

```bash
python inference.py --img_file_path assets/example_images/1.png --json_file_path assets/example_images/1.json
```

## BibTeX

If you find our work useful in your research, please consider citing our paper:

```bibtex
@article{zhang2025proteus,
  title={Proteus-ID: ID-Consistent and Motion-Coherent Video Customization},
  author={Zhang, Guiyu and Shi, Chen and Jiang, Zijian and Xiang, Xunzhi and Qian, Jingjing and Shi, Shaoshuai and Jiang, Li},
  journal={arXiv preprint arXiv:2506.23729},
  year={2025}
}
```

## Acknowledgement

Thanks to these excellent open-source works and models: [CogVideoX](https://github.com/THUDM/CogVideo); [ConsisID](https://github.com/PKU-YuanGroup/ConsisID); [diffusers](https://github.com/huggingface/diffusers).
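
The checkpoint layout listed under Download Model can be checked before training or inference. The snippet below is a minimal sketch, not part of the repository: the function name `missing_ckpt_entries` is hypothetical, and the `ckpts` path is an assumption about where the download scripts place the weights relative to the current directory.

```python
from pathlib import Path

# Entries taken from the directory listing in the Download Model section.
EXPECTED_DIRS = [
    "face_encoder",
    "scheduler",
    "text_encoder",
    "tokenizer",
    "transformer",
    "vae",
]
EXPECTED_FILES = ["configuration.json", "model_index.json"]


def missing_ckpt_entries(ckpt_root):
    """Return the expected entries that are absent under ckpt_root (hypothetical helper)."""
    root = Path(ckpt_root)
    missing = [f"{name}/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


if __name__ == "__main__":
    # "ckpts" is an assumed location; adjust it to wherever the weights were downloaded.
    missing = missing_ckpt_entries("ckpts")
    if missing:
        print("Missing from ckpts/:", ", ".join(missing))
    else:
        print("ckpts/ layout looks complete.")
```

If the script reports missing entries, rerunning the download scripts in `util` is the likely fix.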