How to use AfterJourney/CoMoVi with Diffusers:

```bash
pip install -U diffusers transformers accelerate
```

```python
import torch
from diffusers import DiffusionPipeline

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("AfterJourney/CoMoVi", dtype=torch.bfloat16, device_map="cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
```
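The Diffusers snippet above hard-codes `"cuda"` and notes that `"mps"` is the value to use on Apple devices. A minimal sketch of choosing that string at runtime is shown here; the helper names `pick_device` and `detect_device` are hypothetical and not part of Diffusers or this repository, and the CPU fallback is an assumption rather than a tested configuration for this model.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a device string, preferring CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; fall back to CPU without PyTorch."""
    try:
        import torch
    except ImportError:
        # Assumption: with no PyTorch installed, "cpu" is the only safe answer.
        return "cpu"
    # torch.backends.mps may be missing on very old PyTorch builds.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_available = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device(torch.cuda.is_available(), mps_available)


if __name__ == "__main__":
    print(detect_device())
```

The returned string could then stand in for the hard-coded `"cuda"` in the snippet above. Whether the checkpoint actually runs on a given backend is not something this sketch verifies.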
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.5.1/css/all.min.css" integrity="sha512-DTOQO9RWCH3ppGqcWaEA1BIZOC6xxalwEsw9c2QQeAIftl+Vegovlnee1c9QX4TctnWMn13TZye+giMm8e2LwA==" crossorigin="anonymous" referrerpolicy="no-referrer" />
<h1 align="center">CoMoVi: Co-Generation of 3D Human Motions<br>and Realistic Videos</h1>
<p align="center">
  <a href="https://afterjourney00.github.io/" target="_blank">Chengfeng Zhao</a><sup>1</sup>,
  <a href="https://github.com/Samir1110" target="_blank">Jiazhi Shu</a><sup>2</sup>,
  <a href="https://knoxzhao.github.io/" target="_blank">Yubo Zhao</a><sup>1</sup>,
  <a href="https://scholar.google.com/citations?hl=en&user=nhbSplwAAAAJ" target="_blank">Tianyu Huang</a><sup>3</sup>,
  <a href="https://scholar.google.com/citations?hl=en&user=nhbSplwAAAAJ" target="_blank">Jiahao Lu</a><sup>1</sup>,
  <br>
  <a href="https://scholar.google.com/citations?hl=en&user=nhbSplwAAAAJ" target="_blank">Zekai Gu</a><sup>1</sup>,
  <a href="https://scholar.google.com/citations?hl=en&user=nhbSplwAAAAJ" target="_blank">Chengwei Ren</a><sup>1</sup>,
  <a href="https://frank-zy-dou.github.io/" target="_blank">Zhiyang Dou</a><sup>4</sup>,
  <a href="https://chingswy.github.io/" target="_blank">Qing Shuai</a><sup>5</sup>,
  <a href="https://liuyuan-pal.github.io/" target="_blank">Yuan Liu</a><sup>1 <i class="far fa-envelope"></i></sup>
</p>
<p align="center">
  <sup>1</sup>HKUST
  <sup>2</sup>SCUT
  <sup>3</sup>CUHK
  <sup>4</sup>MIT
  <sup>5</sup>ZJU
  <br>
  <i><sup><i class="far fa-envelope"></i></sup> Corresponding author</i>
</p>
<p align="center">
  <a href="https://igl-hkust.github.io/CoMoVi/"><img src='https://img.shields.io/badge/arXiv-Paper-red?logo=arxiv&logoColor=white' alt='arXiv'></a>
  <a href='https://igl-hkust.github.io/CoMoVi/'><img src='https://img.shields.io/badge/Project_Page-Website-green?logo=googlechrome&logoColor=white' alt='Project Page'></a>
  <a href='https://huggingface.co/datasets/AfterJourney/CoMoVi-50K'><img src='https://img.shields.io/badge/Hugging%20Face-Dataset-yellow?logo=huggingface' alt='Dataset'></a>
</p>
<div align="center">
  <img width="900px" src="./assets/teaser.png"/>
</div>

## <i class="fa-brands fa-github"></i> [GitHub](https://github.com/IGL-HKUST/CoMoVi)
## Acknowledgments

We thank the following projects, which we referred to and benefited from:

- [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun): the video generation model training framework;
- [CameraHMR](https://github.com/pixelite1201/CameraHMR/): the excellent SMPL estimation used for pseudo labels;
- [Champ](https://github.com/fudan-generative-vision/champ): the data processing pipeline.
## Citation

```bibtex
@article{zhao2026comovi,
  title={CoMoVi: Co-Generation of 3D Human Motions and Realistic Videos},
  author={Zhao, Chengfeng and Shu, Jiazhi and Zhao, Yubo and Huang, Tianyu and Lu, Jiahao and Gu, Zekai and Ren, Chengwei and Dou, Zhiyang and Shuai, Qing and Liu, Yuan},
  journal={arXiv preprint arXiv:2601.10632},
  year={2026}
}
```