metadata
license: apache-2.0
language:
- en
base_model:
- tencent/HunyuanVideo
- Qwen/Qwen2.5-VL-7B-Instruct
pipeline_tag: any-to-any
tags:
- video
UniVideo: Unified Understanding, Generation, and Editing for Videos
Cong Wei*,1,2  Quande Liu†,2  Zixuan Ye2  Qiulin Wang2  Xintao Wang2
Pengfei Wan2  Kun Gai2  Wenhu Chen†,1
1University of Waterloo
2Kling Team, Kuaishou Technology
*Work done during an internship at Kling Team, Kuaishou Technology
†Corresponding author
News
- [2026-01-07]: Released Code and Model.
- [2025-10-09]: Released the arXiv preprint and the project page.
How to use
- Please refer to the GitHub repository for usage instructions.
Acknowledgement
- HunyuanVideo: the base video generation model used in this work. Thanks to the authors for their excellent contribution.
- Qwen2.5-VL: the base vision-language model (VLM) used in this work. Thanks to the authors for their excellent contribution.
- MetaQueries: we adopt their query implementation. Thanks to the authors for their excellent contribution.
Citation
If you find UniVideo useful for your research and applications, please cite using this BibTeX:
@article{wei2025univideo,
title={Univideo: Unified understanding, generation, and editing for videos},
author={Wei, Cong and Liu, Quande and Ye, Zixuan and Wang, Qiulin and Wang, Xintao and Wan, Pengfei and Gai, Kun and Chen, Wenhu},
journal={arXiv preprint arXiv:2510.08377},
year={2025}
}