metadata
license: mit
base_model:
  - DAMO-NLP-SG/VideoRefer-7B

UFVideo-7B

This repository provides the complete code and datasets for UFVideo, a Video LLM that flexibly unifies general question answering, video object referring, video segmentation, and temporal video grounding to achieve multi-grained video understanding.

📥 Installation

Environment

First, clone the repository and navigate to the project folder.

git clone https://github.com/Heven-Pan/UFVideo
cd UFVideo

Then, install the required packages.

conda create -n UFVideo python=3.10.14
conda activate UFVideo

# our cuda version is 'cu124'
pip install -r requirements.txt
# other versions have not been verified
pip install flash-attn --no-build-isolation
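
After the steps above, a quick sanity check can confirm that the environment looks as expected. The following is a minimal sketch, not part of the UFVideo codebase: the helper `check_environment` is hypothetical, and it assumes that `requirements.txt` installs `torch` (not confirmed here). `flash_attn` is the import name of the `flash-attn` package. The check only looks up whether the modules can be found, so it does not import them or touch the GPU.

```python
import importlib.util
import sys

# Assumption: "torch" is pulled in by requirements.txt (not confirmed here).
# "flash_attn" is the import name of the flash-attn package.
REQUIRED_MODULES = ("torch", "flash_attn")


def check_environment(modules=REQUIRED_MODULES, python=(3, 10)):
    """Report whether the Python major.minor version matches and which modules are missing."""
    # find_spec returns None for a top-level module that cannot be located.
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    return {
        "python_ok": sys.version_info[:2] == python,
        "missing": missing,
    }


if __name__ == "__main__":
    report = check_environment()
    print(f"Python {sys.version.split()[0]} (3.10 expected): {report['python_ok']}")
    print("Missing modules:", ", ".join(report["missing"]) or "none")
```

Run the script inside the activated `UFVideo` conda environment. If a module is listed as missing, re-run the corresponding `pip install` command.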

For evaluation and training, please refer to the UFVideo repository.

📑 Citation

Please cite our paper if you find this project helpful.

@article{pan2025ufvideo,
  title={UFVideo: Towards Unified Fine-Grained Video Cooperative Understanding with Large Language Models},
  author={Pan, Hewen and Wei, Cong and Liang, Dashuang and Huang, Zepeng and Gao, Pengfei and Zhou, Ziqi and Xue, Lulu and Yan, Pengfei and Wei, Xiaoming and Li, Minghui and others},
  journal={arXiv preprint arXiv:2512.11336},
  year={2025}
}