metadata
base_model:
- microsoft/Phi-3-vision-128k-instruct
license: mit
pipeline_tag: video-text-to-text
Model Card for VideoChat-Online
This model card describes VideoChat-Online, the model introduced in 'Online Video Understanding: OVBench and VideoChat-Online'.
Model Details
🛠Usage
Check the Demo.
📃Model Sources
- Repository: VideoChat-Online
- Paper: arXiv:2501.00584
✏️Citation
If you find this work useful for your research, please consider citing VideoChat-Online. Your acknowledgement helps us continue contributing resources to the research community.
@article{huang2024online,
  title={Online Video Understanding: A Comprehensive Benchmark and Memory-Augmented Method},
  author={Huang, Zhenpeng and Li, Xinhao and Li, Jiaqi and Wang, Jing and Zeng, Xiangyu and Liang, Cheng and Wu, Tao and Chen, Xi and Li, Liang and Wang, Limin},
  journal={arXiv preprint arXiv:2501.00584},
  year={2024}
}