MambaVision
MambaVision: A Hybrid Mamba-Transformer Vision Backbone. Includes both 1K and 21K pretrained models.
```python
# Load model directly
from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained(
    "nvidia/MambaVision-B-1K", trust_remote_code=True, dtype="auto"
)
```

This repository contains the data for the paper *PAVE: Patching and Adapting Video Large Language Models*.
Code: https://github.com/dragonlzm/PAVE
Paper: https://arxiv.org/abs/2503.19794
BibTeX:
```bibtex
@misc{liu2025pavepatchingadaptingvideo,
  title={PAVE: Patching and Adapting Video Large Language Models},
  author={Zhuoming Liu and Yiquan Li and Khoi Duc Nguyen and Yiwu Zhong and Yin Li},
  year={2025},
  eprint={2503.19794},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2503.19794},
}
```
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-classification", model="nvidia/MambaVision-B-1K", trust_remote_code=True)
pipe("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png")
```
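For lower-level control than the pipeline, the raw logits returned by `AutoModelForImageClassification` can be turned into ranked labels manually. A minimal sketch of that post-processing step, using a toy logits tensor and a toy label map as stand-ins (in practice these would be `model(pixel_values).logits` and `model.config.id2label`):

```python
import torch

# Toy stand-ins for illustration only:
# in practice, logits = model(pixel_values).logits
# and id2label = model.config.id2label.
logits = torch.tensor([[0.1, 2.5, -1.0, 0.7]])
id2label = {0: "cat", 1: "parrot", 2: "dog", 3: "fish"}

probs = torch.softmax(logits, dim=-1)        # logits -> class probabilities
top_probs, top_ids = probs.topk(2, dim=-1)   # keep the 2 highest-scoring classes

for p, i in zip(top_probs[0].tolist(), top_ids[0].tolist()):
    print(f"{id2label[i]}: {p:.3f}")
```

This is essentially what the `image-classification` pipeline does internally after the forward pass, minus the image preprocessing.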