MambaVision
MambaVision: A Hybrid Mamba-Transformer Vision Backbone. Includes both 1K and 21K pretrained models.
```python
# Load model directly
from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained(
    "nvidia/MambaVision-B-1K", trust_remote_code=True, dtype="auto"
)
```

This repository contains the data for the paper *PAVE: Patching and Adapting Video Large Language Models*.
Code: https://github.com/dragonlzm/PAVE
Paper: https://arxiv.org/abs/2503.19794
BibTeX:
```bibtex
@misc{liu2025pavepatchingadaptingvideo,
  title={PAVE: Patching and Adapting Video Large Language Models},
  author={Zhuoming Liu and Yiquan Li and Khoi Duc Nguyen and Yiwu Zhong and Yin Li},
  year={2025},
  eprint={2503.19794},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2503.19794},
}
```
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-classification", model="nvidia/MambaVision-B-1K", trust_remote_code=True)
pipe("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png")
```
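For lower-level control than the pipeline, the raw logits returned by `AutoModelForImageClassification` can be turned into ranked labels manually. A minimal sketch of that post-processing step, using a toy logits tensor and a toy label map as stand-ins (in practice these would be `model(pixel_values).logits` and `model.config.id2label`):

```python
import torch

# Toy stand-ins for illustration only:
# in practice, logits = model(pixel_values).logits
# and id2label = model.config.id2label.
logits = torch.tensor([[0.1, 2.5, -1.0, 0.7]])
id2label = {0: "cat", 1: "parrot", 2: "dog", 3: "fish"}

probs = torch.softmax(logits, dim=-1)        # logits -> class probabilities
top_probs, top_ids = probs.topk(2, dim=-1)   # keep the 2 highest-scoring classes

for p, i in zip(top_probs[0].tolist(), top_ids[0].tolist()):
    print(f"{id2label[i]}: {p:.3f}")
```

This is essentially what the `image-classification` pipeline does internally after the forward pass, minus the image preprocessing.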