fix task tag (#1)
Opened by merve (HF Staff)
README.md CHANGED
@@ -9,7 +9,7 @@ language:
 metrics:
 - accuracy
 library_name: transformers
-pipeline_tag:
+pipeline_tag: video-text-to-text
 tags:
 - multimodal large language model
 - large video-language model
@@ -106,5 +106,4 @@ If you find VideoLLaMA useful for your research and applications, please cite us
 year = {2023},
 url = {https://arxiv.org/abs/2306.02858}
 }
-```
-
+```
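The first hunk above only rewrites the `pipeline_tag` line in the model card's YAML metadata. As a rough illustration of that kind of change, here is a minimal sketch using only the Python standard library. The helper name `set_pipeline_tag` is hypothetical and not part of any library, and the sketch assumes the metadata sits in a front-matter block delimited by `---` lines at the top of README.md, which the diff itself does not show.

```python
import re


def set_pipeline_tag(readme_text: str, tag: str) -> str:
    """Return readme_text with `pipeline_tag` set to `tag` in the front matter.

    Hypothetical helper for illustration; assumes the README starts with a
    YAML block delimited by `---` lines.
    """
    # Capture everything between the opening and closing `---` lines.
    match = re.match(r"\A---\n(.*?\n)---\n", readme_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no YAML front matter found")

    front = match.group(1)
    new_line = f"pipeline_tag: {tag}\n"
    pattern = re.compile(r"^pipeline_tag:.*\n", flags=re.MULTILINE)

    if pattern.search(front):
        # Replace the existing (possibly empty) pipeline_tag entry.
        front = pattern.sub(lambda _: new_line, front, count=1)
    else:
        # No entry yet: append one at the end of the front matter.
        front += new_line

    # Reassemble the front matter and keep the rest of the README untouched.
    return f"---\n{front}---\n" + readme_text[match.end():]
```

Calling `set_pipeline_tag(text, "video-text-to-text")` on the README contents would yield the `+pipeline_tag: video-text-to-text` line from the diff. The second hunk, which touches the closing code fence, is outside the scope of this sketch.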