VideoLLaMA2 - a DAMO-NLP-SG Collection

Updated Sep 2, 2025

An optimized VideoLLaMA with improved spatial-temporal modeling and better audio understanding

VideoLLaMA2 🎥

Running on Zero • Media understanding • 160
VideoLLaMA2 AV 🚀

Configuration error • VideoLLaMA2-AV • 17
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Paper • 2406.07476 • Published Jun 11, 2024 • 37
DAMO-NLP-SG/VideoLLaMA2.1-7B-16F

Video-Text-to-Text • 8B • Updated Sep 4, 2025 • 80 • 9
DAMO-NLP-SG/VideoLLaMA2.1-7B-AV

Visual Question Answering • 9B • Updated Oct 25, 2024 • 1.82k • 16
DAMO-NLP-SG/VideoLLaMA2-7B

Visual Question Answering • 8B • Updated Aug 13, 2024 • 178 • 40
DAMO-NLP-SG/VideoLLaMA2-7B-16F

Visual Question Answering • 8B • Updated Aug 13, 2024 • 707 • 14
DAMO-NLP-SG/VideoLLaMA2-72B

Visual Question Answering • 75B • Updated Aug 14, 2024 • 25 • 10
DAMO-NLP-SG/VideoLLaMA2-72B-Base

Visual Question Answering • Updated Aug 13, 2024 • 14 • 1
DAMO-NLP-SG/VideoLLaMA2-7B-Base

Visual Question Answering • Updated Aug 13, 2024 • 21 • 5
DAMO-NLP-SG/VideoLLaMA2-7B-16F-Base

Visual Question Answering • Updated Aug 13, 2024 • 18 • 2
DAMO-NLP-SG/VideoLLaMA2.1-7B-16F-Base

Visual Question Answering • Updated Oct 21, 2024 • 35 • 1
DAMO-NLP-SG/Multi-Source-Video-Captioning

Viewer • Updated Jun 17, 2024 • 1.5k • 82 • 7