Building a Precise Video Language with Human-AI Oversight Paper • 2604.21718 • Published 16 days ago • 15
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers Paper • 2412.00142 • Published Nov 28, 2024 • 5