Rethinking RAG in Long Videos: What to Retrieve and How to Use It? Paper • 2606.13141 • Published 14 days ago • 36
STRIDE: When to Speak Meets Sequence Denoising for Streaming Video Understanding Paper • 2603.27593 • Published Mar 29 • 12
MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models Paper • 2601.21181 • Published Jan 29 • 10
MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models Paper • 2601.21181 • Published Jan 29 • 10
MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models Paper • 2601.21181 • Published Jan 29 • 10
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models Paper • 2506.15681 • Published Jun 18, 2025 • 43