MLLM-CL: Continual Learning for Multimodal Large Language Models Paper • 2506.05453 • Published Jun 5, 2025 • 3
Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection Paper • 2409.04796 • Published Sep 7, 2024 • 1
ModalPrompt: Towards Efficient Multimodal Continual Instruction Tuning with Dual-Modality Guided Prompt Paper • 2410.05849 • Published Oct 8, 2024 • 1
HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model Paper • 2503.12941 • Published Mar 17, 2025 • 1
MambaIC: State Space Models for High-Performance Learned Image Compression Paper • 2503.12461 • Published Mar 16, 2025 • 2
Urban Socio-Semantic Segmentation with Vision-Language Reasoning Paper • 2601.10477 • Published 25 days ago • 155
Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality Paper • 2505.18227 • Published May 23, 2025 • 15
VTCBench: Can Vision-Language Models Understand Long Context with Vision-Text Compression? Paper • 2512.15649 • Published Dec 17, 2025 • 7
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models Paper • 2512.07783 • Published Dec 8, 2025 • 38
DESIRE: Dynamic Knowledge Consolidation for Rehearsal-Free Continual Learning Paper • 2411.19154 • Published Nov 28, 2024
EventVAD: Training-Free Event-Aware Video Anomaly Detection Paper • 2504.13092 • Published Apr 17, 2025
PPT: Token Pruning and Pooling for Efficient Vision Transformers Paper • 2310.01812 • Published Oct 3, 2023
Towards Efficient and General-Purpose Few-Shot Misclassification Detection for Vision-Language Models Paper • 2503.20492 • Published Mar 26, 2025