Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs Paper • 2603.16932 • Published 24 days ago • 85
Alignment Makes Language Models Normative, Not Descriptive Paper • 2603.17218 • Published 21 days ago • 46
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs Paper • 2603.09906 • Published 28 days ago • 75
VidVec: Unlocking Video MLLM Embeddings for Video-Text Retrieval Paper • 2602.08099 • Published Feb 8 • 124
Fast Autoregressive Video Diffusion and World Models with Temporal Cache Compression and Sparse Attention Paper • 2602.01801 • Published Feb 2 • 28
The Poisoned Apple Effect: Strategic Manipulation of Mediated Markets via Technology Expansion of AI Agents Paper • 2601.11496 • Published Jan 16 • 47
Alterbute: Editing Intrinsic Attributes of Objects in Images Paper • 2601.10714 • Published Jan 15 • 31
DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion Paper • 2510.20766 • Published Oct 23, 2025 • 37
Advancing Speech Understanding in Speech-Aware Language Models with GRPO Paper • 2509.16990 • Published Sep 21, 2025 • 22
Beyond Transcription: Mechanistic Interpretability in ASR Paper • 2508.15882 • Published Aug 21, 2025 • 89
Story2Board: A Training-Free Approach for Expressive Storyboard Generation Paper • 2508.09983 • Published Aug 13, 2025 • 70
A Unifying Scheme for Extractive Content Selection Tasks Paper • 2507.16922 • Published Jul 22, 2025 • 4
Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games Paper • 2506.05309 • Published Jun 5, 2025 • 16