Why Far Looks Up: Probing Spatial Representation in Vision-Language Models Paper • 2605.30161 • Published 30 days ago • 60
Personalize-then-Store: Benchmarking and Learning Personalized Memory for Long-horizon Agents Paper • 2605.25535 • Published May 25 • 42
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding Paper • 2411.04952 • Published Nov 7, 2024 • 29