Cog3DMap: Multi-View Vision-Language Reasoning with 3D Cognitive Maps Paper • 2603.23023 • Published 3 days ago • 20
Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model Paper • 2603.05438 • Published 22 days ago • 39
MolmoAct: Action Reasoning Models that can Reason in Space Paper • 2508.07917 • Published Aug 11, 2025 • 44
Affogato: Learning Open-Vocabulary Affordance Grounding with Automated Data Generation at Scale Paper • 2506.12009 • Published Jun 13, 2025 • 2
Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling Paper • 2504.13169 • Published Apr 17, 2025 • 39