M^3Eval: Multi-Modal Memory Evaluation through Cognitively-Grounded Video Tasks • Paper • 2606.05008 • Published 1 day ago • 24
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding • Paper • 2401.09340 • Published Jan 17, 2024 • 21