RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models Paper • 2603.21341 • Published 4 days ago • 23
SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning Paper • 2603.22057 • Published 3 days ago • 41
Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues Paper • 2506.00958 • Published Jun 1, 2025 • 20