Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning Paper • 2510.11027 • Published Oct 13 • 21
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers Paper • 2508.21148 • Published Aug 28 • 140
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces Paper • 2506.00123 • Published May 30 • 35
EnerVerse-AC: Envisioning Embodied Environments with Action Condition Paper • 2505.09723 • Published May 14 • 23
EWMBench: Evaluating Scene, Motion, and Semantic Quality in Embodied World Models Paper • 2505.09694 • Published May 14 • 19
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy Paper • 2503.19757 • Published Mar 25 • 51
Adversarial Data Collection: Human-Collaborative Perturbations for Efficient and Robust Robotic Imitation Learning Paper • 2503.11646 • Published Mar 14 • 35