SenseNova-U1 Collection SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 10 items • Updated 15 days ago • 74
ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing? Paper • 2606.19531 • Published 11 days ago • 20
Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation Paper • 2510.09320 • Published Oct 10, 2025 • 3
VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model Paper • 2602.10098 • Published Feb 10 • 22
ReWorld: Multi-Dimensional Reward Modeling for Embodied World Models Paper • 2601.12428 • Published Jan 18
Disentangled Robot Learning via Separate Forward and Inverse Dynamics Pretraining Paper • 2604.16391 • Published Mar 27 • 4
Learning Athletic Humanoid Tennis Skills from Imperfect Human Motion Data Paper • 2603.12686 • Published Mar 13
Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking Paper • 2606.03985 • Published 26 days ago • 41
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published about 1 month ago • 146
Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation Paper • 2602.16705 • Published Feb 18 • 26