
Mutian Xu

Minoday

https://mutianxu.github.io/

mutianxu

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 7 days ago

S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence

Paper • 2606.20515 • Published 8 days ago • 39

upvoted a paper 8 days ago

Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion

Paper • 2606.15236 • Published 10 days ago • 21

upvoted a paper 18 days ago

Direct 3D-Aware Object Insertion via Decomposed Visual Proxies

Paper • 2606.06601 • Published 22 days ago • 26

upvoted a paper 29 days ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published about 1 month ago • 75

upvoted a paper 30 days ago

SpatialBench: Is Your Spatial Foundation Model an All-Round Player?

Paper • 2605.27367 • Published May 26 • 72

upvoted 2 papers about 1 month ago

PhysX-Omni: Unified Simulation-Ready Physical 3D Generation for Rigid, Deformable, and Articulated Objects

Paper • 2605.21572 • Published May 20 • 55

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published May 12 • 194

liked a Space about 2 months ago

WiLoR

🚀

Reconstruct 3D hands from images

updated a model 2 months ago

Minoday/Kinema4D

Image-to-Video • Updated Apr 17 • 11 • 4

upvoted an article 2 months ago

NEO-unify: Building Native Multimodal Unified Models End to End

Article • sensenova • Mar 5 • 167

liked a model 3 months ago

Minoday/Kinema4D

Image-to-Video • Updated Apr 17 • 11 • 4

updated a dataset 3 months ago

Minoday/Robo4D-200k

Viewer • Updated Apr 9 • 10 • 1.31k • 6

upvoted a paper 3 months ago

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Paper • 2604.05015 • Published Apr 6 • 237

liked a dataset 3 months ago

Minoday/Robo4D-200k

Viewer • Updated Apr 9 • 10 • 1.31k • 6

published a model 3 months ago

Minoday/Kinema4D

Image-to-Video • Updated Apr 17 • 11 • 4

upvoted 3 papers 3 months ago

MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction

Paper • 2603.19231 • Published Mar 19 • 37

Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer

Paper • 2603.19227 • Published Mar 19 • 42

VLANeXt: Recipes for Building Strong VLA Models

Paper • 2602.18532 • Published Feb 20 • 52

authored a paper 3 months ago

Kinema4D: Kinematic 4D World Modeling for Spatiotemporal Embodied Simulation

Paper • 2603.16669 • Published Mar 17 • 70

upvoted a paper 3 months ago

Kinema4D: Kinematic 4D World Modeling for Spatiotemporal Embodied Simulation

Paper • 2603.16669 • Published Mar 17 • 70
