Wenxuan Wang

wx9Songs

wxuan-w

AI & ML interests

Music & MLLM

Organizations

None yet

upvoted a collection 3 months ago

MOSS-Audio

An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 9 items • Updated 23 days ago • 66

upvoted 2 papers 8 months ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 242

RoboOmni: Proactive Robot Manipulation in Omni-modal Context

Paper • 2510.23763 • Published Oct 27, 2025 • 62