Yuxin Chen
Uasonchen
AI & ML interests
None yet
Recent Activity
updated a collection 3 days ago: Video Understanding
authored a paper 3 months ago: Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models
upvoted a paper 3 months ago: Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models
Organizations
Image Generation
Vision Foundation Model
MLLM
- Apriel-1.5-15b-Thinker
  Paper • 2510.01141 • Published • 123
- MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
  Paper • 2509.21268 • Published • 104
- LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model
  Paper • 2509.00676 • Published • 85
- Visual Representation Alignment for Multimodal Large Language Models
  Paper • 2509.07979 • Published • 84
Math Data Synthesis
Agent
Image Editing
Video Generation
- Self-Forcing++: Towards Minute-Scale High-Quality Video Generation
  Paper • 2510.02283 • Published • 98
- Paper2Video: Automatic Video Generation from Scientific Papers
  Paper • 2510.05096 • Published • 120
- LongLive: Real-time Interactive Long Video Generation
  Paper • 2509.22622 • Published • 189
- HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
  Paper • 2509.08519 • Published • 130
Open Math Data for LLM
Video Understanding