MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation Paper • 2512.18181 • Published 20 days ago • 85
UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors Paper • 2605.00658 • Published 26 days ago • 84
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation Paper • 2604.28196 • Published 27 days ago • 71
Flow-OPD: On-Policy Distillation for Flow Matching Models Paper • 2605.08063 • Published 19 days ago • 97
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper • 2604.28185 • Published 27 days ago • 90
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published 24 days ago • 114
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 21 days ago • 100
Stream-T1: Test-Time Scaling for Streaming Video Generation Paper • 2605.04461 • Published 21 days ago • 103
ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration Paper • 2605.03042 • Published 23 days ago • 120
Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation Paper • 2605.03849 • Published 22 days ago • 124
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 24 days ago • 163
Heterogeneous Scientific Foundation Model Collaboration Paper • 2604.27351 • Published 27 days ago • 218
MolmoAct2: Action Reasoning Models for Real-world Deployment Paper • 2605.02881 • Published 23 days ago • 341
Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 20 days ago • 229
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published Dec 19, 2025 • 99
Infinite-Homography as Robust Conditioning for Camera-Controlled Video Generation Paper • 2512.17040 • Published Dec 18, 2025 • 29
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper • 2512.08269 • Published Dec 9, 2025 • 124
PHUMA: Physically-Grounded Humanoid Locomotion Dataset Paper • 2510.26236 • Published Oct 30, 2025 • 30
ACG: Action Coherence Guidance for Flow-based VLA models Paper • 2510.22201 • Published Oct 25, 2025 • 37