Lance: Unified Multimodal Modeling by Multi-Task Synergy Paper • 2605.18678 • Published 3 days ago • 67
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published 3 days ago • 102
WorldMark: A Unified Benchmark Suite for Interactive Video World Models Paper • 2604.21686 • Published 28 days ago • 36
CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation Paper • 2604.19636 • Published about 1 month ago • 87
HDR Video Generation via Latent Alignment with Logarithmic Encoding Paper • 2604.11788 • Published Apr 13 • 10
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published Apr 15 • 120
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published Apr 15 • 162
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published Apr 13 • 72
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory Paper • 2604.08995 • Published Apr 10 • 50
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Paper • 2603.25730 • Published Mar 26 • 53
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published Mar 26 • 155
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published Mar 23 • 125
Versatile Editing of Video Content, Actions, and Dynamics without Training Paper • 2603.17989 • Published Mar 18 • 17
SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing Paper • 2603.19228 • Published Mar 19 • 68
WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation Paper • 2603.16871 • Published Mar 17 • 61
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published Mar 16 • 153
OmniForcing: Unleashing Real-time Joint Audio-Visual Generation Paper • 2603.11647 • Published Mar 12 • 31
ShotVerse: Advancing Cinematic Camera Control for Text-Driven Multi-Shot Video Creation Paper • 2603.11421 • Published Mar 12 • 34