Continuous-Time Distribution Matching for Few-Step Diffusion Distillation Paper • 2605.06376 • Published 4 days ago • 24
Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation Paper • 2605.03849 • Published 6 days ago • 118
Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL Paper • 2604.28123 • Published 10 days ago • 44
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published 12 days ago • 102
Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items Paper • 2604.19748 • Published 20 days ago • 249
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 26 days ago • 118
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 26 days ago • 157
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details Paper • 2604.06870 • Published Apr 8 • 41
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published Apr 8 • 187
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models Paper • 2604.04707 • Published Apr 6 • 203
GEMS: Agent-Native Multimodal Generation with Memory and Skills Paper • 2603.28088 • Published Mar 30 • 85
CutClaw: Agentic Hours-Long Video Editing via Music Synchronization Paper • 2603.29664 • Published Mar 31 • 48
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis Paper • 2603.29620 • Published Mar 31 • 46
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward Paper • 2603.26599 • Published Mar 27 • 65