Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation Paper • 2606.26907 • Published 5 days ago • 46
DomainShuttle: Freeform Open Domain Subject-driven Text-to-video Generation Paper • 2606.26058 • Published 6 days ago • 65
Echo-Infinity: Learning Evolving Memory for Real-Time Infinite Video Generation Paper • 2606.04527 • Published 27 days ago • 28
GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration Paper • 2605.31039 • Published May 29 • 46
SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control Paper • 2605.27891 • Published May 27 • 8
minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models Paper • 2605.30263 • Published May 28 • 59
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published May 20 • 111
LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs Paper • 2605.17260 • Published May 17 • 25
KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration Paper • 2605.14278 • Published May 14 • 37
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published May 18 • 116
DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models Paper • 2605.15055 • Published May 14 • 19
Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation Paper • 2605.15141 • Published May 14 • 96
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation Paper • 2605.13724 • Published May 13 • 105
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published May 12 • 194