Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising Paper • 2604.26694 • Published 8 days ago • 6
Context Forcing: Consistent Autoregressive Video Generation with Long Context Paper • 2602.06028 • Published Feb 5 • 36
Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation Paper • 2602.02214 • Published Feb 2 • 24
UltraImage: Rethinking Resolution Extrapolation in Image Diffusion Transformers Paper • 2512.04504 • Published Dec 4, 2025 • 18
UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers Paper • 2511.20123 • Published Nov 25, 2025 • 18
VQ-VA World: Towards High-Quality Visual Question-Visual Answering Paper • 2511.20573 • Published Nov 25, 2025 • 7
LightBagel: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation Paper • 2510.22946 • Published Oct 27, 2025 • 18
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding Paper • 2506.01853 • Published Jun 2, 2025 • 32