Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper • 2604.28185 • Published 7 days ago • 86
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published Apr 6 • 235
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? Paper • 2603.03241 • Published Mar 3 • 87
Enhancing Spatial Understanding in Image Generation via Reward Modeling Paper • 2602.24233 • Published Feb 27 • 60
UniT: Unified Multimodal Chain-of-Thought Test-time Scaling Paper • 2602.12279 • Published Feb 12 • 20