view article Article Welcome NVIDIA Cosmos 3: The First Open Omni-model for Physical AI Reasoning and Action nvidia • 30 days ago • 84
Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos Paper • 2605.18984 • Published May 18 • 22
Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling Paper • 2605.13062 • Published May 13 • 33
Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization Paper • 2605.10780 • Published May 12 • 33
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published Apr 6 • 236
BrowseComp-V^3: A Visual, Vertical, and Verifiable Benchmark for Multimodal Browsing Agents Paper • 2602.12876 • Published Feb 13 • 14
How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing Paper • 2602.01851 • Published Feb 2 • 16
FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion Paper • 2506.04648 • Published Jun 5, 2025 • 1
OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing Paper • 2509.24900 • Published Sep 29, 2025 • 54
RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark Paper • 2509.24897 • Published Sep 29, 2025 • 46