BabyVision Collection: State-of-the-art MLLMs achieve PhD-level language reasoning but struggle with visual tasks that 3-year-olds solve effortlessly. • 2 items • Updated 16 days ago • 4
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published Dec 22, 2025 • 64
Exploring MLLM-Diffusion Information Transfer with MetaCanvas Paper • 2512.11464 • Published Dec 12, 2025 • 13
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image Paper • 2511.13648 • Published Nov 17, 2025 • 53
Simulating the Visual World with Artificial Intelligence: A Roadmap Paper • 2511.08585 • Published Nov 11, 2025 • 30
The Quest for Generalizable Motion Generation: Data, Model, and Evaluation Paper • 2510.26794 • Published Oct 30, 2025 • 27
Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark Paper • 2510.13759 • Published Oct 15, 2025 • 11
VChain: Chain-of-Visual-Thought for Reasoning in Video Generation Paper • 2510.05094 • Published Oct 6, 2025 • 38
Stencil: Subject-Driven Generation with Context Guidance Paper • 2509.17120 • Published Sep 21, 2025 • 6
CineScale: Free Lunch in High-Resolution Cinematic Visual Generation Paper • 2508.15774 • Published Aug 21, 2025 • 20
Cut2Next: Generating Next Shot via In-Context Tuning Paper • 2508.08244 • Published Aug 11, 2025 • 13
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models Paper • 2506.21356 • Published Jun 26, 2025 • 22
Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning Paper • 2506.13654 • Published Jun 16, 2025 • 43
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness Paper • 2503.21755 • Published Mar 27, 2025 • 33
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models Paper • 2503.18886 • Published Mar 24, 2025 • 24
RepVideo: Rethinking Cross-Layer Representation for Video Generation Paper • 2501.08994 • Published Jan 15, 2025 • 15
Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models Paper • 2412.09645 • Published Dec 10, 2024 • 36
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models Paper • 2411.13503 • Published Nov 20, 2024 • 34