InfiniteVGGT: Visual Geometry Grounded Transformer for Endless Streams Paper • 2601.02281 • Published 23 days ago • 33
PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence Paper • 2512.16793 • Published Dec 18, 2025 • 74
JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization Paper • 2511.23002 • Published Nov 28, 2025 • 26
Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation Paper • 2512.08186 • Published Dec 9, 2025 • 22
Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models Paper • 2511.00503 • Published Nov 1, 2025 • 2
DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling Paper • 2512.03000 • Published Dec 2, 2025 • 37
DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling Paper • 2512.03000 • Published Dec 2, 2025 • 37
DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling Paper • 2512.03000 • Published Dec 2, 2025 • 37 • 3
DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling Paper • 2512.03000 • Published Dec 2, 2025 • 37 • 3