InfiniteVGGT: Visual Geometry Grounded Transformer for Endless Streams Paper • 2601.02281 • Published 7 days ago • 30
Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1 Paper • 2510.19600 • Published Oct 22, 2025 • 68
Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model Paper • 2510.12276 • Published Oct 14, 2025 • 146
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs Paper • 2507.11097 • Published Jul 15, 2025 • 64
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion Paper • 2507.06165 • Published Jul 8, 2025 • 59
EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models Paper • 2506.10100 • Published Jun 11, 2025 • 9
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces Paper • 2506.00123 • Published May 30, 2025 • 35
Shifting AI Efficiency From Model-Centric to Data-Centric Compression Paper • 2505.19147 • Published May 25, 2025 • 144