VLM - a RzZ Collection

updated Dec 27, 2025

UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces

Paper • 2312.15715 • Published Dec 25, 2023 • 20

Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Paper • 2505.23747 • Published May 29, 2025 • 69

VideoPrism: A Foundational Visual Encoder for Video Understanding

Paper • 2402.13217 • Published Feb 20, 2024 • 41

Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10, 2025 • 161

4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation

Paper • 2512.17012 • Published Dec 18, 2025 • 49

Latent Implicit Visual Reasoning

Paper • 2512.21218 • Published Dec 24, 2025 • 70