Perceptio: Perception Enhanced Vision Language Models via Spatial Token Generation Paper • 2603.18795 • Published 6 days ago • 10
VIDEOP2R: Video Understanding from Perception to Reasoning Paper • 2511.11113 • Published Nov 14, 2025 • 112