Collections of ICLR 2026 paper: "OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models"
Zekun Qi
qizekun
AI & ML interests
Embodied Intelligence, Large Language Model, 3D Computer Vision
Recent Activity
authored a paper about 9 hours ago
ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?
authored a paper 15 days ago
LIMMT: Less is More for Motion Tracking