Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos Paper • 2602.23543 • Published 14 days ago • 4
MolmoAct: Action Reasoning Models that can Reason in Space Paper • 2508.07917 • Published Aug 11, 2025 • 44
MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation Paper • 2602.11337 • Published 29 days ago • 6
VLS: Steering Pretrained Robot Policies via Vision-Language Models Paper • 2602.03973 • Published Feb 3 • 22
Patient-Similarity Cohort Reasoning in Clinical Text-to-SQL Paper • 2601.09876 • Published Jan 14 • 7
SuperBPE Collection SuperBPE tokenizers and models trained with them • 8 items • Updated 11 days ago • 17
Structure From Tracking: Distilling Structure-Preserving Motion for Video Generation Paper • 2512.11792 • Published Dec 12, 2025 • 10
MVTamperBench: Evaluating Robustness of Vision-Language Models Paper • 2412.19794 • Published Dec 27, 2024 • 4