Orr Zohar PRO

orrzohar

https://ai.stanford.edu/~orrzohar/

AI & ML interests

Large Multi-Modal Models, Foundation Models, Video Understanding

authored a paper about 1 year ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 207

authored a paper over 1 year ago

Learnings from Scaling Visual Tokenizers for Reconstruction and Generation

Paper • 2501.09755 • Published Jan 16, 2025 • 35

authored 3 papers almost 2 years ago

Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision

Paper • 2407.06189 • Published Jul 8, 2024 • 27

Open World Object Detection in the Era of Foundation Models

Paper • 2312.05745 • Published Dec 10, 2023 • 1

PROB: Probabilistic Objectness for Open World Object Detection

Paper • 2212.01424 • Published Dec 2, 2022

authored a paper about 2 years ago

VideoAgent: Long-form Video Understanding with Large Language Model as Agent

Paper • 2403.10517 • Published Mar 15, 2024 • 37

authored a paper almost 3 years ago

LOVM: Language-Only Vision Model Selection

Paper • 2306.08893 • Published Jun 15, 2023 • 7