SketchVLM: Vision language models can annotate images to explain thoughts and guide users • Paper • 2604.22875 • Published 16 days ago • 34
Improving Zero-Shot Object-Level Change Detection by Incorporating Visual Correspondence • Paper • 2501.05555 • Published Jan 9, 2025 • 1
TAB: Transformer Attention Bottlenecks enable User Intervention and Debugging in Vision-Language Models • Paper • 2412.18675 • Published Dec 24, 2024 • 1
VideoGameBunny: Towards vision assistants for video games • Paper • 2407.15295 • Published Jul 21, 2024 • 23