CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion Paper • 2512.19535 • Published Dec 22, 2025
Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery Paper • 2407.14499 • Published Jul 19, 2024
Good Teachers Explain: Explanation-Enhanced Knowledge Distillation Paper • 2402.03119 • Published Feb 5, 2024
Better Understanding Differences in Attribution Methods via Systematic Evaluations Paper • 2303.11884 • Published Mar 21, 2023
MoshiVis v0.1 Collection: MoshiVis is a Vision Speech Model built as a perceptually augmented version of Moshi v0.1 for conversing about image inputs • 9 items • Updated Dec 23, 2025