Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models Paper • 2511.17487 • Published Nov 2025 • 9
Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration Paper • 2412.13180 • Published Dec 17, 2024 • 13
Agentic Systems in Radiology: Design, Applications, Evaluation, and Challenges Paper • 2510.09404 • Published Oct 10
Blackbox Model Provenance via Palimpsestic Membership Inference Paper • 2510.19796 • Published Oct 22 • 3
Thinking While Listening: Simple Test Time Scaling For Audio Classification Paper • 2509.19676 • Published Sep 24 • 4
AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders Paper • 2501.17148 • Published Jan 28 • 1
Merlin: A Vision Language Foundation Model for 3D Computed Tomography Paper • 2406.06512 • Published Jun 10, 2024
MedVAE: Efficient Automated Interpretation of Medical Images with Large-Scale Generalizable Autoencoders Paper • 2502.14753 • Published Feb 20 • 1
SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning Paper • 2506.21355 • Published Jun 26 • 10
Expert-level validation of AI-generated medical text with scalable language models Paper • 2507.03152 • Published Jul 3
Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors Paper • 2505.11770 • Published May 17 • 2
Improving Performance, Robustness, and Fairness of Radiographic AI Models with Finely-Controllable Synthetic Data Paper • 2508.16783 • Published Aug 22 • 1
BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation Paper • 2403.09227 • Published Mar 14, 2024 • 1
BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities Paper • 2503.05652 • Published Mar 7 • 11
Re-thinking Temporal Search for Long-Form Video Understanding Paper • 2504.02259 • Published Apr 3 • 1