Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models
Paper • 2601.14004 • Published • 46
Artificial Intelligence
Multi-Faceted Attack: Exposing Cross-Model Vulnerabilities in Defense-Equipped Vision-Language Models
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark