False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize • Paper • 2509.03888 • Published Sep 4, 2025