InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models Paper • 2410.22770 • Published Oct 30, 2024
Doxing via the Lens: Revealing Privacy Leakage in Image Geolocation for Agentic Multi-Modal Large Reasoning Model Paper • 2504.19373 • Published Apr 27, 2025
Code Agent can be an End-to-end System Hacker: Benchmarking Real-world Threats of Computer-use Agent Paper • 2510.06607 • Published Oct 8, 2025
AutoDAN-Reasoning: Enhancing Strategies Exploration based Jailbreak Attacks with Test-Time Scaling Paper • 2510.05379 • Published Oct 6, 2025
ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathologically Long Reasoning in Large Reasoning Models Paper • 2602.00154 • Published 24 days ago
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models Paper • 2310.04451 • Published Oct 3, 2023
JailBreakV-28K: A Benchmark for Assessing the Robustness of MultiModal Large Language Models against Jailbreak Attacks Paper • 2404.03027 • Published Apr 3, 2024