Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails • Paper • 2504.11168 • Published Apr 15, 2025 • 2
qualifire/prompt-injection-jailbreak-sentinel-v2 • Text Classification • 0.6B • Updated Sep 28, 2025 • 2.18k • 27
nvidia/Aegis-AI-Content-Safety-LlamaGuard-Defensive-1.0 • Text Classification • Updated Sep 22, 2025 • 2.6k • 26
nvidia/Aegis-AI-Content-Safety-LlamaGuard-Permissive-1.0 • Text Classification • Updated Sep 22, 2025 • 42 • 18
ShieldGemma • Collection • ShieldGemma is a family of models for text and image content moderation. • 4 items • Updated Jul 10, 2025 • 11