moderation-prompts
mmathys/openai-moderation-api-evaluation
Dataset • 1.68k rows • 1.44k downloads • 35 likes
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
Paper • arXiv:2406.18495 • 13 upvotes
ShieldGemma: Generative AI Content Moderation Based on Gemma
Paper • arXiv:2407.21772 • 14 upvotes
PKU-Alignment/BeaverTails
Dataset • 364k rows • 15k downloads • 101 likes
AgentPublic/camembert-base-toxic-fr-user-prompts
Text Classification • 0.1B parameters • 124 downloads • 7 likes
meta-llama/Llama-Guard-3-8B
Text Generation • 8B parameters • 116k downloads • 288 likes
davanstrien/aart-ai-safety-dataset
Dataset • 3.27k rows • 32 downloads • 2 likes