moderation-prompts
updated
mmathys/openai-moderation-api-evaluation
Viewer
• Updated
• 1.68k • 236
• 35
Viewer
• Updated
• 169k • 17.6k
• 1.66k
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
Paper
• 2406.18495
• Published
• 13
ShieldGemma: Generative AI Content Moderation Based on Gemma
Paper
• 2407.21772
• Published
• 14
Viewer
• Updated
• 1M • 6.98k
• 841
PKU-Alignment/BeaverTails
Viewer
• Updated
• 364k • 14.2k
• 95
AgentPublic/camembert-base-toxic-fr-user-prompts
Text Classification
• 0.1B • Updated
• 94
• 7
Viewer
• Updated
• 30.4k • 615
• 27
meta-llama/Llama-Guard-3-8B
Text Generation
• 8B • Updated
• 122k
• 269
davanstrien/aart-ai-safety-dataset
Viewer
• Updated
• 3.27k • 14
• 2
Viewer
• Updated
• 520 • 10k
• 85