Daniel Bis
danielbis
AI & ML interests
https://scholar.google.com/citations?user=ArMgXHYAAAAJ&hl=en
Organizations
None yet
safety
- agentlans/prompt-safety-classification
  Viewer • Updated • 72.1k • 42
- Jammies-io/safety-refusal
  Viewer • Updated • 100 • 7
- RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models
  Paper • 2510.10390 • Published • 5
- nvidia/Aegis-AI-Content-Safety-Dataset-2.0
  Viewer • Updated • 33.4k • 4.72k • 79
models 0
None public yet
datasets 0
None public yet