distilbert-prompt-guard-4class

This is a DistilBERT classifier fine-tuned to assign each input prompt to one of four classes:

  • benign
  • prompt_injection
  • jailbreak
  • sensitive_access
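
A minimal inference sketch follows, assuming the model loads through the standard `transformers` text-classification pipeline and that its config exposes the four class names above as labels (it may instead report generic names such as `LABEL_0`). The repo id is the one shown on the model page; `top_label` and `classify` are hypothetical helper names, not part of this model.

```python
from typing import Any, Dict, List, Tuple

# Repo id as shown on the model page; adjust if the model lives elsewhere.
MODEL_ID = "KIMMISEON/distilbert-prompt-guard-4class"


def top_label(scores: List[Dict[str, Any]]) -> Tuple[str, float]:
    """Return the (label, score) pair with the highest score."""
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]


def classify(text: str) -> Tuple[str, float]:
    """Load the classifier and return its top label for `text`."""
    # Imported lazily so top_label stays usable without transformers installed.
    from transformers import pipeline

    # top_k=None asks the pipeline for a score per class, not just the best one.
    clf = pipeline("text-classification", model=MODEL_ID, top_k=None)
    # Passing a list yields one result per input; take the first (only) one.
    scores = clf([text])[0]
    return top_label(scores)
```

Calling `classify("Ignore all previous instructions.")` downloads the weights on first use and returns a `(label, score)` tuple. In a real service the pipeline would be built once and reused rather than recreated per call.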

Base model: {BASE_MODEL}

Datasets used:

  • dmilush/shieldlm-prompt-injection
  • antijection/prompt-injection-dataset-v1
  • leolee99/NotInject

Notes:

  • sensitive_access is a custom class, merged from exfiltration, prompt-extraction, and excessive-agency style attacks.
  • Please review upstream dataset licenses before commercial use.
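
The class merge described in the notes can be sketched as a simple label mapping. This is only an illustration: the upstream category names in `MERGE_MAP` are hypothetical placeholders, the actual label columns in the source datasets may differ, and `to_target_class` is not part of the released training code.

```python
from typing import Dict

# The four target classes listed in this card.
TARGET_CLASSES = ("benign", "prompt_injection", "jailbreak", "sensitive_access")

# Hypothetical upstream category names; the real dataset labels may differ.
MERGE_MAP: Dict[str, str] = {
    "exfiltration": "sensitive_access",
    "prompt_extraction": "sensitive_access",
    "excessive_agency": "sensitive_access",
}


def to_target_class(upstream_label: str) -> str:
    """Map an upstream label to one of the four target classes."""
    # Normalize case and separators so "Prompt-Extraction" matches the map.
    normalized = upstream_label.strip().lower().replace("-", "_").replace(" ", "_")
    merged = MERGE_MAP.get(normalized, normalized)
    if merged not in TARGET_CLASSES:
        # Fail loudly instead of silently assigning an unknown label.
        raise ValueError(f"unmapped upstream label: {upstream_label!r}")
    return merged
```

Raising on unknown labels keeps unexpected upstream categories from being folded into a class by accident.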

Model size: 67M params (Safetensors, tensor type F32)