Text Classification
Transformers
Safetensors
English
Japanese
distilbert
prompt-injection
security
text-embeddings-inference
Instructions to use kanekoyuichi/promptgate-classifier-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use kanekoyuichi/promptgate-classifier-v2 with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-classification", model="kanekoyuichi/promptgate-classifier-v2")# Load model directly from transformers import AutoTokenizer, AutoModelForSequenceClassification tokenizer = AutoTokenizer.from_pretrained("kanekoyuichi/promptgate-classifier-v2") model = AutoModelForSequenceClassification.from_pretrained("kanekoyuichi/promptgate-classifier-v2") - Notebooks
- Google Colab
- Kaggle
promptgate-classifier-v2
Fine-tuned DistilBERT sequence classifier for PromptGate prompt-injection screening.
Labels:
SAFE: benign inputATTACK: prompt-injection-like input
This model is intended for use through PromptGate:
from promptgate import PromptGate
gate = PromptGate(detectors=["rule", "classifier"])
result = gate.scan("Ignore all previous instructions.")
Reference holdout results used during development:
| Detector | Recall | Specificity | Precision | Accuracy |
|---|---|---|---|---|
| classifier v2 @ 0.5 | 92.5% | 85.0% | 86.0% | 88.8% |
These numbers are reference values for a fixed development holdout and are not a guarantee of production performance.
- Downloads last month
- 10