---
license: apache-2.0
tags:
- prompt-injection
- security
- anomaly-detection
- sklearn
- isolation-forest
pipeline_tag: text-classification
---

# L5 Negative Selection — prompt-armor

An Isolation Forest anomaly detection model for catching zero-day prompt injection attacks: it learns what "normal" prompts look like and flags deviations from that baseline.

## Model Details

- **Algorithm**: scikit-learn `IsolationForest`
- **Training data**: 5,000 benign prompts from 5 public datasets
- **Features**: 11 statistical text features
- **Inference**: <1 ms (tree traversal)
- **File size**: ~1.1 MB

## Features Extracted

1. Word count
2. Character count
3. Sentence count
4. Average word length
5. Average sentence length
6. Imperative verb ratio
7. Question mark ratio
8. Special character density
9. Shannon entropy
10. Uppercase ratio
11. Unique word ratio (vocabulary diversity)
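
The canonical extractor is `_extract_l5_features` inside prompt-armor; the minimal sketch below only illustrates how these eleven features might be computed. The tokenization, sentence splitting, and imperative-verb list are illustrative assumptions, not the project's exact implementation.

```python
import math
import re
from collections import Counter

import numpy as np

# Illustrative stand-in; the real extractor may use a different verb list.
IMPERATIVE_VERBS = {"ignore", "disregard", "forget", "override", "reveal", "repeat"}

def extract_features_sketch(text: str) -> np.ndarray:
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)   # guards against division by zero
    n_chars = max(len(text), 1)
    n_sents = max(len(sentences), 1)

    # Shannon entropy over the character distribution
    counts = Counter(text)
    entropy = -sum((c / n_chars) * math.log2(c / n_chars) for c in counts.values())

    return np.array([
        len(words),                                                  # 1. word count
        len(text),                                                   # 2. character count
        len(sentences),                                              # 3. sentence count
        sum(len(w) for w in words) / n_words,                        # 4. average word length
        len(words) / n_sents,                                        # 5. average sentence length
        sum(w in IMPERATIVE_VERBS for w in words) / n_words,         # 6. imperative verb ratio
        text.count("?") / n_chars,                                   # 7. question mark ratio
        sum(not c.isalnum() and not c.isspace() for c in text) / n_chars,  # 8. special char density
        entropy,                                                     # 9. Shannon entropy
        sum(c.isupper() for c in text) / n_chars,                    # 10. uppercase ratio
        len(set(words)) / n_words,                                   # 11. unique word ratio
    ])
```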

## Usage

```python
import joblib

from prompt_armor.layers.l5_negative_selection import _extract_l5_features

data = joblib.load("l5_negative_selection.pkl")
model = data["model"]

# Extract the 11 statistical features and score the prompt.
features = _extract_l5_features("your text here")
raw_score = model.decision_function(features.reshape(1, -1))[0]

# Normalize to [0, 1]: more negative raw scores are more anomalous,
# so 1.0 is maximally anomalous and 0.0 is maximally normal.
score = (data["score_max"] - raw_score) / (data["score_max"] - data["score_min"])
score = max(0.0, min(1.0, score))
```
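
`score_min` and `score_max` are stored alongside the model; presumably they are the decision-score range observed on the benign training set. A hypothetical training sketch consistent with the details above (the hyperparameters and the stand-in prompts are assumptions, not the project's published recipe):

```python
import joblib
import numpy as np
from sklearn.ensemble import IsolationForest

from prompt_armor.layers.l5_negative_selection import _extract_l5_features

# Stand-ins for the 5,000 benign prompts from 5 public datasets.
benign_prompts = [
    "What is the capital of France?",
    "Summarize this article in two sentences.",
]

X = np.stack([_extract_l5_features(p) for p in benign_prompts])
model = IsolationForest(n_estimators=100, random_state=42).fit(X)

# Record the decision-score range on the training data; the Usage
# snippet above uses these bounds to map raw scores into [0, 1].
scores = model.decision_function(X)
joblib.dump(
    {"model": model, "score_min": float(scores.min()), "score_max": float(scores.max())},
    "l5_negative_selection.pkl",
)
```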

## Part of prompt-armor

This model is used by [prompt-armor](https://github.com/prompt-armor/prompt-armor) — an open-source prompt injection detector. The model file is downloaded automatically on first use.
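
If you prefer to fetch the pickle yourself rather than relying on the auto-download, huggingface_hub's standard download helper should work; the repo id below is a placeholder for wherever this model card is hosted:

```python
import joblib
from huggingface_hub import hf_hub_download

# Placeholder repo id; substitute the actual Hub repository hosting this file.
path = hf_hub_download(repo_id="<org>/<repo>", filename="l5_negative_selection.pkl")
data = joblib.load(path)
```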

## License

Apache 2.0