virtcaio committed on
Commit a0d2554 · verified · 1 Parent(s): 9751f93

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
+ ---
+ license: apache-2.0
+ tags:
+ - prompt-injection
+ - security
+ - anomaly-detection
+ - sklearn
+ - isolation-forest
+ pipeline_tag: text-classification
+ ---
+
+ # L5 Negative Selection — prompt-armor
+
+ An Isolation Forest anomaly detection model for flagging previously unseen ("zero-day") prompt injection attacks. It is trained only on benign prompts, so it learns what "normal" prompts look like and flags inputs that deviate from that profile.
+
+ ## Model Details
+
+ - **Algorithm**: scikit-learn `IsolationForest`
+ - **Training data**: 5,000 benign prompts from 5 public datasets
+ - **Features**: 11 statistical text features
+ - **Inference**: <1 ms (tree traversal)
+ - **File size**: ~1.1 MB
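Note on the model details above: the README does not say how the model was fitted or how `score_min` and `score_max` were obtained. The following is a minimal sketch of one plausible setup, not the project's training script. The random matrix stands in for the real 5,000 x 11 feature matrix, the hyperparameters are assumptions, and taking the score bounds from the training scores is also an assumption. Only the dictionary keys (`model`, `score_min`, `score_max`) come from the README's usage example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Stand-in for the real training matrix: 5,000 benign prompts x 11 features.
# (Synthetic data; the actual features come from the prompt text.)
X_benign = rng.normal(size=(5000, 11))

# Hyperparameters here are illustrative assumptions, not the project's values.
model = IsolationForest(n_estimators=100, random_state=0)
model.fit(X_benign)

# Assumed calibration: record the range of decision scores seen on benign data.
train_scores = model.decision_function(X_benign)
bundle = {
    "model": model,
    "score_min": float(train_scores.min()),
    "score_max": float(train_scores.max()),
}

# A point far from the training distribution should score lower (more anomalous)
# than a point at its centre.
far_point = np.full((1, 11), 8.0)
typical_point = np.zeros((1, 11))
far_score = model.decision_function(far_point)[0]
typical_score = model.decision_function(typical_point)[0]
```

A bundle like this could then be written with `joblib.dump(bundle, "l5_negative_selection.pkl")`, which would match the shape of the file loaded in the usage example.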
+
+ ## Features Extracted
+
+ 1. Word count
+ 2. Character count
+ 3. Sentence count
+ 4. Average word length
+ 5. Average sentence length
+ 6. Imperative verb ratio
+ 7. Question mark ratio
+ 8. Special character density
+ 9. Shannon entropy
+ 10. Uppercase ratio
+ 11. Unique word ratio (vocabulary diversity)
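Note on the feature list above: the README names the features but does not define them. As a rough illustration, three of them could be computed as below. These definitions are assumptions (character-level entropy, uppercase share among letters only, whitespace tokenization) and the function names are hypothetical; prompt-armor's own extractor may differ.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits per character (assumed definition)."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def uppercase_ratio(text: str) -> float:
    """Share of alphabetic characters that are uppercase (assumed definition)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch.isupper() for ch in letters) / len(letters)


def unique_word_ratio(text: str) -> float:
    """Distinct lowercase whitespace-separated tokens divided by total tokens."""
    words = text.lower().split()
    if not words:
        return 0.0
    return len(set(words)) / len(words)
```

Each function returns 0.0 for empty input so that a blank prompt still yields a finite feature vector.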
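+
+ ## Usage
+
+ ```python
+ import joblib
+ # Underscore-prefixed (private) helper from prompt-armor; treat it as subject to change.
+ from prompt_armor.layers.l5_negative_selection import _extract_l5_features
+
+ # joblib.load unpickles the file, so only load artifacts from a source you trust.
+ data = joblib.load("l5_negative_selection.pkl")
+ model = data["model"]
+
+ # Build the feature vector and reshape it to (1, n_features) for scikit-learn.
+ features = _extract_l5_features("your text here")
+ raw_score = model.decision_function(features.reshape(1, -1))[0]
+
+ # decision_function is lower (more negative) for more anomalous inputs.
+ # Rescale so that 0.0 is most normal and 1.0 is most anomalous, then clamp to [0, 1].
+ score = (data["score_max"] - raw_score) / (data["score_max"] - data["score_min"])
+ score = max(0.0, min(1.0, score))
+ ```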
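Note on the normalization step in the usage snippet: the arithmetic can be checked in isolation. The sketch below wraps the same formula in a hypothetical helper, `normalize_score`, and applies it to made-up calibration bounds of -0.2 and 0.2. Both the helper name and the bounds are assumptions for illustration, and the guard against an empty score range is an addition that the README snippet does not include.

```python
def normalize_score(raw_score: float, score_min: float, score_max: float) -> float:
    """Map a raw decision score to [0, 1], where 1.0 is most anomalous."""
    span = score_max - score_min
    if span <= 0:
        # Added guard (not in the README snippet): avoid division by zero.
        raise ValueError("score_max must be greater than score_min")
    score = (score_max - raw_score) / span
    return max(0.0, min(1.0, score))


# Hypothetical calibration bounds, chosen only to make the arithmetic easy to follow.
score_min, score_max = -0.2, 0.2

most_normal = normalize_score(0.2, score_min, score_max)       # raw == score_max
midpoint = normalize_score(0.0, score_min, score_max)          # halfway between bounds
most_anomalous = normalize_score(-0.2, score_min, score_max)   # raw == score_min
out_of_range = normalize_score(-0.5, score_min, score_max)     # below score_min, clamped
```

With these bounds, a raw score equal to `score_max` maps to 0.0, a raw score equal to `score_min` maps to 1.0, and anything more negative than `score_min` is clamped to 1.0.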
+
+ ## Part of prompt-armor
+
+ This model is used by [prompt-armor](https://github.com/prompt-armor/prompt-armor), an open-source prompt injection detector. prompt-armor downloads it automatically on first use.
+
+ ## License
+
+ Apache 2.0