rudycaz committed on
Commit 3331bc8 · verified · 1 Parent(s): 86e7654

Update README.md

Files changed (1)
  1. README.md +60 -24
README.md CHANGED
@@ -1,26 +1,62 @@
  # ModernBERT Phishing Detector
 
- This project fine-tunes ModernBERT-base for phishing email detection.
-
- Artifacts:
- - `models/modernbert_phish/` -> PyTorch fine-tuned model
- - `onnx/modernbert_phish/model.onnx` -> ONNX export
- - `onnx/modernbert_phish/model.int8.onnx` -> quantized ONNX export
- - `models/modernbert_phish/calibration.json` -> score calibration values
-
- Scoring:
- - margin = phish_logit - safe_logit
- - probability = sigmoid(coef * margin + intercept)
- - score_0_10 = round(10 * probability)
- - score_1_10 = max(1, round(10 * probability))
-
- Suggested UI colors:
- - 0-2 = green
- - 3-5 = yellow
- - 6-7 = orange
- - 8-10 = red
-
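The scoring rules and color bands in the removed lines above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, `coef` and `intercept` are assumed to be the two values stored in `calibration.json` (the file's real key names are not shown in this diff), and `round` here is Python's built-in, which rounds exact ties to the nearest even integer. The README does not say which tie rule it intends.

```python
import math


def calibrated_probability(phish_logit: float, safe_logit: float,
                           coef: float, intercept: float) -> float:
    """probability = sigmoid(coef * margin + intercept), margin = phish - safe."""
    margin = phish_logit - safe_logit
    z = coef * margin + intercept
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def risk_scores(probability: float) -> tuple[int, int]:
    """Return (score_0_10, score_1_10) as defined in the README."""
    score_0_10 = round(10 * probability)
    score_1_10 = max(1, score_0_10)
    return score_0_10, score_1_10


def ui_color(score: int) -> str:
    """Map a 0-10 score to the suggested color band."""
    if score <= 2:
        return "green"
    if score <= 5:
        return "yellow"
    if score <= 7:
        return "orange"
    return "red"


# Hypothetical calibration values and logits, for illustration only.
p = calibrated_probability(phish_logit=2.0, safe_logit=-1.0, coef=1.0, intercept=0.0)
score_0_10, score_1_10 = risk_scores(p)
print(score_0_10, score_1_10, ui_color(score_0_10))
```

The only difference between the two scales is the floor: `score_1_10` never shows 0, which may suit a UI that should not display a zero-risk reading.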
- Evidence extraction:
- - split the email into sentences
- - score each sentence independently
- - return the highest-risk sentence as the explanation text
+ ---
+ language:
+ - en
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: text-classification
+ tags:
+ - cybersecurity
+ - phishing-detection
+ - email-security
+ - text-classification
+ - onnx
+ - int8
+ - modernbert
+ base_model: answerdotai/ModernBERT-base
+ base_model_relation: finetune
+ widget:
+ - text: "Subject: Security Alert\n\nBody:\nYour account has been locked. Please reply with your password immediately to restore access."
+   example_title: "Phishing-like email"
+ - text: "Subject: Team Lunch Reminder\n\nBody:\nReminder that the team lunch is tomorrow at 12:30 PM in the office kitchen."
+   example_title: "Benign email"
+ ---
+
  # ModernBERT Phishing Detector
 
+ ## Model description
+
+ This model is a fine-tuned **ModernBERT-base** binary sequence classifier for **phishing email detection**. It takes a full email as input text and predicts whether the email is **safe** or **phishing**.
+
+ The training backbone is **`answerdotai/ModernBERT-base`**, and the final release includes:
+ - a fine-tuned PyTorch checkpoint
+ - an ONNX export
+ - a quantized INT8 ONNX export
+ - a calibration file for mapping logits to a user-facing phishing score
+
+ ## Intended use
+
+ This model is intended for:
+ - phishing detection in email text
+ - mobile or backend inference through ONNX Runtime
+ - UI risk scoring, such as a **0–10** or **1–10** phishing scale
+ - evidence extraction via sentence-level rescoring
+
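Sentence-level rescoring for evidence extraction could look roughly like the sketch below: split the email into sentences, score each one on its own, and return the highest-scoring sentence as the explanation text. Everything named here is an assumption for illustration. `score_fn` stands in for a call to the classifier, `toy_score` is a hypothetical keyword counter used only so the example runs without the model, and the regex splitter is a simple placeholder, since the README does not say how sentences are segmented.

```python
import re
from typing import Callable, Optional


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace, and on newlines."""
    parts = re.split(r"(?<=[.!?])\s+|\n+", text)
    return [p.strip() for p in parts if p.strip()]


def highest_risk_sentence(
    email_text: str, score_fn: Callable[[str], float]
) -> Optional[tuple[str, float]]:
    """Score every sentence independently and return (sentence, score) for the riskiest."""
    sentences = split_sentences(email_text)
    if not sentences:
        return None
    scored = [(score_fn(s), s) for s in sentences]
    best_score, best_sentence = max(scored, key=lambda pair: pair[0])
    return best_sentence, best_score


def toy_score(sentence: str) -> float:
    """Hypothetical stand-in for the model: counts a few suspicious keywords."""
    keywords = ("password", "immediately", "locked")
    lowered = sentence.lower()
    return float(sum(word in lowered for word in keywords))


email = (
    "Subject: Security Alert\n\nBody:\n"
    "Your account has been locked. "
    "Please reply with your password immediately to restore access."
)
print(highest_risk_sentence(email, toy_score))
```

In a real pipeline, `score_fn` would presumably run the model on the sentence and return its calibrated phishing probability, so each sentence costs one extra forward pass.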
+ This model is **not** intended for:
+ - malware analysis
+ - attachment sandboxing
+ - URL detonation
+ - image/PDF threat inspection
+ - general prompt-injection detection
+ - fully explainable token-level rationale extraction
+
+ ## Inputs
+
+ The model expects a single text string representing the email content.
+
+ Example format:
+
+ ```text
+ Subject: Urgent Account Notice
+
+ Body:
+ Your account has been locked. Please reply with your password immediately to restore access.