Update README.md
README.md
CHANGED
|
@@ -4,7 +4,7 @@ emoji: π¬
|
|
| 4 |
colorFrom: yellow
|
| 5 |
colorTo: purple
|
| 6 |
sdk: gradio
|
| 7 |
-
sdk_version:
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
hf_oauth: true
|
|
@@ -14,4 +14,57 @@ license: mit
|
|
| 14 |
short_description: 'Interactive demo of phishing email detection with AI, based '
|
| 15 |
---
|
| 16 |
|
| 17 |
-
|
| 4 |
colorFrom: yellow
|
| 5 |
colorTo: purple
|
| 6 |
sdk: gradio
|
| 7 |
+
sdk_version: 6.2.0
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
hf_oauth: true
|
|
|
|
| 14 |
short_description: 'Interactive demo of phishing email detection with AI, based '
|
| 15 |
---
|
| 16 |
|
| 17 |
+
## 🧠 How it works
|
| 18 |
+
|
| 19 |
+
- The user provides the raw text of an email.
|
| 20 |
+
- The text is processed by a **fine-tuned BERT model** for binary classification:
|
| 21 |
+
- `PHISHING`
|
| 22 |
+
- `SAFE`
|
| 23 |
+
- The model outputs a label and a confidence score.
|
| 24 |
+
- A threshold-based policy is applied:
|
| 25 |
+
- High-confidence phishing → 🚨 **PHISHING**
|
| 26 |
+
- High-confidence safe → 🟢 **LOW RISK**
|
| 27 |
+
- Intermediate confidence → 🟠 **REVIEW recommended**
|
| 28 |
+
|
| 29 |
+
This approach reflects a **security-oriented mindset**, where uncertain cases are intentionally flagged for manual review.
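
The three-way policy described above can be sketched as a small pure function. This is a minimal illustration, not the Space's actual code: the function name `apply_policy` and the `0.90` default thresholds are assumptions, since the README does not state the cut-off values used.

```python
def apply_policy(
    label: str,
    score: float,
    phishing_threshold: float = 0.90,  # assumed value, not taken from the README
    safe_threshold: float = 0.90,      # assumed value, not taken from the README
) -> str:
    """Map a (label, confidence) pair to one of the three verdicts."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")

    normalized = label.strip().upper()
    if normalized == "PHISHING" and score >= phishing_threshold:
        return "PHISHING"
    if normalized == "SAFE" and score >= safe_threshold:
        return "LOW RISK"
    # Low confidence, or a label we do not recognize: defer to a human.
    return "REVIEW recommended"
```

With these assumed thresholds, `apply_policy("SAFE", 0.62)` returns `"REVIEW recommended"`, because the model's confidence falls below the cut-off. Routing unrecognized labels to review as well keeps the policy conservative if the model's label names ever change.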
|
| 30 |
+
|
| 31 |
+
---
|
| 32 |
+
|
| 33 |
+
## 🔬 Model and data
|
| 34 |
+
|
| 35 |
+
- **Model**: `ElSlay/BERT-Phishing-Email-Model`
|
| 36 |
+
- **Task**: Text classification (phishing vs safe)
|
| 37 |
+
- **Dataset**:
|
| 38 |
+
[`zefang-liu/phishing-email-dataset`](https://huggingface.co/datasets/zefang-liu/phishing-email-dataset)
|
| 39 |
+
|
| 40 |
+
The model is used in **inference-only mode**; no training is performed within this demo.
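
Inference-only use of the model might look like the sketch below, which assumes the Space relies on the `transformers` text-classification pipeline (the README does not say how the model is loaded). The helper names `load_classifier` and `classify_email` are hypothetical. The label strings returned depend on the model's configuration and may need mapping to `PHISHING` / `SAFE`.

```python
from typing import Any, Callable, Tuple

MODEL_ID = "ElSlay/BERT-Phishing-Email-Model"


def load_classifier(model_id: str = MODEL_ID) -> Callable[..., Any]:
    """Build a text-classification pipeline; weights are downloaded on first use."""
    # Imported lazily so the rest of the module works without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_email(text: str, classifier: Callable[..., Any]) -> Tuple[str, float]:
    """Return (label, confidence) for the raw text of one email."""
    if not text.strip():
        raise ValueError("email text is empty")
    # truncation=True clips inputs longer than the model's maximum sequence length.
    result = classifier(text, truncation=True)[0]
    return str(result["label"]), float(result["score"])
```

A call such as `classify_email(raw_text, load_classifier())` runs a forward pass only; nothing here updates the model's weights. Passing the classifier in as an argument also makes the function easy to exercise with a stub.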
|
| 41 |
+
|
| 42 |
+
---
|
| 43 |
+
|
| 44 |
+
## 🎯 Project context
|
| 45 |
+
|
| 46 |
+
This demo is part of a broader experimental effort related to **PhishForge**, an evolving framework focused on phishing analysis and threat detection using AI-driven techniques.
|
| 47 |
+
|
| 48 |
+
The goal is to explore how NLP models can be integrated into practical cybersecurity workflows in a transparent and interpretable way.
|
| 49 |
+
|
| 50 |
+
---
|
| 51 |
+
|
| 52 |
+
## Notes and limitations
|
| 53 |
+
|
| 54 |
+
- The analysis is **content-based only**.
|
| 55 |
+
- Email headers, metadata, URL reputation, and attachments are not evaluated.
|
| 56 |
+
- The demo is intended for **educational and experimental purposes**, not for production use.
|
| 57 |
+
|
| 58 |
+
---
|
| 59 |
+
|
| 60 |
+
## Citation
|
| 61 |
+
|
| 62 |
+
A citable version of this project is available via Zenodo: https://huggingface.co/spaces/giulcs008/phishing-email-detector
|
| 63 |
+
|
| 64 |
+
|
| 65 |
+
---
|
| 66 |
+
|
| 67 |
+
## 👤 Author
|
| 68 |
+
|
| 69 |
+
**Giulia Casaldi**
|
| 70 |
+
Cybersecurity & AI
|