---
license: artistic-2.0
language:
- en
base_model:
- distilbert/distilbert-base-uncased
---

# Model Card for PhishingDistilBERT

## Model Summary

**PhishingDistilBERT** is a DistilBERT-based NLP model fine-tuned specifically for email understanding tasks, particularly phishing and suspicious email detection.  
The model introduces **custom special tokens** to explicitly encode email structure such as subject, body, links, and phone numbers, making it more robust for email-based security applications.

It can be used both as:
- a **sequence classification model** for email safety detection, and  
- an **embedding generator** for downstream ML pipelines (e.g., XGBoost).

---

## Model Details

### Model Description

This model is fine-tuned from `distilbert-base-uncased` on curated email datasets. During preprocessing, email-specific entities such as URLs and phone numbers are replaced with dedicated tokens, and the subject and body are explicitly separated using structural markers.

**Special Tokens Used**
- `[SSUB]`, `[ESUB]` – Start/End of Subject  
- `[SBODY]`, `[EBODY]` – Start/End of Body  
- `[LINK]` – URLs  
- `[PHONE]` – Phone numbers  

These design choices help the model better learn semantic and structural patterns commonly found in phishing emails.

- **Developed by:** Atharva Gaykar  
- **Model type:** Transformer-based text classification & embedding model  
- **Language:** English  
- **License:** Artistic-2.0  
- **Finetuned from:** distilbert/distilbert-base-uncased  

---

## Intended Uses

### Primary Use Cases
- Phishing email classification  
- Suspicious vs. safe email detection  
- Feature extraction for traditional ML models  
- Email embedding generation for downstream classifiers  

### Out-of-Scope Uses
- Non-text email analysis (images, attachments)  
- Commercial deployment without proper evaluation and compliance  
- Tasks unrelated to email or message-level text analysis  

---

## Bias, Risks, and Limitations

- The model is trained on public phishing datasets and may reflect biases present in those sources.
- Performance may degrade on highly obfuscated or novel phishing techniques.
- Not recommended for direct commercial use without extensive validation.

Users should carefully evaluate the model in their target environment before deployment.

---

## How to Get Started

```python
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
import torch
import numpy as np

bert_path = "Gaykar/PhishingDistilBERT"

tokenizer = DistilBertTokenizerFast.from_pretrained(bert_path)
model = DistilBertForSequenceClassification.from_pretrained(bert_path)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

def get_cls_embedding(text, model, tokenizer, device):
    with torch.no_grad():
        inputs = tokenizer(
            text,
            return_tensors="pt",
            truncation=True,
            padding=True,
            max_length=256
        )
        inputs = {k: v.to(device) for k, v in inputs.items()}
        outputs = model.distilbert(**inputs)
        cls_embedding = outputs.last_hidden_state[:, 0, :].squeeze().cpu().numpy()
    return cls_embedding

text = "[SSUB] Urgent Account Alert [ESUB] [SBODY] Click [LINK] to verify your account. [EBODY]"
embedding = get_cls_embedding(text, model, tokenizer, device)

print("Embedding shape:", embedding.shape)
print("First 10 dimensions:", embedding[:10])
```
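For the classification use case, the loaded model's logits can be converted to class probabilities directly. A minimal sketch reusing the `model`, `tokenizer`, and `device` set up above; the label mapping (0 = safe, 1 = suspicious) is an assumption here — check `model.config.id2label` for the actual label names:

```python
import torch

def classify_email(text, model, tokenizer, device):
    """Return (predicted_label_id, class_probabilities) for one email."""
    with torch.no_grad():
        inputs = tokenizer(
            text,
            return_tensors="pt",
            truncation=True,
            padding=True,
            max_length=256,
        )
        inputs = {k: v.to(device) for k, v in inputs.items()}
        logits = model(**inputs).logits
        # Softmax over the label dimension gives per-class probabilities.
        probs = torch.softmax(logits, dim=-1).squeeze()
    return int(probs.argmax()), probs.tolist()
```

Calling `classify_email(text, model, tokenizer, device)` on a preprocessed email string then yields the predicted label id together with its probability distribution.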

---

## Training Details

### Training Data

The model was trained using well-known phishing and email security datasets, including **CEAS**, combined with additional curated CSV sources.

### Data Preprocessing

1. Cleaned and merged multiple CSV datasets
2. Replaced:

   * URLs → `[LINK]`
   * Phone numbers → `[PHONE]`
3. Combined subject and body using structural tokens:

   * `[SSUB]`, `[ESUB]`, `[SBODY]`, `[EBODY]`
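The steps above can be sketched as follows. The regex patterns are illustrative assumptions — the exact patterns used during training are not published:

```python
import re

# Assumed patterns for URLs and phone numbers; adjust for your data.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
PHONE_RE = re.compile(r"\+?\d[\d\-\s()]{7,}\d")

def preprocess_email(subject: str, body: str) -> str:
    """Replace URLs/phone numbers with tokens and wrap subject/body
    in the structural markers used during training."""
    def clean(text: str) -> str:
        text = URL_RE.sub("[LINK]", text)
        text = PHONE_RE.sub("[PHONE]", text)
        return text.strip()

    return f"[SSUB] {clean(subject)} [ESUB] [SBODY] {clean(body)} [EBODY]"

example = preprocess_email(
    "Urgent Account Alert",
    "Click https://evil.example/verify or call +1 800-555-0199 now.",
)
# → "[SSUB] Urgent Account Alert [ESUB] [SBODY] Click [LINK] or call [PHONE] now. [EBODY]"
```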

### Training Hyperparameters

```python
training_args = TrainingArguments(
    output_dir="./distilbert_safe_suspicious",
    eval_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=50,
    save_total_limit=3,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=4e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    num_train_epochs=4,
    weight_decay=0.01,
    logging_strategy="steps",
    logging_steps=50,
    seed=42,
)
```

---

## Evaluation

### Evaluation Metrics

* Accuracy
* F1 Score
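
These metrics can be computed during training via the `Trainer`'s `compute_metrics` callback. A hypothetical sketch consistent with the reported metrics — the exact function used is not published:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    """Compute accuracy and (binary) F1 from Trainer eval predictions."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds),
    }
```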

### Testing Setup

* 10% held-out test split from the full dataset

### Results

* **DistilBERT (standalone):** Strong classification performance
* **DistilBERT embeddings + XGBoost + URL features:**
  **99.4% accuracy**
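
The hybrid pipeline concatenates the CLS embeddings with hand-crafted URL features before fitting an XGBoost classifier. A shape-level sketch — the specific URL features named below are assumptions, not documented:

```python
import numpy as np

def build_feature_matrix(embeddings: np.ndarray, url_features: np.ndarray) -> np.ndarray:
    """Concatenate (n, 768) CLS embeddings with (n, k) URL features
    into a single (n, 768 + k) matrix."""
    return np.hstack([embeddings, url_features])

# Toy shapes: 4 emails, 768-dim embeddings, 3 assumed URL features
# (e.g. link count, has-IP-host flag, URL length).
emb = np.zeros((4, 768))
urls = np.zeros((4, 3))
X = build_feature_matrix(emb, urls)
```

The resulting matrix `X` can then be passed to `xgboost.XGBClassifier().fit(X, y)` alongside the binary labels.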

![Evaluation Result](https://cdn-uploads.huggingface.co/production/uploads/685998a37db0a027171ecb9f/Dr3okP_bmVOxHgeaqIQDM.png)

---

## Technical Specifications

### Model Architecture

* DistilBERT encoder
* Sequence classification head
* CLS-token embedding extraction supported

### Compute Infrastructure

* **Hardware:** NVIDIA T4 GPU
* **Frameworks:** PyTorch, Hugging Face Transformers

---

## Environmental Impact

Carbon emissions were not explicitly measured.
Users may estimate emissions using the Machine Learning Impact Calculator if needed.

---

## Model Card Authors

* **Atharva Gaykar**

---

## Contact

For questions, feedback, or research collaboration, please reach out via the Hugging Face model repository.

---