Model Card for PhishingDistilBERT
Model Summary
PhishingDistilBERT is a DistilBERT-based NLP model fine-tuned specifically for email understanding tasks, particularly phishing and suspicious email detection.
The model introduces custom special tokens to explicitly encode email structure such as subject, body, links, and phone numbers, making it more robust for email-based security applications.
It can be used both as:
- a sequence classification model for email safety detection, and
- an embedding generator for downstream ML pipelines (e.g., XGBoost).
Model Details
Model Description
This model is fine-tuned from distilbert-base-uncased on curated email datasets. During preprocessing, email-specific entities such as URLs and phone numbers are replaced with dedicated tokens, and the subject and body are explicitly separated using structural markers.
Special Tokens Used
- [SSUB], [ESUB] → Start/End of Subject
- [SBODY], [EBODY] → Start/End of Body
- [LINK] → URLs
- [PHONE] → Phone numbers
These design choices help the model better learn semantic and structural patterns commonly found in phishing emails.
- Developed by: Atharva Gaykar
- Model type: Transformer-based text classification & embedding model
- Language: English
- License: Artistic-2.0
- Finetuned from: distilbert/distilbert-base-uncased
Intended Uses
Primary Use Cases
- Phishing email classification
- Suspicious vs safe email detection
- Feature extraction for traditional ML models
- Email embedding generation for downstream classifiers
Out-of-Scope Uses
- Non-text email analysis (images, attachments)
- Commercial deployment without proper evaluation and compliance
- Tasks unrelated to email or message-level text analysis
Bias, Risks, and Limitations
- The model is trained on public phishing datasets and may reflect biases present in those sources.
- Performance may degrade on highly obfuscated or novel phishing techniques.
- Not recommended for direct commercial use without extensive validation.
Users should carefully evaluate the model in their target environment before deployment.
How to Get Started
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
import torch
import numpy as np
bert_path = "Gaykar/PhishingDistilBERT"
tokenizer = DistilBertTokenizerFast.from_pretrained(bert_path)
model = DistilBertForSequenceClassification.from_pretrained(bert_path)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()
def get_cls_embedding(text, model, tokenizer, device):
    with torch.no_grad():
        inputs = tokenizer(
            text,
            return_tensors="pt",
            truncation=True,
            padding=True,
            max_length=256
        )
        inputs = {k: v.to(device) for k, v in inputs.items()}
        outputs = model.distilbert(**inputs)
        cls_embedding = outputs.last_hidden_state[:, 0, :].squeeze().cpu().numpy()
    return cls_embedding
text = "[SSUB] Urgent Account Alert [ESUB] [SBODY] Click [LINK] to verify your account. [EBODY]"
embedding = get_cls_embedding(text, model, tokenizer, device)
print("Embedding shape:", embedding.shape)
print("First 10 dimensions:", embedding[:10])
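For safety classification rather than embedding extraction, the sequence classification head can be used directly. The sketch below reuses the model, tokenizer, and device loaded above; the label mapping (e.g. 0 = safe, 1 = suspicious) is an assumption, so consult model.config.id2label for the actual class names.

```python
import torch

def predict(text, model, tokenizer, device):
    # Tokenize with the same settings used for embedding extraction above
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=256,
    )
    inputs = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        # Forward pass through the full model, including the classification head
        logits = model(**inputs).logits
    # Return the index of the highest-scoring class
    return int(logits.argmax(dim=-1).item())

# Example (uses the model/tokenizer/device loaded above):
# label = predict(text, model, tokenizer, device)
```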
Training Details
Training Data
The model was trained using well-known phishing and email security datasets, including CEAS, combined with additional curated CSV sources.
Data Preprocessing
- Cleaned and merged multiple CSV datasets
- Replaced URLs with [LINK] and phone numbers with [PHONE]
- Combined subject and body using structural tokens: [SSUB], [ESUB], [SBODY], [EBODY]
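The preprocessing steps above can be sketched as a small helper. This function and its regexes are illustrative, not the released preprocessing code; only the token names come from this model card.

```python
import re

def format_email(subject: str, body: str) -> str:
    # Replace URLs and phone-number-like spans with the dedicated tokens
    body = re.sub(r"https?://\S+|www\.\S+", "[LINK]", body)
    body = re.sub(r"\+?\d[\d\s().-]{7,}\d", "[PHONE]", body)
    # Wrap subject and body in the structural markers the model expects
    return f"[SSUB] {subject} [ESUB] [SBODY] {body} [EBODY]"

print(format_email(
    "Urgent Account Alert",
    "Click https://evil.example/verify or call +1 (555) 123-4567."
))
```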
Training Hyperparameters
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./distilbert_safe_suspicious",
    eval_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=50,
    save_total_limit=3,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=4e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    num_train_epochs=4,
    weight_decay=0.01,
    logging_strategy="steps",
    logging_steps=50,
    seed=42,
)
Evaluation
Evaluation Metrics
- Accuracy
- F1 Score
Testing Setup
- 10% held-out test split from the full dataset
Results
- DistilBERT (standalone): Strong classification performance
- DistilBERT embeddings + XGBoost + URL features: 99.4% accuracy
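The embeddings-plus-classifier pipeline follows a simple pattern: extract CLS embeddings (e.g. via get_cls_embedding above), then train a separate classifier on them. The sketch below uses scikit-learn's LogisticRegression as a stand-in for XGBoost and random vectors as a stand-in for real 768-dimensional DistilBERT embeddings; it illustrates the pipeline shape only, not the reported 99.4% result.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins for CLS embeddings of safe vs. phishing emails
rng = np.random.default_rng(42)
X_safe = rng.normal(loc=0.0, scale=1.0, size=(50, 768))
X_phish = rng.normal(loc=0.5, scale=1.0, size=(50, 768))
X = np.vstack([X_safe, X_phish])
y = np.array([0] * 50 + [1] * 50)  # 0 = safe, 1 = phishing

# Train a downstream classifier on the embedding features
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", clf.score(X, y))
```

In practice, additional hand-crafted features (such as URL statistics) can be concatenated to the embedding vectors before training.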
Technical Specifications
Model Architecture
- DistilBERT encoder
- Sequence classification head
- CLS-token embedding extraction supported
Compute Infrastructure
- Hardware: NVIDIA T4 GPU
- Frameworks: PyTorch, Hugging Face Transformers
Environmental Impact
Carbon emissions were not explicitly measured. Users may estimate emissions using the Machine Learning Impact Calculator if needed.
Model Card Authors
- Atharva Gaykar
Contact
For questions, feedback, or research collaboration, please reach out via the Hugging Face model repository.