Model Card for PhishingDistilBERT

Model Summary

PhishingDistilBERT is a DistilBERT-based NLP model fine-tuned specifically for email understanding tasks, particularly phishing and suspicious email detection.
The model introduces custom special tokens to explicitly encode email structure such as subject, body, links, and phone numbers, making it more robust for email-based security applications.

It can be used both as:

  • a sequence classification model for email safety detection, and
  • an embedding generator for downstream ML pipelines (e.g., XGBoost).

Model Details

Model Description

This model is fine-tuned from distilbert-base-uncased on curated email datasets. During preprocessing, email-specific entities such as URLs and phone numbers are replaced with dedicated tokens, and the subject and body are explicitly separated using structural markers.

Special Tokens Used

  • [SSUB], [ESUB] – Start/End of Subject
  • [SBODY], [EBODY] – Start/End of Body
  • [LINK] – URLs
  • [PHONE] – Phone numbers

These design choices help the model better learn semantic and structural patterns commonly found in phishing emails.
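
If these markers were registered as additional special tokens when the checkpoint was saved (an assumption based on the description above), the tokenizer should keep each marker intact as a single token rather than splitting it into word pieces. A quick sanity check:

from transformers import DistilBertTokenizerFast

tokenizer = DistilBertTokenizerFast.from_pretrained("Gaykar/PhishingDistilBERT")

# Each structural marker should survive tokenization as one piece.
sample = "[SSUB] Urgent Account Alert [ESUB]"
print(tokenizer.tokenize(sample))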

  • Developed by: Atharva Gaykar
  • Model type: Transformer-based text classification & embedding model
  • Language: English
  • License: Artistic-2.0
  • Finetuned from: distilbert/distilbert-base-uncased

Intended Uses

Primary Use Cases

  • Phishing email classification
  • Suspicious vs safe email detection
  • Feature extraction for traditional ML models
  • Email embedding generation for downstream classifiers

Out-of-Scope Uses

  • Non-text email analysis (images, attachments)
  • Commercial deployment without proper evaluation and compliance
  • Tasks unrelated to email or message-level text analysis

Bias, Risks, and Limitations

  • The model is trained on public phishing datasets and may reflect biases present in those sources.
  • Performance may degrade on highly obfuscated or novel phishing techniques.
  • Not recommended for direct commercial use without extensive validation.

Users should carefully evaluate the model in their target environment before deployment.


How to Get Started

from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
import torch
import numpy as np

bert_path = "Gaykar/PhishingDistilBERT"

tokenizer = DistilBertTokenizerFast.from_pretrained(bert_path)
model = DistilBertForSequenceClassification.from_pretrained(bert_path)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

def get_cls_embedding(text, model, tokenizer, device):
    with torch.no_grad():
        inputs = tokenizer(
            text,
            return_tensors="pt",
            truncation=True,
            padding=True,
            max_length=256
        )
        inputs = {k: v.to(device) for k, v in inputs.items()}
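        # Run only the DistilBERT encoder (bypassing the classification
        # head) so the raw hidden states of the CLS token are available.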
        outputs = model.distilbert(**inputs)
        cls_embedding = outputs.last_hidden_state[:, 0, :].squeeze().cpu().numpy()
    return cls_embedding

text = "[SSUB] Urgent Account Alert [ESUB] [SBODY] Click [LINK] to verify your account. [EBODY]"
embedding = get_cls_embedding(text, model, tokenizer, device)

print("Embedding shape:", embedding.shape)
print("First 10 dimensions:", embedding[:10])

Training Details

Training Data

The model was trained using well-known phishing and email security datasets, including CEAS, combined with additional curated CSV sources.

Data Preprocessing

  1. Cleaned and merged multiple CSV datasets
  2. Replaced email-specific entities:
    • URLs → [LINK]
    • Phone numbers → [PHONE]
  3. Combined subject and body using structural tokens (see the sketch below):
    • [SSUB], [ESUB], [SBODY], [EBODY]
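
A minimal sketch of this preprocessing, assuming simple regex patterns for URLs and phone numbers; the exact patterns used during training are not documented here:

import re

def preprocess_email(subject, body):
    def normalize(text):
        # Illustrative patterns only; the training pipeline may differ.
        text = re.sub(r"https?://\S+|www\.\S+", "[LINK]", text)
        text = re.sub(r"\+?\d[\d\s\-().]{7,}\d", "[PHONE]", text)
        return text
    return f"[SSUB] {normalize(subject)} [ESUB] [SBODY] {normalize(body)} [EBODY]"

print(preprocess_email(
    "Urgent Account Alert",
    "Call +1 800 555 0199 or visit http://example.com/verify now."
))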

Training Hyperparameters

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./distilbert_safe_suspicious",
    eval_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=50,
    save_total_limit=3,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=4e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    num_train_epochs=4,
    weight_decay=0.01,
    logging_strategy="steps",
    logging_steps=50,
    seed=42,
)
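
These arguments plug into the standard Trainer API. A minimal sketch, assuming train_ds and eval_ds are pre-tokenized datasets with input_ids, attention_mask, and labels columns:

from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,   # assumed: tokenized training split
    eval_dataset=eval_ds,     # assumed: tokenized evaluation split
)
trainer.train()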

Evaluation

Evaluation Metrics

  • Accuracy
  • F1 Score

Testing Setup

  • 10% held-out test split from the full dataset

Results

  • DistilBERT (standalone): Strong classification performance
  • DistilBERT embeddings + XGBoost + URL features: 99.4% accuracy
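
The combined pipeline feeds the CLS embeddings, alongside hand-crafted URL features, into an XGBoost classifier. A hypothetical sketch of that assembly, reusing get_cls_embedding from the example above; the original URL feature set, data splits, and hyperparameters are not documented here, so this does not reproduce the 99.4% figure:

import numpy as np
import xgboost as xgb

def build_features(texts, url_counts):
    # CLS embeddings stacked with a simple per-email URL count
    # (illustrative; the original feature set is undocumented).
    embeddings = np.vstack([
        get_cls_embedding(t, model, tokenizer, device) for t in texts
    ])
    return np.hstack([embeddings, np.asarray(url_counts).reshape(-1, 1)])

X_train = build_features(train_texts, train_url_counts)  # assumed splits
clf = xgb.XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1)
clf.fit(X_train, train_labels)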

Technical Specifications

Model Architecture

  • DistilBERT encoder
  • Sequence classification head
  • CLS-token embedding extraction supported

Compute Infrastructure

  • Hardware: NVIDIA T4 GPU
  • Frameworks: PyTorch, Hugging Face Transformers

Environmental Impact

Carbon emissions were not explicitly measured. Users may estimate emissions with the Machine Learning Impact calculator (https://mlco2.github.io/impact) if needed.


Model Card Authors

  • Atharva Gaykar

Contact

For questions, feedback, or research collaboration, please reach out via the Hugging Face model repository.

