Phishing BERT Model

A fine-tuned BERT (bert-base-uncased) classifier for phishing email detection, developed as part of the NTUST MI5137701 course term project (Group 72).

Model Description

This model classifies email text into two categories:

  • phishing email (label 1)
  • safe email (label 0)

The model is trained on body-only email text (headers stripped) so that it learns genuine phishing indicators rather than dataset artifacts.
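One way to obtain body-only text is with Python's standard `email` package. The sketch below is not the preprocessing used for training, which the card does not describe; the helper name `extract_body` is hypothetical, and it assumes the input is a raw RFC 5322 message with a plain-text part.

```python
from email import message_from_string, policy


def extract_body(raw_email: str) -> str:
    """Return the plain-text body of a raw email, dropping all headers."""
    # policy.default yields an EmailMessage, which provides get_body().
    msg = message_from_string(raw_email, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    if body_part is None:
        # No text/plain part was found (for example, an HTML-only message).
        return ""
    return body_part.get_content().strip()


raw = "From: a@example.com\nSubject: Hi\n\nPlease verify your account.\n"
print(extract_body(raw))
```

HTML-only messages return an empty string here; converting HTML to text would need an extra step that this sketch leaves out.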

Training Details

| Parameter | Value |
| --- | --- |
| Base model | bert-base-uncased (110M params) |
| Architecture | BertForSequenceClassification |
| Labels | 2 (phishing email, safe email) |
| Learning rate | 2e-5 |
| Optimizer | AdamW |
| Loss | Cross-entropy |
| Epochs | 3 |
| Max sequence length | 512 tokens |
| Training dataset | drorrabin/phishing_emails-data (26,946 emails, 50/50 split) |
| Random seed | 42 |
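As a rough illustration, the listed hyperparameters could map onto the Hugging Face `transformers` API as follows. This is a sketch, not the project's training script: `HPARAMS`, `build_training_setup`, and the `output_dir` value are hypothetical names, and settings the table does not give (batch size, warmup, weight decay) are left at library defaults. With those defaults, `Trainer` uses AdamW and a two-label sequence classifier with integer labels uses cross-entropy loss.

```python
# Values copied from the Training Details table; nothing else is assumed.
HPARAMS = {
    "base_model": "bert-base-uncased",
    "num_labels": 2,
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
    "max_length": 512,
    "seed": 42,
}


def build_training_setup(output_dir: str = "phishing-bert-out"):
    """Return (tokenizer, model, TrainingArguments) for the listed settings."""
    # Imported lazily so HPARAMS can be inspected without transformers installed.
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(HPARAMS["base_model"])
    model = AutoModelForSequenceClassification.from_pretrained(
        HPARAMS["base_model"], num_labels=HPARAMS["num_labels"]
    )
    args = TrainingArguments(
        output_dir=output_dir,  # hypothetical path
        learning_rate=HPARAMS["learning_rate"],
        num_train_epochs=HPARAMS["num_train_epochs"],
        seed=HPARAMS["seed"],
    )
    return tokenizer, model, args
```

The 512-token limit would be applied at tokenization time, for example `tokenizer(text, truncation=True, max_length=HPARAMS["max_length"])`.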

Evaluation Results (Cross-Corpus)

All results are computed on a disjoint test corpus (Phishing_Email.csv, 18,460 emails) that has no overlap with the training data.

BERT Standalone

| Metric | Value |
| --- | --- |
| Accuracy | 80.71% |
| F1 (macro) | 71.22% |
| Recall | 60.43% |
| Precision | 86.70% |
| ROC-AUC | 0.919 |
| ECE | 0.178 |
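For readers unfamiliar with ECE (expected calibration error), the sketch below shows one common formulation: predictions are grouped into equal-width confidence bins, and the gap between mean confidence and accuracy in each bin is averaged, weighted by bin size. The bin count of 10 is an assumption; the card does not state how the reported value was computed.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence|."""
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(h for _, h in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(accuracy - avg_conf)
    return ece


# Toy data: four predictions at 0.9 confidence, three of them correct.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

Here `confidences` is the probability of the predicted class and `correct` is 1 when the prediction matched the true label.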

Cascade (BERT + Qwen 2.5 7B CoT)

| Metric | Value |
| --- | --- |
| Accuracy | 81.07% |
| F1 (macro) | 71.94% |
| Recall | 61.45% |
| Precision | 86.75% |
| Escalation rate | 1.21% |
| Amortized latency | ~0.36s/email |
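The card does not describe how emails are selected for escalation. The sketch below shows one plausible design, purely as an assumption: the fast classifier handles every email, and only predictions below a confidence threshold are passed to the slower second-stage model. The function names, the `threshold` value, and the latency figures in the example are all hypothetical.

```python
from typing import Callable, Tuple


def cascade_predict(
    text: str,
    fast_model: Callable[[str], Tuple[int, float]],
    slow_model: Callable[[str], int],
    threshold: float = 0.6,  # hypothetical; not taken from the source
) -> Tuple[int, bool]:
    """Return (label, escalated). Escalate only low-confidence predictions."""
    label, confidence = fast_model(text)
    if confidence >= threshold:
        return label, False
    return slow_model(text), True


def amortized_latency(fast_s: float, slow_s: float, escalation_rate: float) -> float:
    """Every email pays the fast cost; the escalated fraction also pays the slow cost."""
    return fast_s + escalation_rate * slow_s


# Stub models stand in for BERT and the LLM stage.
label, escalated = cascade_predict(
    "Click here to verify.", lambda t: (1, 0.55), lambda t: 1
)
print(label, escalated)
print(amortized_latency(fast_s=0.1, slow_s=10.0, escalation_rate=0.0121))
```

Under this kind of design, a low escalation rate keeps the average cost per email close to that of the fast model alone, even when the second stage is much slower.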

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("harrynguyen5/phishing-bert-model")
model = AutoModelForSequenceClassification.from_pretrained("harrynguyen5/phishing-bert-model")

# Pass body text only; inputs longer than 512 tokens are truncated.
email_text = "Dear user, your account has been compromised. Click here to verify."
inputs = tokenizer(email_text, return_tensors="pt", truncation=True, max_length=512, padding=True)

# Run inference without tracking gradients.
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    prediction = torch.argmax(probs, dim=-1).item()

# Map the predicted class index to its label name and report the softmax score.
label = model.config.id2label[prediction]
confidence = probs[0][prediction].item()
print(f"Prediction: {label} (confidence: {confidence:.4f})")

Important Notes

  • Body-only input: This model is trained on email body text only. Headers, subject lines, and metadata should be stripped before inference.
  • Cross-corpus evaluation: In-distribution accuracy (99.76%) is artificially inflated due to a dataset artifact (ceas-challenge header). The cross-corpus results above are the reliable benchmark.
  • Calibration: ECE = 0.178 means the model's confidence scores differ from observed accuracy by about 18 percentage points on average, so they should not be read as true probabilities. Post-hoc calibration is recommended before using confidence thresholds.
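The card does not name a calibration method. Temperature scaling is one common post-hoc option, sketched below under that assumption: a single scalar T divides the logits before softmax and is chosen to minimize negative log-likelihood on a held-out set. The function names and the candidate grid are illustrative, and the toy logits are made up.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logits_list, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [
        -math.log(softmax(logits, temperature)[y])
        for logits, y in zip(logits_list, labels)
    ]
    return sum(losses) / len(losses)


def fit_temperature(logits_list, labels, candidates=None):
    """Pick the temperature with the lowest NLL from a grid (0.5 to 5.0)."""
    if candidates is None:
        candidates = [0.5 + 0.05 * i for i in range(91)]
    return min(candidates, key=lambda t: nll(logits_list, labels, t))


# Toy held-out set: the model is very confident but wrong one time in four.
val_logits = [[4.0, 0.0]] * 4
val_labels = [0, 0, 0, 1]
T = fit_temperature(val_logits, val_labels)
print(T, softmax([4.0, 0.0], T))
```

Dividing logits by a positive temperature does not change which class has the highest score, so accuracy stays the same while confidence values are rescaled.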

Team

| Member | Student ID | Role |
| --- | --- | --- |
| Bui The Hien | M11409806 | Dataset & Fine-Tuning |
| Le Trung Kien | M11415803 | Evaluation & Benchmarking |
| Nguyen Quoc Nguyen | M11409814 | Prompt Pipeline & Logic |
