---
datasets:
- zefang-liu/phishing-email-dataset
language:
- en
base_model:
- google-bert/bert-base-uncased
library_name: transformers
tags:
- phishing
- email
- detection
- scam
---

# BERT Model for Phishing Detection

This repository contains a **BERT model** fine-tuned to detect phishing emails. It classifies each email as either **phishing** or **legitimate** based on its body text.

## Model Details

- **Model Type**: BERT (Bidirectional Encoder Representations from Transformers)
- **Task**: Phishing detection (binary classification)
- **Fine-Tuning**: Fine-tuned from `google-bert/bert-base-uncased` on `zefang-liu/phishing-email-dataset`, a dataset of phishing and legitimate emails.

## How to Use

1. **Install Dependencies**:

   Install the required libraries with:

   ```bash
   pip install transformers torch
   ```

2. **Load Model**:

   ```python
   from transformers import BertForSequenceClassification, BertTokenizer
   import torch

   # Hugging Face model repo name (replace with your own repo if needed)
   model_name = 'ElSlay/BERT-Phishing-Email-Model'

   # Load the fine-tuned model and its tokenizer
   model = BertForSequenceClassification.from_pretrained(model_name)
   tokenizer = BertTokenizer.from_pretrained(model_name)

   # Put the model in evaluation mode (disables dropout)
   model.eval()
   ```

3. **Use the Model for Prediction**:

   ```python
   # Uses `model`, `tokenizer`, and `torch` from the previous step

   # Input email text
   email_text = "Your email content here"

   # Tokenize and preprocess the input text
   inputs = tokenizer(email_text, return_tensors="pt", truncation=True, padding='max_length', max_length=512)

   # Make the prediction without tracking gradients
   with torch.no_grad():
       outputs = model(**inputs)
       logits = outputs.logits
       predictions = torch.argmax(logits, dim=-1)

   # Interpret the prediction
   result = "Phishing" if predictions.item() == 1 else "Legitimate"
   print(f"Prediction: {result}")
   ```

4. **Expected Outputs**:

   - `1`: Phishing
   - `0`: Legitimate
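
If a confidence score is more useful than a bare label, the two logits can be turned into a phishing probability with a softmax. The following is a minimal sketch in plain Python, not part of the model itself: the helper names `softmax` and `classify` and the `threshold` parameter are hypothetical, and the example logits are made up. It assumes the label mapping given above (1 = Phishing, 0 = Legitimate).

```python
import math

# Label mapping as documented for this model: 1 = Phishing, 0 = Legitimate.
LABELS = {0: "Legitimate", 1: "Phishing"}


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.5):
    """Return (label, phishing_probability) for one pair of logits.

    `threshold` is a hypothetical knob for this sketch, not a model setting.
    """
    probs = softmax(logits)
    phishing_prob = probs[1]
    label = LABELS[1] if phishing_prob >= threshold else LABELS[0]
    return label, phishing_prob


# Made-up logits for illustration; with the real model you could pass
# `outputs.logits[0].tolist()` from the prediction step instead.
label, prob = classify([-1.2, 2.3])
print(f"Prediction: {label} (p={prob:.3f})")
```

With the default threshold of 0.5 this gives the same decision as the argmax in the prediction step, apart from exact ties. Raising the threshold trades missed phishing emails for fewer false alarms, so any other value should be checked against your own validation data.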