Email Phishing Detector V2

This model is fine-tuned for email phishing detection. It classifies emails as phishing (1) or safe (0).

Model Description

The base model is microsoft/deberta-v3-base, fine-tuned as a binary sequence classifier with a maximum input length of 512 tokens.

Training Details

Base Model: microsoft/deberta-v3-base
Training Samples: 37395
Validation Samples: 7479
Test Samples: 4986
Epochs: 5
Batch Size: 14
Learning Rate: 2e-05
Max Length: 512
Loss Type: focal
LR Scheduler: cosine
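
The loss type above is listed as focal, but the card does not give its hyperparameters. As a rough illustration only, here is a minimal sketch of the standard focal loss for a single example, written with the standard library. The focusing parameter `gamma=2.0` is an assumed common default, not a value taken from this card, and no class weighting is applied. The function name `focal_loss` is hypothetical.

```python
import math


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)**gamma * log(p_t).

    logits: list of raw class scores, e.g. [safe_logit, phishing_logit]
    target: index of the true class (0 = safe, 1 = phishing)
    gamma:  focusing parameter (assumed value, not from the model card)
    """
    # Numerically stable log-softmax for the true class.
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    log_pt = logits[target] - log_norm
    pt = math.exp(log_pt)
    # The (1 - p_t)**gamma factor shrinks the loss of well-classified examples.
    return -((1.0 - pt) ** gamma) * log_pt


# A confident correct prediction contributes far less than an uncertain one.
print(focal_loss([4.0, -4.0], target=0))
print(focal_loss([0.0, 0.0], target=0))
```

With `gamma=0.0` the expression reduces to ordinary cross-entropy, which is a convenient sanity check when comparing the two losses.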

Additional Training Parameters

Augmentation: False

Evaluation Results

Test Set Metrics

Loss: 0.0044
Accuracy: 0.9966
F1: 0.9970
Precision: 0.9975
Recall: 0.9965
ROC AUC: 1.0000
MCC: 0.9930
True Positives: 2836
True Negatives: 2133
False Positives: 7
False Negatives: 10
Runtime: 94.4784 s
Samples Per Second: 52.7740
Steps Per Second: 3.7790
Epoch: 5
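
The accuracy, precision, recall, F1, and MCC figures above follow directly from the four confusion-matrix counts. The sketch below shows that arithmetic using the standard definitions; the helper name `confusion_metrics` is hypothetical, and this is not necessarily the code used to produce the reported numbers.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Derive standard binary-classification metrics from confusion counts."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Matthews correlation coefficient.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Test-set counts from the list above (phishing is the positive class).
test_metrics = confusion_metrics(tp=2836, tn=2133, fp=7, fn=10)
for name, value in test_metrics.items():
    print(f"{name}: {value:.4f}")
```

The four counts sum to 4986, which matches the test-set size given under Training Details.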

Validation Set Metrics

Loss: 0.0050
Accuracy: 0.9965
F1: 0.9969
Precision: 0.9986
Recall: 0.9953
ROC AUC: 0.9999
MCC: 0.9929
True Positives: 4248
True Negatives: 3205
False Positives: 6
False Negatives: 20
Runtime: 112.4284 s
Samples Per Second: 66.5220
Steps Per Second: 4.7590
Epoch: 5

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "nhellyercreek/email-phishing-detector-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example inference
text = "Your email or URL text here"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
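with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

# Get prediction (index 1 = phishing, index 0 = safe)
predicted_class = predictions.argmax(dim=-1).item()
confidence = predictions[0][predicted_class].item()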

print(f"Predicted class: {predicted_class} (phishing=1, safe=0)")
print(f"Confidence: {confidence:.4f}")
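
The snippet above picks whichever class has the higher probability, which amounts to a 0.5 threshold on the phishing score. If false positives and false negatives carry different costs, an explicit threshold may be useful. The following is a minimal standard-library sketch, assuming the logits are ordered `[safe, phishing]` as stated above; the helper names and the 0.9 threshold are hypothetical and should be tuned on your own validation data.

```python
import math


def phishing_probability(logits):
    """Softmax probability of class 1 (phishing) from [safe, phishing] logits."""
    safe_logit, phishing_logit = logits
    diff = phishing_logit - safe_logit
    # A two-class softmax equals a sigmoid of the logit difference.
    # Branch on the sign so math.exp never receives a large positive value.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    exp_diff = math.exp(diff)
    return exp_diff / (1.0 + exp_diff)


def classify(logits, threshold=0.5):
    """Return (label, probability); label is 1 when probability >= threshold."""
    probability = phishing_probability(logits)
    label = 1 if probability >= threshold else 0
    return label, probability


# Illustrative logits only; in practice use outputs.logits[0].tolist().
print(classify([0.3, 1.3]))                 # default 0.5 threshold
print(classify([0.3, 1.3], threshold=0.9))  # stricter, hypothetical threshold
```

Raising the threshold trades recall for precision, so fewer legitimate emails are flagged at the cost of missing more phishing attempts.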

Limitations

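This model was trained on a specific set of emails and may not generalize to every type of phishing attempt. Inputs longer than 512 tokens are truncated, so the model does not see any content past that point. Treat its output as one signal among several rather than as the only security control in a production environment.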

Citation

If you use this model, please cite:

@misc{nhellyercreek_email_phishing_detector_v2,
  title={Email Phishing Detector V2},
  author={Your Name},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/nhellyercreek/email-phishing-detector-v2}}
}

Model Size: 0.2B params
Tensor Type: F32
Weights Format: Safetensors