Email Phishing Detector V2

This model is fine-tuned for email phishing detection. It classifies emails as phishing (1) or safe (0).

Model Description

This model is microsoft/deberta-v3-base fine-tuned as a binary sequence classifier: given the text of an email, it outputs a score for each of the two classes (safe and phishing).

Training Details

  • Base Model: microsoft/deberta-v3-base
  • Training Samples: 37395
  • Validation Samples: 7479
  • Test Samples: 4986
  • Epochs: 5
  • Batch Size: 14
  • Learning Rate: 2e-05
  • Max Length: 512
  • Loss Type: focal
  • LR Scheduler: cosine
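
The "Loss Type: focal" entry above refers to focal loss, which down-weights examples the model already classifies confidently so that training concentrates on the hard ones. The following is a minimal sketch of the standard formulation, not the training code used for this model: the function name `focal_loss` is hypothetical, and the `gamma` and `alpha` values are assumptions, since the card does not state them.

```python
import torch
import torch.nn.functional as F


def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """Focal loss for 2-class logits of shape (batch, 2).

    gamma and alpha are illustrative defaults; the values used to
    train this model are not documented in the card.
    """
    # Per-sample cross-entropy, i.e. -log(p_t) for the true class.
    ce = F.cross_entropy(logits, targets, reduction="none")
    # Recover p_t, the predicted probability of the true class.
    p_t = torch.exp(-ce)
    # Scale each sample by (1 - p_t)^gamma: easy samples contribute less.
    loss = (1.0 - p_t) ** gamma * ce
    if alpha is not None:
        # Optional class weighting: alpha for class 1, (1 - alpha) for class 0.
        t = targets.float()
        loss = (alpha * t + (1.0 - alpha) * (1.0 - t)) * loss
    return loss.mean()


# Toy batch: two examples, labels 0 (safe) and 1 (phishing).
logits = torch.tensor([[2.0, -1.0], [0.5, 1.5]])
targets = torch.tensor([0, 1])
print(focal_loss(logits, targets).item())
```

With `gamma=0` and `alpha=None` the function reduces to ordinary cross-entropy, which is a convenient sanity check.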

Additional Training Parameters

  • Augmentation: False

Evaluation Results

Test Set Metrics

  • Loss: 0.0044
  • Accuracy: 0.9966
  • F1: 0.9970
  • Precision: 0.9975
  • Recall: 0.9965
  • ROC AUC: 1.0000
  • MCC: 0.9930
  • True Positives: 2836
  • True Negatives: 2133
  • False Positives: 7
  • False Negatives: 10
  • Runtime: 94.4784 s
  • Samples Per Second: 52.7740
  • Steps Per Second: 3.7790
  • Epoch: 5
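
The threshold-based metrics above follow directly from the four confusion-matrix counts. As a cross-check, the sketch below recomputes them with the standard definitions; the helper name `binary_metrics` is hypothetical and is not part of the model's evaluation code.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Derive standard binary-classification metrics from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Counts taken from the test set metrics listed above.
test_metrics = binary_metrics(tp=2836, tn=2133, fp=7, fn=10)
for name, value in test_metrics.items():
    print(f"{name}: {value:.4f}")
```

Rounded to four decimals, these agree with the reported accuracy, precision, recall, F1, and MCC. The counts also sum to 4986, the stated number of test samples.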

Validation Set Metrics

  • Loss: 0.0050
  • Accuracy: 0.9965
  • F1: 0.9969
  • Precision: 0.9986
  • Recall: 0.9953
  • ROC AUC: 0.9999
  • MCC: 0.9929
  • True Positives: 4248
  • True Negatives: 3205
  • False Positives: 6
  • False Negatives: 20
  • Runtime: 112.4284 s
  • Samples Per Second: 66.5220
  • Steps Per Second: 4.7590
  • Epoch: 5

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "nhellyercreek/email-phishing-detector-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example inference
text = "Your email or URL text here"
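inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
model.eval()  # make sure dropout is disabled for inference
with torch.no_grad():  # gradients are not needed at inference time
    outputs = model(**inputs)
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)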

# Get prediction
predicted_class = predictions.argmax().item()
confidence = predictions[0][predicted_class].item()

print(f"Predicted class: {predicted_class} (phishing=1, safe=0)")
print(f"Confidence: {confidence:.4f}")
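
The snippet above takes the argmax, which for two classes amounts to a 0.5 cutoff on the phishing probability. If false alarms and missed phishing emails carry different costs, an explicit threshold may be preferable. The sketch below works on a batch of logits and assumes, as stated in this card, that index 1 is the phishing class; the function name `decide` and the 0.9 threshold are hypothetical choices for illustration, not tuned values.

```python
import torch


def decide(logits, threshold=0.5):
    """Turn (batch, 2) logits into labels using a cutoff on P(phishing)."""
    probs = torch.softmax(logits, dim=-1)
    phishing_probs = probs[:, 1]  # index 1 = phishing, per the model card
    return [
        {"label": int(p >= threshold), "phishing_prob": float(p)}
        for p in phishing_probs
    ]


# Stand-in logits; in practice pass outputs.logits from the model.
example_logits = torch.tensor([[0.0, 0.0], [3.0, -3.0], [-3.0, 3.0]])
for result in decide(example_logits, threshold=0.9):
    print(result)
```

To score several emails at once, pass a list of strings to the tokenizer with `padding=True` and feed the resulting `outputs.logits` to the function.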

Limitations

This model was trained on a fixed set of datasets and may not generalize to phishing attempts that differ from them in style or content. In production, treat its prediction as one signal and combine it with additional security measures.

Citation

If you use this model, please cite:

@misc{nhellyercreek_email_phishing_detector_v2,
  title={Email Phishing Detector V2},
  author={Your Name},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/nhellyercreek/email-phishing-detector-v2}}
}