URL Phishing Classifier
This model is fine-tuned for URL phishing classification. It classifies URLs as phishing (1) or safe (0).
Model Description
This model is based on roberta-base and has been fine-tuned for phishing detection tasks.
Training Details
- Base Model: roberta-base
- Training Samples: 1629193
- Validation Samples: 325839
- Test Samples: 217226
- Epochs: 5
- Batch Size: 24
- Learning Rate: 2e-05
- Max Length: 256
Evaluation Results
Test Set Metrics
- Loss: 0.1483
- Accuracy: 0.9463
- F1: 0.9262
- Precision: 0.9259
- Recall: 0.9264
- ROC AUC: 0.9890
- True Positives: 73116
- True Negatives: 132450
- False Positives: 5851
- False Negatives: 5809
- Runtime (s): 142.5284
- Samples Per Second: 1524.0900
- Steps Per Second: 31.7550
- Epoch: 5
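The accuracy, precision, recall, and F1 figures above follow directly from the four confusion-matrix counts. The sketch below recomputes them from the test set numbers as a sanity check; the function name `confusion_metrics` is illustrative and not part of the model's code.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Derive standard binary classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)   # share of URLs flagged as phishing that really are phishing
    recall = tp / (tp + fn)      # share of phishing URLs that were flagged
    return {
        "total": total,
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Test set counts reported above
test_metrics = confusion_metrics(tp=73116, tn=132450, fp=5851, fn=5809)
for name, value in test_metrics.items():
    print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

The four counts sum to 217226, the reported number of test samples, and the rounded results agree with the listed accuracy, precision, recall, and F1.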
Validation Set Metrics
- Loss: 0.1483
- Accuracy: 0.9455
- F1: 0.9250
- Precision: 0.9246
- Recall: 0.9255
- ROC AUC: 0.9888
- True Positives: 109566
- True Negatives: 198511
- False Positives: 8940
- False Negatives: 8822
- Runtime (s): 195.9861
- Samples Per Second: 1662.5610
- Steps Per Second: 34.6400
- Epoch: 5
Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load model and tokenizer
model_name = "nhellyercreek/url-phishing-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # make sure dropout is disabled for inference
# Example inference on a single URL
text = "https://example.com/login"  # replace with the URL to classify
# max_length matches the 256-token limit listed under Training Details
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
# Convert logits to class probabilities
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
# Get prediction
predicted_class = predictions.argmax(dim=-1).item()
confidence = predictions[0][predicted_class].item()
print(f"Predicted class: {predicted_class} (phishing=1, safe=0)")
print(f"Confidence: {confidence:.4f}")
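The usage code above picks whichever class has the higher softmax probability. When false positives and false negatives carry different costs, a caller may prefer an explicit decision threshold on the phishing probability instead. The following is a minimal sketch of that idea in plain Python, assuming the two logits are ordered `[safe, phishing]` as in the label mapping above; the helper names and the 0.9 threshold are illustrative, not values tuned for this model.

```python
import math


def phishing_probability(logits):
    """Softmax probability of class 1 (phishing) given two raw logits [safe, phishing]."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    return exps[1] / sum(exps)


def classify(logits, threshold=0.5):
    """Return 1 (phishing) if the phishing probability reaches the threshold, else 0 (safe)."""
    return 1 if phishing_probability(logits) >= threshold else 0


# Hypothetical logits, as could be obtained from outputs.logits[0].tolist()
borderline = [0.0, 1.0]
print(classify(borderline))                 # default threshold of 0.5
print(classify(borderline, threshold=0.9))  # stricter threshold, fewer URLs flagged
```

Raising the threshold trades recall for precision, so any chosen value should be checked against a labeled validation set before use.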
Limitations
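This model was trained on specific datasets and may not generalize to phishing techniques or URL patterns that its training data does not cover. On the test set, precision and recall are both about 0.926, so roughly 7% of phishing URLs are missed and roughly 7% of URLs flagged as phishing are actually safe. Use the model alongside additional security measures in production environments, not as the only safeguard.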
Citation
If you use this model, please cite:
@misc{nhellyercreek_url_phishing_classifier,
title={Url Phishing Classifier},
author={Your Name},
year={2024},
publisher={Hugging Face},
howpublished={\url{https://huggingface.co/nhellyercreek/url-phishing-classifier}}
}