Perth0603/Random-Forest-Model-for-PhishingDetection

Perth0603 committed on Oct 1, 2025

Commit 3d48ccd

1 Parent(s): b0ecf99

Upload model_card.md with huggingface_hub

Files changed (1)

model_card.md ADDED

+---
+language: en
+license: other
+tags:
+  - security
+  - phishing-detection
+  - url-classification
+  - xgboost
+---
+# Random Forest / XGBoost Model for URL Phishing Detection
+## Model Details
+- Architecture: Gradient-boosted decision trees (XGBoost)
+- Input: Single URL string (no external queries)
+- Features: Lexical and structural URL features (lengths, symbol counts, digit ratio, IPv4 pattern, common phishing tokens, scheme/TLD heuristics)
+- Training data: `PhiUSIIL_Phishing_URL_Dataset.csv`
+- Intended use: Binary classification (phishing vs. legitimate)
+## Metrics (test set)
+- Accuracy: 0.9952
+- Precision: 0.9928
+- Recall: 0.9989
+- F1: 0.9958
+- ROC-AUC: 0.9976
+## Usage
+See `README.md` and `inference.py` for how to load the model and call `predict_url()`.
+## Limitations and Biases
+- URL-only features can be evaded by sophisticated attackers.
+- Dataset shift and previously unseen TLDs may degrade performance.
+- Always validate on your own traffic before deployment.
+## License
+Provided for research/educational purposes. Ensure compliance with local laws and organizational policies.
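
The card's "Features" line lists lexical and structural signals computed from the URL string alone. The sketch below illustrates what such a feature extractor could look like. It is not the code from this repository: the function name `extract_url_features`, the feature names, and the token and TLD lists are all hypothetical and were picked only for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical token and TLD lists, chosen only for illustration;
# the model card does not say which values the real model uses.
SUSPICIOUS_TOKENS = ("login", "verify", "secure", "account", "update")
SUSPICIOUS_TLDS = ("zip", "tk", "xyz")
IPV4_RE = re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$")


def extract_url_features(url: str) -> dict:
    """Compute a few lexical/structural features from the URL string only."""
    # Add a scheme when missing so urlsplit can locate the host part.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    lowered = url.lower()
    digits = sum(ch.isdigit() for ch in url)
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_at": url.count("@"),
        "digit_ratio": digits / len(url) if url else 0.0,
        "host_is_ipv4": int(bool(IPV4_RE.match(host))),
        "num_suspicious_tokens": sum(tok in lowered for tok in SUSPICIOUS_TOKENS),
        "is_https": int(parts.scheme == "https"),
        "suspicious_tld": int(host.rsplit(".", 1)[-1] in SUSPICIOUS_TLDS),
    }


if __name__ == "__main__":
    for name, value in extract_url_features("http://192.168.0.1/login-verify").items():
        print(f"{name}: {value}")
```

A dictionary like this would typically be converted to a fixed-order numeric vector before being passed to a tree-based classifier. The actual feature set and ordering used by the model are defined in the repository's `inference.py`, not here.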