Upload folder using huggingface_hub

Files changed (13)

README.md +215 -0
adaboost.pkl +3 -0
decision_tree.pkl +3 -0
extra_trees_classifier.pkl +3 -0
gaussian_naive_bayes.pkl +3 -0
gradient_boosting_classifier.pkl +3 -0
label_encoder.pkl +3 -0
logistic_regression.pkl +3 -0
mlp_classifier.pkl +3 -0
random_forest.pkl +3 -0
scaler.pkl +3 -0
sgd_classifier.pkl +3 -0
xgboost.pkl +3 -0

README.md ADDED

	@@ -0,0 +1,215 @@

+# Malicious URL Detection Models
+This directory contains trained machine learning models for detecting malicious URLs. Each model classifies a URL into one of four categories:
+- **benign**
+- **defacement**
+- **malware**
+- **phishing**
+## Model Performance Summary
+The table below lists each model's accuracy on the test set (130,239 URLs), sorted from highest to lowest:
+| Model | Accuracy |
+|-------|----------|
+| **Extra Trees Classifier** | **97%** |
+| **Random Forest** | **97%** |
+| **Decision Tree** | **96%** |
+| **MLP Classifier** | **96%** |
+| **XGBoost** | **96%** |
+| **Gradient Boosting Classifier** | **94%** |
+| **Logistic Regression** | **87%** |
+| **SGD Classifier** | **87%** |
+| **AdaBoost** | **85%** |
+| **Gaussian Naive Bayes** | **80%** |
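
Accuracy is easier to interpret next to the class balance of the test set. As a rough point of reference, the sketch below takes the per-class support counts from the classification reports in this README and computes each class's share of the test set, together with the accuracy that a trivial classifier would reach by always predicting the most common class. The variable names are illustrative and not part of the repository.

```python
# Per-class support counts copied from the classification reports in this README.
support = {
    "benign": 85778,
    "defacement": 19104,
    "malware": 6521,
    "phishing": 18836,
}

total = sum(support.values())
shares = {label: count / total for label, count in support.items()}

# A classifier that always predicts the most common class scores this accuracy.
majority_label = max(support, key=support.get)
baseline_accuracy = shares[majority_label]

for label, share in shares.items():
    print(f"{label:<12}{share:6.1%}")
print(f"majority-class baseline ({majority_label}): {baseline_accuracy:.1%}")
```

With these counts, benign URLs make up roughly two thirds of the test set, so the accuracies in the table are best compared against that baseline rather than against zero.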
+## Detailed Performance Reports
+### AdaBoost
+- **Accuracy:** 0.85
+- **Report:**
+```
+              precision    recall  f1-score   support
+      benign       0.90      0.97      0.93     85778
+  defacement       0.82      0.76      0.79     19104
+     malware       0.55      0.74      0.63      6521
+    phishing       0.68      0.42      0.52     18836
+    accuracy                           0.85    130239
+   macro avg       0.74      0.72      0.72    130239
+weighted avg       0.84      0.85      0.84    130239
+```
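
The `macro avg` and `weighted avg` rows in these reports differ because the first averages the per-class scores equally, while the second weights each class by its support. A minimal sketch, using the rounded recall values from the AdaBoost report above, recomputes both; `macro_avg` and `weighted_avg` are hypothetical helper names, not functions from any library.

```python
# (recall, support) per class, copied from the AdaBoost report.
adaboost_recall = {
    "benign": (0.97, 85778),
    "defacement": (0.76, 19104),
    "malware": (0.74, 6521),
    "phishing": (0.42, 18836),
}


def macro_avg(rows):
    """Unweighted mean of the per-class scores."""
    return sum(score for score, _ in rows.values()) / len(rows)


def weighted_avg(rows):
    """Mean of the per-class scores, weighted by class support."""
    total = sum(count for _, count in rows.values())
    return sum(score * count for score, count in rows.values()) / total


print(f"macro recall:    {macro_avg(adaboost_recall):.2f}")
print(f"weighted recall: {weighted_avg(adaboost_recall):.2f}")
```

Because the inputs are already rounded to two decimals, the results only approximate what the original evaluation computed. The support-weighted recall is, by definition, the overall accuracy, which is why it lands on the same 0.85, while the macro average is pulled down by the low phishing recall.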
+### Decision Tree
+- **Accuracy:** 0.96
+- **Report:**
+```
+              precision    recall  f1-score   support
+      benign       0.97      0.98      0.98     85778
+  defacement       0.98      0.99      0.98     19104
+     malware       0.95      0.94      0.95      6521
+    phishing       0.87      0.85      0.86     18836
+    accuracy                           0.96    130239
+   macro avg       0.95      0.94      0.94    130239
+weighted avg       0.96      0.96      0.96    130239
+```
+### Extra Trees Classifier
+- **Accuracy:** 0.97
+- **Report:**
+```
+              precision    recall  f1-score   support
+      benign       0.97      0.98      0.98     85778
+  defacement       0.98      0.99      0.99     19104
+     malware       0.98      0.94      0.96      6521
+    phishing       0.91      0.86      0.88     18836
+    accuracy                           0.97    130239
+   macro avg       0.96      0.95      0.95    130239
+weighted avg       0.97      0.97      0.97    130239
+```
+### Gaussian Naive Bayes
+- **Accuracy:** 0.80
+- **Report:**
+```
+              precision    recall  f1-score   support
+      benign       0.86      0.90      0.88     85778
+  defacement       0.67      0.99      0.80     19104
+     malware       0.63      0.69      0.66      6521
+    phishing       0.68      0.19      0.29     18836
+    accuracy                           0.80    130239
+   macro avg       0.71      0.69      0.66    130239
+weighted avg       0.80      0.80      0.77    130239
+```
+### Gradient Boosting Classifier
+- **Accuracy:** 0.94
+- **Report:**
+```
+              precision    recall  f1-score   support
+      benign       0.96      0.99      0.97     85778
+  defacement       0.92      0.97      0.94     19104
+     malware       0.94      0.80      0.87      6521
+    phishing       0.89      0.78      0.83     18836
+    accuracy                           0.94    130239
+   macro avg       0.93      0.88      0.90    130239
+weighted avg       0.94      0.94      0.94    130239
+```
+### Logistic Regression
+- **Accuracy:** 0.87
+- **Report:**
+```
+              precision    recall  f1-score   support
+      benign       0.89      0.97      0.93     85778
+  defacement       0.85      0.95      0.90     19104
+     malware       0.81      0.69      0.74      6521
+    phishing       0.77      0.42      0.55     18836
+    accuracy                           0.87    130239
+   macro avg       0.83      0.76      0.78    130239
+weighted avg       0.87      0.87      0.86    130239
+```
+### MLP Classifier
+- **Accuracy:** 0.96
+- **Report:**
+```
+              precision    recall  f1-score   support
+      benign       0.97      0.98      0.98     85778
+  defacement       0.97      0.97      0.97     19104
+     malware       0.95      0.90      0.92      6521
+    phishing       0.88      0.83      0.86     18836
+    accuracy                           0.96    130239
+   macro avg       0.94      0.92      0.93    130239
+weighted avg       0.96      0.96      0.96    130239
+```
+### Random Forest
+- **Accuracy:** 0.97
+- **Report:**
+```
+              precision    recall  f1-score   support
+      benign       0.98      0.98      0.98     85778
+  defacement       0.98      0.99      0.99     19104
+     malware       0.98      0.94      0.96      6521
+    phishing       0.91      0.87      0.89     18836
+    accuracy                           0.97    130239
+   macro avg       0.96      0.95      0.95    130239
+weighted avg       0.97      0.97      0.97    130239
+```
+### SGD Classifier
+- **Accuracy:** 0.87
+- **Report:**
+```
+              precision    recall  f1-score   support
+      benign       0.89      0.96      0.93     85778
+  defacement       0.83      0.95      0.89     19104
+     malware       0.79      0.71      0.75      6521
+    phishing       0.74      0.40      0.52     18836
+    accuracy                           0.87    130239
+   macro avg       0.81      0.76      0.77    130239
+weighted avg       0.86      0.87      0.85    130239
+```
+### XGBoost
+- **Accuracy:** 0.96
+- **Report:**
+```
+              precision    recall  f1-score   support
+      benign       0.97      0.99      0.98     85778
+  defacement       0.97      0.99      0.98     19104
+     malware       0.98      0.92      0.95      6521
+    phishing       0.91      0.84      0.88     18836
+    accuracy                           0.96    130239
+   macro avg       0.96      0.93      0.95    130239
+weighted avg       0.96      0.96      0.96    130239
+```
+## Usage
+Load a model in Python with either `joblib` or `pickle`.
+### Using joblib
+```python
+import joblib
+# Load the model
+model = joblib.load('models/random_forest.pkl')
+# Predict class labels; X_test is a feature matrix prepared the same way as the training data
+prediction = model.predict(X_test)
+```
+### Using pickle
+```python
+import pickle
+# Load the model
+with open('models/random_forest.pkl', 'rb') as f:
+    model = pickle.load(f)
+# Predict class labels; X_test is a feature matrix prepared the same way as the training data
+prediction = model.predict(X_test)
+```
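
This commit also adds `scaler.pkl` and `label_encoder.pkl`, which the usage snippets above do not touch. The sketch below shows one plausible way the pieces fit together. It assumes, without confirmation from the README, that the models were trained on features transformed by the scaler and on labels encoded by the label encoder. `load_artifacts` and `classify` are hypothetical helper names, the `models` directory is taken from the snippets above, and the feature extraction that turns a URL into a numeric row is not documented here and is left to the caller.

```python
import pickle
from pathlib import Path


def load_artifacts(model_name, model_dir="models"):
    """Load one model plus the scaler and label encoder from model_dir."""
    directory = Path(model_dir)
    names = (f"{model_name}.pkl", "scaler.pkl", "label_encoder.pkl")
    artifacts = []
    for name in names:
        with open(directory / name, "rb") as f:
            artifacts.append(pickle.load(f))
    return tuple(artifacts)


def classify(features, model, scaler, label_encoder):
    """Scale a 2-D feature matrix, predict, and map predictions to class names.

    Assumption: the model expects scaler-transformed features and returns
    labels in the encoding that label_encoder can invert.
    """
    scaled = scaler.transform(features)
    encoded = model.predict(scaled)
    return list(label_encoder.inverse_transform(encoded))


# Example (needs the .pkl files and scikit-learn installed):
# model, scaler, encoder = load_artifacts("random_forest")
# labels = classify(X_test, model, scaler, encoder)
```

If the models turn out to predict string labels directly, the `inverse_transform` step is unnecessary and `model.predict` can be used on its own, as in the README snippets.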

adaboost.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:55ccb71299f2573836929219945e1b244a9c09ffce9b66ab0fc70a5beaffcdd4
+size 36628

decision_tree.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8f9ada3533633eb37065776f66800ed024b43093ff3d2cd4b82b1a7617976fd6
+size 4526681

extra_trees_classifier.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fcab25db6b4e0a7f49bfd5678dac9e566f722b865172fd3d8999a4fd6c8e9cce
+size 852962137

gaussian_naive_bayes.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8ddcd8cfd42f2b0ea6d23df58325e8035a7e3fcfccfb402d181e90d61b44c241
+size 2551

gradient_boosting_classifier.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:44c1cdb5d3d4cea5ca739afda082edecb8e792c566603aa9c117ec4ef8cff8a5
+size 530569

label_encoder.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1f1afcf02394ae87fa6c1c299b874785530ece909b53caa0e5e70e9355a041c7
+size 516

logistic_regression.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:45088fc709d21bec582de00df39a8416663087d2d363cd9d4998527b85cda631
+size 1767

mlp_classifier.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1d7acc5b2ee69c411e760b0cdf203e55454e64aff6d621d4ccc00fe1e57c73fe
+size 113424

random_forest.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1eec6d2fd709214916e65577cdb61de235a5051e8fb2b3755cd616e012e4e1f0
+size 434828169

scaler.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bd0bf48348e7e28ace5be28f02f11450435f3ec472e47ad4e98576b903e41632
+size 1791

sgd_classifier.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0ac2dd599bbc464c98901bde157d8a7b8873d5e65f0828835d76abdf11c19126
+size 2013

xgboost.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fda61cdd809c4edeefe573e33b7587d2682a9c572c46e0a887438a652fdbd456
+size 1375414
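
Each `.pkl` entry in this diff is a Git LFS pointer, not the model itself: it records the spec version, a SHA-256 object ID, and the size in bytes of the real file. As a sketch of how a downloaded file could be checked against its pointer, the code below parses the three pointer lines and compares size and digest. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the parser tolerates the leading `+` that the diff view adds to each line.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip().lstrip("+")
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file has the size and SHA-256 digest the pointer records."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so that large model files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

Chunked reading matters here because the sizes above range from a few kilobytes up to several hundred megabytes for `extra_trees_classifier.pkl` and `random_forest.pkl`.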