---
title: NSL-KDD Anomaly Detection
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: streamlit
app_port: 8501
pinned: false
tags:
  - anomaly-detection
  - machine-learning
  - streamlit
  - cybersecurity
---

# 🧠 NSL-KDD Anomaly Detection (Isolation Forest vs One-Class SVM vs Logistic Regression)

This Streamlit web app performs **network intrusion detection** using the **NSL-KDD dataset**,  
comparing both **unsupervised** and **supervised** machine learning algorithms.

---

## 🌐 Live Demo
πŸ‘‰ Try it on **Hugging Face Spaces**:  
(Replace this link with your Space URL after deployment)

---

## 🧠 Overview
The app demonstrates anomaly detection for cybersecurity by comparing three models:

| Model | Type | Description |
|--------|------|-------------|
| **Isolation Forest** | Unsupervised | Detects anomalies by isolating data points |
| **One-Class SVM** | Unsupervised | Learns a decision boundary around normal data |
| **Logistic Regression** | Supervised | Classifies known attacks using labeled data |

The dataset comes from the **NSL-KDD** benchmark, an improved version of the KDD Cup '99 dataset.
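
The contrast between the three model types can be sketched on synthetic data. This is a minimal illustration, not the app's actual code: the synthetic features, the `to_binary` helper, and the hyperparameters (`contamination`, `nu`) are assumptions chosen for the example. It shows that scikit-learn's outlier detectors return `+1` / `-1`, which has to be mapped to the same `0` / `1` convention as the supervised model before the results can be compared.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LogisticRegression
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for scaled traffic features: 0 = normal, 1 = attack.
X_normal = rng.normal(loc=0.0, scale=1.0, size=(300, 4))
X_attack = rng.normal(loc=6.0, scale=1.0, size=(30, 4))
X = np.vstack([X_normal, X_attack])
y = np.array([0] * len(X_normal) + [1] * len(X_attack))


def to_binary(sk_pred):
    """Map scikit-learn outlier output (+1 inlier, -1 outlier) to 0/1."""
    return (sk_pred == -1).astype(int)


# Isolation Forest: fitted on all rows, without labels.
iso = IsolationForest(contamination=0.1, random_state=0).fit(X)
iso_pred = to_binary(iso.predict(X))

# One-Class SVM: fitted on normal traffic only, then applied to every row.
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(X_normal)
ocsvm_pred = to_binary(ocsvm.predict(X))

# Logistic Regression: supervised, so it needs the labels.
logreg = LogisticRegression(max_iter=1000).fit(X, y)
logreg_pred = logreg.predict(X)

results = [
    ("Isolation Forest", iso_pred),
    ("One-Class SVM", ocsvm_pred),
    ("Logistic Regression", logreg_pred),
]
for name, pred in results:
    print(f"{name}: accuracy = {(pred == y).mean():.3f}")
```

For brevity the sketch scores each model on the data it was fitted on; a real comparison would use a held-out test split.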

---

## ⚙️ Workflow
1. Load the NSL-KDD dataset from an online source  
2. One-hot encode categorical columns (`protocol_type`, `service`, `flag`)  
3. Scale all numeric features with `StandardScaler`  
4. Train and compare all three models  
5. Display metrics and confusion matrices in Streamlit
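
Steps 2 and 3 can be sketched with a scikit-learn `ColumnTransformer`. The tiny DataFrame below is a hypothetical sample with only two numeric columns (the real dataset has many more), and the use of `ColumnTransformer` itself is an assumption; the app may encode and scale in separate steps.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical four-row sample using a handful of NSL-KDD column names.
df = pd.DataFrame({
    "duration": [0, 2, 0, 5],
    "protocol_type": ["tcp", "udp", "tcp", "icmp"],
    "service": ["http", "domain_u", "ftp", "ecr_i"],
    "flag": ["SF", "SF", "S0", "SF"],
    "src_bytes": [181, 105, 0, 1032],
})

categorical = ["protocol_type", "service", "flag"]
numeric = [c for c in df.columns if c not in categorical]

preprocess = ColumnTransformer(
    [
        # handle_unknown="ignore" avoids errors on categories unseen at fit time.
        ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical),
        ("scale", StandardScaler(), numeric),
    ],
    sparse_threshold=0,  # always return a dense NumPy array
)

X = preprocess.fit_transform(df)
print(X.shape)  # one-hot columns come first, scaled numeric columns last
```

When a train/test split is used, fitting the transformer on the training rows only and calling `transform` on the test rows keeps test-set statistics out of the scaler.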

---

## 📊 Example Outputs
- **Model Performance Table** – Accuracy, Precision, Recall, F1-score  
- **Confusion Matrices** – For all three models  
- **Visual Insights** – Easy comparison between supervised and unsupervised models  
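
As a rough sketch of how one row of the performance table and one confusion matrix could be computed, the example below uses `sklearn.metrics` on made-up labels and predictions (`1` = attack, `0` = normal). The values are illustrative only and are not results from the app.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical ground truth and predictions: 1 = attack, 0 = normal.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 1, 1, 0, 1, 1, 0]

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
}

# Rows are true classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(cm)
```

In a Streamlit app, a dictionary like `metrics` could be collected per model into a DataFrame and shown with `st.dataframe`, and `cm` could be drawn with a Seaborn heatmap.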

---

## 🧾 Dataset
Source: [University of New Brunswick – NSL-KDD Dataset](https://www.unb.ca/cic/datasets/nsl.html)

Each record has 41 features describing network traffic, plus a label. The raw labels are `normal` or a specific attack name; this app works with the binary form (`normal` or `attack`).
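
A minimal sketch of reducing the labels to the binary `normal` / `attack` form. It assumes a headerless CSV whose label column holds either `normal` or an attack name such as `neptune`. The three inline rows, the shortened column list, and the `is_attack` column name are hypothetical and stand in for the full file.

```python
import io

import pandas as pd

# Three made-up rows with only a few columns, standing in for the real file.
raw = io.StringIO(
    "0,tcp,http,SF,normal\n"
    "0,tcp,private,S0,neptune\n"
    "0,icmp,ecr_i,SF,smurf\n"
)

# Column names are supplied by hand because the file is assumed to have no header.
columns = ["duration", "protocol_type", "service", "flag", "label"]
df = pd.read_csv(raw, header=None, names=columns)

# Anything that is not "normal" counts as an attack (1); normal traffic is 0.
df["is_attack"] = (df["label"] != "normal").astype(int)

print(df[["label", "is_attack"]])
```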

---

## 🧰 Tech Stack
- **Python 3.9+**
- **Streamlit** – Web framework  
- **Scikit-learn** – ML algorithms & preprocessing  
- **Matplotlib / Seaborn** – Visualization  
- **Pandas / NumPy** – Data handling  

---

## 🏗️ Local Setup (Optional)
```bash
# Clone the repo
git clone https://huggingface.co/spaces/your-username/nsl-kdd-anomaly-detection
cd nsl-kdd-anomaly-detection

# Install dependencies
pip install -r requirements.txt

# Run Streamlit app
streamlit run app.py
```