karthik-2905 committed
Commit ac64500 · verified · 1 Parent(s): 7ac8046

Create README.md

Files changed (1)
  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ license: mit
+ ---
+
+ # Anomaly Detection Suite
+
+ This repository hosts an anomaly detection project that evaluates and compares multiple algorithms on a synthetic dataset. It includes the implementation notebook, trained models, results, and visualizations.
+
+ ## Project Overview
+
+ This project provides a hands-on guide to identifying outliers using the following methods:
+
+ - **Statistical Methods (Z-score)**
+ - **Isolation Forest**
+ - **One-Class SVM**
+ - **Local Outlier Factor (LOF)**
+ - **Autoencoder (Deep Learning)**
+
+ The goal is to provide a clear comparison of how these different techniques perform on the same dataset.
+
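As an illustration of the simplest method in the list above, the following is a minimal Z-score sketch, not the notebook's actual code. The function name `zscore_outliers`, the threshold of 3.0, the random seed, and the toy data are all assumptions made for this example.

```python
import numpy as np

def zscore_outliers(X, threshold=3.0):
    """Flag rows whose absolute z-score exceeds `threshold` in any feature.

    `threshold=3.0` is a common rule of thumb, assumed here for illustration.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # avoid dividing by zero on constant features
    z = np.abs((X - mean) / std)
    return (z > threshold).any(axis=1)

# Toy data (hypothetical): a Gaussian cluster plus two far-away points.
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.5]])
X = np.vstack([normal, outliers])

flags = zscore_outliers(X)  # boolean mask, True marks a suspected outlier
```

Because the mean and standard deviation are themselves pulled by extreme points, this approach tends to work best when outliers are rare, which is one reason to compare it against the other methods.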
+ ## Repository Contents
+
+ - `implementation.ipynb`: The main Jupyter notebook with all the code and explanations.
+ - `anomaly_detection_results/`: A directory containing all the generated files:
+   - Trained models for each algorithm.
+   - Anomaly scores and predictions.
+   - Performance metrics and results in JSON format.
+   - Visualizations comparing the different methods.
+
+ ## How to Use the Models
+
+ The trained models are saved in the `anomaly_detection_results/` directory. You can load them to make predictions on new data. For example, to load the Isolation Forest model:
+
+ ```python
+ import pickle
+
+ with open('anomaly_detection_results/isolation_forest_model.pkl', 'rb') as f:
+     model = pickle.load(f)
+
+ # Now you can use the model to predict on new data
+ # predictions = model.predict(new_data)
+ ```
+
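The README does not say which library produced the pickled model. Assuming it is a scikit-learn estimator, `predict` returns `1` for inliers and `-1` for outliers, and a small helper can turn those labels into a boolean mask. The helper names `load_model` and `to_anomaly_mask` and the example labels are hypothetical.

```python
import pickle

import numpy as np

def load_model(path):
    """Load a pickled model. Only unpickle files that come from a source you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)

def to_anomaly_mask(predictions):
    """Convert scikit-learn style labels (1 = inlier, -1 = outlier) to a boolean mask."""
    return np.asarray(predictions) == -1

# Hypothetical usage, assuming the file exists and holds a scikit-learn estimator:
# model = load_model("anomaly_detection_results/isolation_forest_model.pkl")
# mask = to_anomaly_mask(model.predict(new_data))

example_predictions = [1, 1, -1, 1, -1]  # stand-in for model.predict(new_data)
mask = to_anomaly_mask(example_predictions)
```

A pickled scikit-learn model is generally only safe to load with a compatible scikit-learn version, so it may help to check which version the notebook used.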
+ ## Dataset
+
+ The dataset is synthetically generated within the `implementation.ipynb` notebook. It consists of two-dimensional data with a clear cluster of normal points and a few scattered outliers.
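
A dataset of the kind described above could be generated roughly as follows. This is a sketch under stated assumptions rather than the notebook's actual generator: the function name `make_dataset`, the Gaussian cluster, the uniform outlier range, the sample counts, and the seed are all chosen for illustration.

```python
import numpy as np

def make_dataset(n_normal=300, n_outliers=15, seed=42):
    """Build a 2-D toy dataset: one Gaussian cluster plus uniformly scattered outliers.

    Returns (X, y), where y is 1 for generated outliers and 0 for normal points.
    """
    rng = np.random.default_rng(seed)
    normal = rng.normal(loc=0.0, scale=1.0, size=(n_normal, 2))       # dense cluster
    outliers = rng.uniform(low=-8.0, high=8.0, size=(n_outliers, 2))  # scattered points
    X = np.vstack([normal, outliers])
    y = np.concatenate([np.zeros(n_normal, dtype=int), np.ones(n_outliers, dtype=int)])
    return X, y

X, y = make_dataset()
```

Keeping the ground-truth labels `y` alongside `X` makes it possible to score each detector's predictions against known outliers.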