---
license: mit
---

# Anomaly Detection Suite

This repository evaluates and compares several anomaly detection algorithms on a synthetic dataset. It includes the implementation notebook, the trained models, the results, and the visualizations.

## Project Overview

This project is a hands-on guide to identifying outliers with the following methods:

- **Statistical Methods (Z-score)**
- **Isolation Forest**
- **One-Class SVM**
- **Local Outlier Factor (LOF)**
- **Autoencoder (Deep Learning)**

The goal is a clear comparison of how these techniques perform on the same dataset.

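As a rough illustration of what such a comparison can look like, the sketch below runs a Z-score rule and three scikit-learn detectors on the same toy data. It is not the notebook's code: the data, the `zscore_flags` helper, the `contamination` value, and the other hyperparameters are assumptions chosen for the example, and the autoencoder is left out to keep it short.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

# Hypothetical toy data: 200 clustered inliers plus 10 scattered outliers.
rng = np.random.default_rng(0)
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outliers = rng.uniform(low=-8.0, high=8.0, size=(10, 2))
X = np.vstack([inliers, outliers])


def zscore_flags(X, threshold=3.0):
    """Flag a point when any feature is more than `threshold` std devs from its mean."""
    z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
    return (z > threshold).any(axis=1)


contamination = 0.05  # assumed share of anomalies, not a tuned value
detectors = {
    "Isolation Forest": IsolationForest(contamination=contamination, random_state=0),
    "One-Class SVM": OneClassSVM(nu=contamination, gamma="scale"),
    "Local Outlier Factor": LocalOutlierFactor(n_neighbors=20, contamination=contamination),
}

# scikit-learn detectors return -1 for outliers and +1 for inliers.
flags = {"Z-score": zscore_flags(X)}
for name, detector in detectors.items():
    flags[name] = detector.fit_predict(X) == -1

for name, mask in flags.items():
    print(f"{name}: {int(mask.sum())} points flagged")
```

Each entry in `flags` is a boolean mask over the same rows of `X`, so the methods can be compared point by point or scored against known labels.
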
## Repository Contents

- `implementation.ipynb`: The main Jupyter notebook with all the code and explanations.
- `anomaly_detection_results/`: A directory containing all the generated files:
  - Trained models for each algorithm.
  - Anomaly scores and predictions.
  - Performance metrics and results in JSON format.
  - Visualizations comparing the different methods.

## How to Use the Models

The trained models are saved in the `anomaly_detection_results/` directory. You can load them to make predictions on new data. For example, to load the Isolation Forest model:

```python
import pickle

with open('anomaly_detection_results/isolation_forest_model.pkl', 'rb') as f:
    model = pickle.load(f)

# Now you can use the model to predict on new data.
# new_data should be two-dimensional, like the dataset the model was trained on.
# predictions = model.predict(new_data)
```

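To show what prediction looks like end to end, here is a minimal sketch that assumes the saved model is a scikit-learn `IsolationForest`. So that it runs without the repository files, it fits and pickles a stand-in model to a temporary path; the training data, `model_path`, and `new_data` are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.ensemble import IsolationForest

# Stand-in for the saved model: fit a small Isolation Forest on hypothetical
# 2D data and pickle it to a temporary file.
rng = np.random.default_rng(42)
train = rng.normal(loc=0.0, scale=1.0, size=(300, 2))
model_path = Path(tempfile.mkdtemp()) / "isolation_forest_model.pkl"
with open(model_path, "wb") as f:
    pickle.dump(IsolationForest(random_state=42).fit(train), f)

# Load the model back from disk.
with open(model_path, "rb") as f:
    model = pickle.load(f)

# New data must have the same number of features as the training data.
new_data = np.array([[0.1, -0.2], [9.0, 9.0]])

predictions = model.predict(new_data)       # +1 = inlier, -1 = outlier
scores = model.decision_function(new_data)  # lower = more anomalous

for point, label, score in zip(new_data, predictions, scores):
    status = "outlier" if label == -1 else "inlier"
    print(f"{point} -> {status} (score {score:.3f})")
```

With the real file, replace `model_path` with `anomaly_detection_results/isolation_forest_model.pkl`. Pickled scikit-learn models are generally best loaded with the same scikit-learn version that saved them.
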
## Dataset

The dataset is synthetically generated within the `implementation.ipynb` notebook. It consists of two-dimensional data with a clear cluster of normal points and a few scattered outliers.
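
A dataset of this shape could be generated roughly as follows. This is only a sketch: the `make_dataset` helper, the sample counts, the cluster spread, and the outlier range are all assumptions and may differ from what the notebook uses.

```python
import numpy as np


def make_dataset(n_normal=300, n_outliers=15, seed=0):
    """Return (X, y): 2D points and labels, where y = 1 marks an outlier."""
    rng = np.random.default_rng(seed)
    # Normal points: one tight Gaussian cluster around the origin.
    normal = rng.normal(loc=0.0, scale=0.5, size=(n_normal, 2))
    # Outliers: scattered uniformly over a much wider square.
    outliers = rng.uniform(low=-6.0, high=6.0, size=(n_outliers, 2))
    X = np.vstack([normal, outliers])
    y = np.concatenate([np.zeros(n_normal, dtype=int), np.ones(n_outliers, dtype=int)])
    # Shuffle so the outliers are not grouped at the end.
    order = rng.permutation(len(X))
    return X[order], y[order]


X, y = make_dataset()
print(X.shape, int(y.sum()))
```

Because the outliers are drawn uniformly, a few may land inside the normal cluster, which is worth keeping in mind when reading per-method metrics.
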