# Clustering Algorithms for Customer Segmentation

This repository implements four clustering algorithms for customer segmentation on a synthetic dataset: K-Means, Hierarchical Clustering, DBSCAN, and Gaussian Mixture Models (GMM). Each algorithm is used to identify distinct customer groups based on age and income.

## Project Structure

- `implementation.ipynb`: The main Jupyter notebook containing the entire analysis, from data generation to model evaluation and visualization.
- `data/`: Contains the synthetic `customer_data.csv` file.
- `models/`: Stores the trained clustering models and the data scaler.
- `results/`: Includes the algorithm comparison, detailed analysis, and experiment summary.
- `visualizations/`: Contains the output plots, such as the elbow method analysis and cluster comparisons.

## Features

- **Data Generation**: A synthetic customer dataset is generated with clear cluster structures for effective model training and evaluation.
- **Multiple Algorithms**: Implements and compares four popular clustering algorithms:
  - K-Means
  - Hierarchical Clustering
  - DBSCAN
  - Gaussian Mixture Models (GMM)
- **Model Evaluation**: Uses the elbow method and silhouette scores to determine the optimal number of clusters and evaluate performance.
- **Comprehensive Visualization**: Generates plots to visualize the clusters, compare algorithm performance, and analyze the optimal `k`.
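
The evaluation and comparison steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the group centers and spreads, the range of `k`, the DBSCAN `eps` and `min_samples` values, and the `safe_silhouette` helper are all assumptions chosen for the example, and scikit-learn is assumed to be installed.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data: three customer groups in (age, income) space.
rng = np.random.default_rng(42)
centers = [(25, 30_000), (45, 70_000), (65, 45_000)]
X = np.vstack(
    [rng.normal(loc=c, scale=(3, 4_000), size=(100, 2)) for c in centers]
)

# Age and income live on very different scales, so standardize first.
X_scaled = StandardScaler().fit_transform(X)


def safe_silhouette(data, labels):
    """Silhouette score that ignores DBSCAN noise points (label -1).

    Returns NaN when fewer than two clusters remain.
    """
    mask = labels != -1
    if len(set(labels[mask])) < 2:
        return float("nan")
    return silhouette_score(data[mask], labels[mask])


# Elbow method and silhouette analysis for K-Means over a small range of k.
inertias, silhouettes = {}, {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_scaled)
    inertias[k] = km.inertia_  # within-cluster sum of squares (elbow curve)
    silhouettes[k] = silhouette_score(X_scaled, km.labels_)

best_k = max(silhouettes, key=silhouettes.get)

# Fit the four algorithms and compare them by silhouette score.
models = {
    "K-Means": KMeans(n_clusters=best_k, n_init=10, random_state=42),
    "Hierarchical": AgglomerativeClustering(n_clusters=best_k),
    "DBSCAN": DBSCAN(eps=0.5, min_samples=5),  # density-based, no k needed
    "GMM": GaussianMixture(n_components=best_k, random_state=42),
}
scores = {
    name: safe_silhouette(X_scaled, model.fit_predict(X_scaled))
    for name, model in models.items()
}

print(f"best k by silhouette: {best_k}")
for name, score in scores.items():
    print(f"{name:>12}: silhouette = {score:.3f}")
```

Plotting `inertias` against `k` gives the elbow curve, while `silhouettes` offers a second, less visual criterion for choosing `k`. DBSCAN does not take a cluster count, so its result depends on `eps` and `min_samples` and may need tuning on other data.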

## How to Use

1. **Clone the repository:**
   ```bash
   git clone https://github.com/GruheshKurra/ClusteringAlgorithms.git
   ```
2. **Install dependencies:**
   ```bash
   cd ClusteringAlgorithms
   pip install -r requirements.txt
   ```
3. **Run the notebook:**
   Open `implementation.ipynb` in a Jupyter environment and run all cells to reproduce the full analysis.

## License

This project is licensed under the MIT License.