# Clustering Algorithms for Customer Segmentation

This repository contains a comprehensive implementation of various clustering algorithms to perform customer segmentation on a synthetic dataset. The project explores K-Means, Hierarchical Clustering, DBSCAN, and Gaussian Mixture Models (GMM) to identify distinct customer groups based on age and income.

## Project Structure

- `implementation.ipynb`: The main Jupyter notebook containing the entire analysis, from data generation to model evaluation and visualization.
- `data/`: Contains the synthetic `customer_data.csv` file.
- `models/`: Stores the trained clustering models and the data scaler.
- `results/`: Includes the algorithm comparison, detailed analysis, and experiment summary.
- `visualizations/`: Contains the output plots, such as the elbow method analysis and cluster comparisons.

## Features

- **Data Generation**: A synthetic customer dataset is generated with clear cluster structures for effective model training and evaluation.
- **Multiple Algorithms**: Implements and compares four popular clustering algorithms:
  - K-Means
  - Hierarchical Clustering
  - DBSCAN
  - Gaussian Mixture Models (GMM)
- **Model Evaluation**: Uses the elbow method and silhouette scores to determine the optimal number of clusters and evaluate performance.
- **Comprehensive Visualization**: Generates plots to visualize the clusters, compare algorithm performance, and analyze the optimal 'k'.

## How to Use

1. **Clone the repository:**
   ```bash
   git clone https://github.com/GruheshKurra/ClusteringAlgorithms.git
   ```
2. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
3. **Run the notebook:**
   Open and run the `implementation.ipynb` notebook in a Jupyter environment to see the full analysis.

## License

This project is licensed under the MIT License.
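
## Minimal Example

The notebook itself is not reproduced in this README. As a rough illustration of the workflow listed under Features (synthetic age/income data, scaling, an elbow analysis, and a silhouette comparison of the four algorithms), here is a minimal sketch that assumes NumPy and scikit-learn are installed. The cluster centers, the spread, `eps`, `min_samples`, and the choice of three clusters are illustrative assumptions, not values taken from `implementation.ipynb`.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Hypothetical synthetic customers: three groups of (age, income).
# These centers and spreads are assumptions made for this sketch.
centers = np.array([[25, 30_000], [45, 70_000], [65, 40_000]])
spread = np.array([3, 4_000])
X = np.vstack([rng.normal(c, spread, size=(100, 2)) for c in centers])

# Age and income live on very different scales, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

# Elbow method: record K-Means inertia for a range of k values.
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_scaled).inertia_
    for k in range(1, 7)
}

models = {
    "K-Means": KMeans(n_clusters=3, n_init=10, random_state=42),
    "Hierarchical": AgglomerativeClustering(n_clusters=3),
    "DBSCAN": DBSCAN(eps=0.3, min_samples=5),
    "GMM": GaussianMixture(n_components=3, random_state=42),
}

scores = {}
for name, model in models.items():
    labels = model.fit_predict(X_scaled)
    # DBSCAN labels noise points as -1; exclude them before scoring.
    mask = labels != -1
    if len(set(labels[mask])) >= 2:
        scores[name] = silhouette_score(X_scaled[mask], labels[mask])
    else:
        # The silhouette score is undefined for fewer than two clusters.
        scores[name] = float("nan")

for name, score in scores.items():
    print(f"{name}: silhouette = {score:.3f}")
```

Plotting the values in `inertias` against k gives the elbow curve. Because noise points are dropped before scoring DBSCAN, its silhouette value is not strictly comparable to the other three.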