"""
gaussian_mixture.py

This module defines a GaussianMixture model for clustering, along with a parameter grid for hyperparameter tuning.

Gaussian Mixture Models (GMMs) assume that the data are generated from a mixture of a finite number of
Gaussian distributions with unknown parameters. Because the model is probabilistic, it yields soft cluster
assignments and can fit clusters of varying sizes and elliptical shapes.

Parameters:
    - n_components (int): Number of mixture components (clusters).
    - covariance_type (str): Determines the shape of each cluster.
        - 'full': Each cluster has its own general covariance matrix.
        - 'tied': All clusters share the same covariance matrix.
        - 'diag': Each cluster has its own diagonal covariance matrix.
        - 'spherical': Each cluster has its own single variance.
"""

from sklearn.mixture import GaussianMixture

# Define the GaussianMixture estimator
estimator = GaussianMixture(n_components=3, random_state=42)

# Define the hyperparameter grid for tuning.
# The 'model__' prefix presumably targets a Pipeline step named 'model' that wraps the estimator.
param_grid = {
    'model__n_components': [2, 3, 4],  # Experiment with 2 to 4 clusters
    'model__covariance_type': ['full', 'tied', 'diag', 'spherical']  # Different shapes for cluster covariance
}

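# Default scoring metric
# Note: The silhouette score favors compact, well-separated, convex clusters. For overlapping or
# elongated mixture components, consider likelihood-based criteria such as BIC or AIC (lower is better).
# 'silhouette' is not a built-in scikit-learn scoring string, so the calling code presumably maps it to a scorer.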
default_scoring = 'silhouette'
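

# --- Illustrative sketch, not part of the original module ---
# The scoring note above mentions BIC and AIC as alternatives to the silhouette score.
# A minimal sketch of BIC-based selection over the same candidate values as param_grid
# could look like the following. The helper name `select_gmm_by_bic` and the synthetic
# demo data are hypothetical, introduced only for illustration.
import numpy as np
from sklearn.mixture import GaussianMixture


def select_gmm_by_bic(X,
                      n_components_options=(2, 3, 4),
                      covariance_types=('full', 'tied', 'diag', 'spherical'),
                      random_state=42):
    """Fit one GaussianMixture per combination and return the lowest-BIC fit."""
    best_model, best_bic = None, np.inf
    for n_components in n_components_options:
        for covariance_type in covariance_types:
            model = GaussianMixture(
                n_components=n_components,
                covariance_type=covariance_type,
                random_state=random_state,
            ).fit(X)
            bic = model.bic(X)  # Lower BIC indicates a better fit/complexity trade-off
            if bic < best_bic:
                best_model, best_bic = model, bic
    return best_model, best_bic


if __name__ == "__main__":
    # Synthetic data (an assumption for the demo): three well-separated 2-D blobs.
    rng = np.random.default_rng(0)
    X_demo = np.vstack([
        rng.normal(loc=center, scale=0.5, size=(100, 2))
        for center in (-5.0, 0.0, 5.0)
    ])
    demo_model, demo_bic = select_gmm_by_bic(X_demo)
    print(demo_model.n_components, demo_model.covariance_type, round(demo_bic, 2))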